Writing
Technical articles and insights on AI, machine learning, and vector databases.
Technical deep dives and practical guides on RAG, vector databases, embeddings, LLM fine-tuning, and multimodal AI. Published on the Weaviate and Together AI engineering blogs.
Fine-Tuning Open LLM Judges to Outperform GPT-5.2
Open-source LLM judges fine-tuned with DPO can outperform GPT-5.2 at evaluating model outputs. We trained GPT-OSS 120B on 5,400 preference pairs to beat GPT-5.2's accuracy while running 14x faster at 15x lower cost.
How to Evaluate and Benchmark Large Language Models (LLMs)
A comprehensive guide to evaluating and benchmarking LLMs, covering key metrics, methodologies, and best practices for model selection.
Dynamic AI Agent Testing for the Real World with Collinear Simulations and Together Evals
How to use Collinear Simulations with Together Evals for dynamic, real-world testing of AI agents in production scenarios.
Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging Face Integrations
Major upgrades to the Together AI fine-tuning platform including support for larger models, longer context windows, and deeper Hugging Face integration.
Together Evaluations: Benchmark Models for Your Tasks
Introducing a comprehensive evaluation framework for benchmarking AI models across various tasks and domains.
Back to The Future: Evaluating AI Agents on Predicting Future Events
A comprehensive benchmark for evaluating AI agents on their ability to predict future events and outcomes.
From Zero to One: Building An Autonomous Data Scientist Agent
A technical deep dive into building an autonomous AI agent capable of performing data science tasks from scratch.
Direct Preference Optimization: A Technical Deep Dive
An in-depth exploration of Direct Preference Optimization techniques for improving AI model alignment and performance.
Continued Fine-tuning of LLMs: A Technical Deep Dive
A comprehensive guide to continued fine-tuning techniques for large language models, covering advanced optimization strategies.
Open Deep Research
Exploring the principles and practices of open research in deep learning and artificial intelligence.
Long Context Fine-Tuning: A Technical Deep Dive
Advanced techniques for fine-tuning language models with extended context windows, enabling better long-form understanding.
Multimodal Document RAG with Llama 3.2 Vision and ColQwen2
Building advanced retrieval-augmented generation systems that can process both text and visual document content using state-of-the-art models.
Advanced RAG Techniques
Learn how to improve the individual indexing, retrieval and generation parts of your RAG pipeline!
OpenAI's Matryoshka Embeddings
How to use OpenAI's embedding models trained with Matryoshka Representation Learning in a vector database like Weaviate
Step-by-Step Guide to Choosing the Best Embedding Model for Your Application
How to select an embedding model for your search and retrieval-augmented generation system.
32x Reduced Memory Usage With Binary Quantization
In-depth technical breakdown of how binary quantization works and how to use it in Weaviate.
Accelerating Vector Search up to +40% with Intel's latest Xeon CPU - Emerald Rapids
Boosting Weaviate using SIMD-AVX512, Loop Unrolling and Compiler Optimizations
Multimodal Retrieval-Augmented Generation (RAG)
Learn how to build Multimodal Retrieval-Augmented Generation (MM-RAG) systems that combine text, images, audio, and video. Discover contrastive learning, any-to-any search with vector databases, and practical code examples using Weaviate and OpenAI GPT-4V.
How to Reduce Memory Requirements by up to 90%+ using Product Quantization
The details behind how you can compress vectors using PQ with little loss of recall!
A Gentle Introduction to Vector Databases
What is a vector database? An explanation of core concepts such as vector embeddings, vector search, and vector indexing.
Multimodal Models
ML models that can see, read, hear, and more!
Private LLM
A discussion on data privacy and privacy-preserving machine learning for LLMs
How to ChatGPT Plugin
A show-and-tell of how we created the Weaviate Retrieval Plugin for ChatGPT
Weaviate Retrieval Plugin
Learn how you can connect Weaviate to ChatGPT to generate customized responses.
What are LLMs
A gentle introduction to Large Language Models (LLMs) - how they work and what they learn.
How AI Creates Art
Machine learning models can create beautiful and novel images. Learn how Diffusion Models work and how you could make use of them.
Vector Embeddings Explained
Get an intuitive understanding of what exactly vector embeddings are, how they're generated, and how they're used in semantic search.
The Details Behind the Sphere Dataset in Weaviate
Learn about the hardware, software, and performance metrics behind our ~1B object import of the Sphere dataset into Weaviate.
The Sphere Dataset in Weaviate
Learn how to import and query the Sphere dataset in Weaviate!