Comparison of BM25 keyword search and embedding-based approaches
Dense vs. sparse retrieval compares two paradigms: embedding-based search (dense) and token matching (sparse, e.g. BM25). Many modern RAG systems combine the two, because each covers weaknesses of the other.
Applies sparse retrieval concepts to RAG and shows how retrieval quality affects the quality of the generated answers.
Hybrid retrieval combines BM25 and embeddings, often followed by a reranking step (e.g., Cohere Rerank). A weighting of 30% BM25 + 70% dense is a common starting point in production and should be tuned per corpus.
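Keyword-based approach that scores documents by term frequency, inverse document frequency, and document length. It does not use word positions. No ML training needed, and very fast when served from an inverted index.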
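To make the scoring concrete, here is a minimal sketch of Okapi BM25 in plain Python. The function name bm25_scores, the whitespace tokenization, the toy documents, and the parameter defaults k1=1.5 and b=0.75 are assumptions chosen for illustration; production systems rely on an inverted index rather than scanning every document.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # exact token match only, no semantics
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (illustrative only).
docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "bm25 ranks documents by term overlap".split(),
]
scores = bm25_scores("cat mat".split(), docs)
```

Only the first document receives a non-zero score. The second contains "cats" but not the exact token "cat", which illustrates the limitation of pure token matching.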
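Embedding-based approach: queries and documents are encoded as vectors and compared by similarity, so matches are based on meaning rather than exact tokens. Better at paraphrases and semantically similar documents.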
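The idea can be sketched as ranking by cosine similarity between a query vector and document vectors. The 3-dimensional vectors below are made up for illustration and do not come from any real embedding model; cosine and dense_rank are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def dense_rank(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    sims = [cosine(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: sims[i], reverse=True)

# Hypothetical embeddings; a real system would obtain them from an encoder model.
doc_vecs = [
    [0.9, 0.1, 0.0],  # "How do I reset my password?"
    [0.1, 0.9, 0.1],  # "Shipping times for international orders"
    [0.8, 0.2, 0.1],  # "Steps to recover account access"
]
query_vec = [0.85, 0.15, 0.05]  # "forgot my login credentials"

ranking = dense_rank(query_vec, doc_vecs)
```

The two account-related documents rank ahead of the shipping document even though the query shares no keywords with them, which is the behavior keyword matching cannot provide.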
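BM25 is often cited as 10-100× faster, although the exact factor depends on corpus size, index type, and hardware. Dense retrieval gives better semantic quality. Choose based on the use case: latency and exact-term queries favor BM25, paraphrased or conversational queries favor dense.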
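Combination: 30% BM25 + 70% dense, applied to scores normalized to a common range, since raw BM25 scores are unbounded while similarity scores are not. A good balance between speed and accuracy in production.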
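A minimal sketch of this weighted combination, assuming min-max normalization of each score list before mixing. The normalization choice, the function names, and the example scores are assumptions for illustration, not a prescribed method.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(bm25, dense, w_sparse=0.3, w_dense=0.7):
    """Weighted sum of normalized BM25 and dense scores, per document."""
    b = min_max(bm25)
    d = min_max(dense)
    return [w_sparse * x + w_dense * y for x, y in zip(b, d)]

# Made-up scores for three candidate documents.
bm25 = [4.2, 0.0, 1.1]       # document 0 has the strongest keyword match
dense = [0.55, 0.80, 0.78]   # document 1 is the closest semantic match

fused = hybrid_scores(bm25, dense)
best = max(range(len(fused)), key=fused.__getitem__)
```

In this example document 2 wins: it is neither the best keyword match nor the best semantic match, but it scores reasonably on both, which is the kind of candidate a hybrid score is meant to surface.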
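BM25 cost grows roughly linearly with corpus size. Dense retrieval requires a vector index library or vector database (e.g., FAISS, Milvus) for nearest-neighbor search. For large corpora: hybrid or dense only.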
Web search (e.g., Google Search): a lexical stage such as BM25 acts as a candidate filter, then a ranker orders the results. RAG systems: dense retrieval first, with BM25 as fallback.
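One possible reading of "BM25 as fallback" is sketched below: rank by dense scores, and switch to BM25 only when the best dense score is weak. The function name retrieve_with_fallback, the threshold min_dense=0.5, and the precomputed score lists are hypothetical; real systems may trigger the fallback differently.

```python
def retrieve_with_fallback(dense_scores, bm25_scores, k=2, min_dense=0.5):
    """Return (method_used, top-k document indices).

    Uses dense scores unless the best dense score is below min_dense,
    in which case the BM25 scores decide the ranking.
    """
    use_dense = max(dense_scores) >= min_dense
    scores = dense_scores if use_dense else bm25_scores
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return ("dense" if use_dense else "bm25"), top

# Confident dense match: dense ranking is used.
confident = retrieve_with_fallback([0.9, 0.2, 0.4], [1.0, 3.0, 0.0])

# Weak dense match (e.g., a rare product code): BM25 ranking is used instead.
weak = retrieve_with_fallback([0.3, 0.2, 0.1], [1.0, 3.0, 0.0])
```

A fallback like this helps with queries containing rare identifiers or exact strings, where embeddings tend to be less reliable than token matching.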