What is the difference between RAG and fine-tuning?
RAG adds knowledge at query time by retrieving documents into the prompt. Fine-tuning bakes behavior or knowledge into the model weights through training. RAG suits facts that change often, and fine-tuning suits stable behavior and format. The decision-framework post covers how to choose.
Why are my top-k retrieval results wrong?
Vector similarity ranks by semantic closeness, which is not the same as relevance to the question. The usual fix is a reranking step: retrieve more candidates than you need, re-score them with a model built for relevance, and keep only the top few. The reranking post shows the measured difference.
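A minimal sketch of the retrieve-then-rerank pattern, under stated assumptions: the function names, the two-dimensional vectors, and the word-overlap scorer are all hypothetical stand-ins. In a real system the scorer would be a trained relevance model, and the vectors would come from an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_overlap_score(query, text):
    """Hypothetical stand-in for a relevance model: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def retrieve_then_rerank(query, query_vec, docs, score_relevance, n_candidates=20, top_k=3):
    """docs is a list of (text, vector) pairs; returns the top_k texts after reranking."""
    # Stage 1: cheap vector search over everything, keeping a generous candidate pool.
    candidates = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)[:n_candidates]
    # Stage 2: re-score only the candidates with the (more expensive) relevance scorer.
    reranked = sorted(candidates, key=lambda d: score_relevance(query, d[0]), reverse=True)
    return [text for text, _ in reranked[:top_k]]


# Toy data: the vectors are made up so that the wrong document is the nearest neighbor.
docs = [
    ("How to change your account email", [0.9, 0.1]),
    ("Steps to reset a forgotten password", [0.8, 0.6]),
    ("Billing and invoices overview", [0.1, 0.9]),
]
result = retrieve_then_rerank(
    "reset password", [1.0, 0.0], docs, toy_overlap_score, n_candidates=2, top_k=1
)
```

In this toy data the email document is closest by cosine similarity, but the reranking stage promotes the password document because the scorer looks at the query and the text together.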
What is hybrid search?
Hybrid search combines keyword retrieval (BM25) with dense vector search and fuses the rankings, usually with reciprocal rank fusion. It catches exact terms that embeddings miss and meaning that keywords miss.
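Reciprocal rank fusion can be sketched in a few lines. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The function name, the document IDs, and the two input rankings are illustrative; k = 60 is a commonly used default, not a requirement.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top result of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a keyword retriever and a dense retriever for the same query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Because the fusion uses only ranks, the raw BM25 and vector scores never need to be put on a common scale.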
How do I evaluate a RAG system?
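Measure retrieval and generation separately. For retrieval, track recall and precision over the top-k results. For generation, track faithfulness (whether the answer is supported by the retrieved context) and answer relevance (whether it addresses the question). The evaluation-metrics post covers which numbers actually predict production quality.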
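The retrieval half can be sketched as precision and recall over the top-k results. This assumes you have a labeled set of relevant document IDs per query; the function names and IDs below are illustrative. Faithfulness and answer relevance usually need human or model-based judgment and are not shown.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Divides by the number of results actually returned; some definitions divide by k instead.
    return hits / len(top_k)


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Hypothetical labels for one query: three relevant documents, four retrieved.
retrieved = ["d1", "d4", "d2", "d7"]
relevant = {"d1", "d2", "d3"}

p = precision_at_k(retrieved, relevant, k=3)
r = recall_at_k(retrieved, relevant, k=3)
```

Here two of the top three results are relevant and two of the three relevant documents were found, so both metrics come out to 2/3. Averaging these over a set of labeled queries gives the retrieval score to track.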