In standard vector search, you are comparing a short, interrogative user query ('How do I configure Redis?') against long, declarative document chunks. These two text formats live in different parts of the embedding space, which depresses their similarity scores. Hypothetical Document Embeddings (HyDE) addresses this asymmetry. When a user asks a question, an LLM is prompted to write a fake, hypothetical answer based purely on its internal weights. This hallucinated document sits much closer in embedding space to the actual, factual documents stored in the vector database, even when its details are wrong. The system embeds the fake document, retrieves the factual documents nearest to it, and then generates the final, grounded response from those real documents.
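The asymmetry can be illustrated with a deliberately crude sketch. The bag-of-words "embedding" and the example strings below are stand-ins invented for this demo (a real system would use a dense encoder); the point is only that a declarative hypothetical answer shares far more surface with a declarative document than the raw question does:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- a stand-in for a dense encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical corpus document (declarative, answer-shaped).
doc = "set the maxmemory directive in redis.conf to configure redis memory limits"
# The user's interrogative query.
query = "how do i configure redis"
# A plausible LLM-written answer to the same query.
hypothetical = "to configure redis edit redis.conf and set directives such as maxmemory"

print(f"query vs doc:        {cosine(embed(query), embed(doc)):.2f}")
print(f"hypothetical vs doc: {cosine(embed(hypothetical), embed(doc)):.2f}")
```

The hypothetical answer scores markedly higher against the document than the question does, which is the gap HyDE exploits.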
How It Works
- Generation: The user query is sent to an Instruct LLM.
- Hallucination: The LLM generates a hypothetical, plausible-sounding (but potentially factually incorrect) document that answers the query.
- Embedding: The hypothetical document is encoded into a dense vector.
- Retrieval: The vector database retrieves the actual, factual documents that are semantically closest to the hallucinated text.
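The four steps above can be wired together in a few lines. This is a minimal sketch, not a production pipeline: `generate_hypothetical` is a canned stub standing in for the Instruct LLM call, the bag-of-words encoder stands in for a real embedding model, and brute-force cosine ranking over an in-memory list stands in for the vector database:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in for a dense embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def generate_hypothetical(query: str) -> str:
    # Stub for the Instruct LLM; a real system would prompt something like
    # "Write a passage that answers: {query}". Canned output for the demo.
    canned = {
        "how do i configure redis":
            "edit redis.conf and set directives like maxmemory to configure redis",
    }
    return canned.get(query, query)

# Stand-in for the vector database's document store.
corpus = [
    "set the maxmemory directive in redis.conf to configure redis memory limits",
    "postgresql stores its configuration in postgresql.conf",
]

def hyde_retrieve(query: str, corpus: list, k: int = 1) -> list:
    hypothetical = generate_hypothetical(query)     # Generation + Hallucination
    query_vec = embed(hypothetical)                 # Embedding (of the fake doc)
    ranked = sorted(corpus,                         # Retrieval: nearest real docs
                    key=lambda d: cosine(query_vec, embed(d)),
                    reverse=True)
    return ranked[:k]

print(hyde_retrieve("how do i configure redis", corpus))
```

Note that only the retrieved real documents are passed to the final generation step; the hallucinated text is discarded once retrieval is done, so its factual errors never reach the user.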
Common Use Cases
- Searching dense technical documentation where the user's vocabulary doesn't match the documentation's vocabulary.
- Zero-shot domain retrieval where the embedding model hasn't been fine-tuned on the specific jargon.