GenAI Systems Lab
RAG & Retrieval 10 min read

Graph RAG: When Vector Search Isn't Enough

Why multi-hop queries break standard vector RAG, and how knowledge graphs plus traversal solve what embedding similarity cannot. Entity extraction, graph construction, multi-hop path traversal, hybrid retrieval architecture, and when the production cost is justified.

Vector search retrieves documents that are semantically similar to a query. This works for most retrieval tasks. It fails for one specific class of problem that shows up constantly in enterprise AI: multi-hop relational queries. Questions like 'which investors funded companies that use both RAG and fine-tuning?' or 'which compliance policies apply to regulated customers who bought product X?' require following relationships across entities, not finding similar text. Graph RAG is the architecture that solves this. Senior AI engineer interviews now test it directly.

The failure mode vector search can't fix

The problem isn't retrieval precision. You could have perfect recall — every relevant document in the top-20 — and still be unable to answer a multi-hop query. The reason: the relationship only emerges from connecting entities across documents. No single chunk contains 'Sequoia invested in OpenAI AND OpenAI uses RAG.' The connection requires traversal.
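A minimal sketch of that failure, assuming a toy corpus of two chunks built from the example above (the chunk texts and the helper name `chunks_mentioning_all` are illustrative, not taken from any real system). Both chunks are retrieved, yet no single chunk links the investor to the technology:

```python
# Illustrative chunks built from the example in the text; each holds one fact.
CHUNKS = [
    "Sequoia invested in OpenAI.",
    "OpenAI uses RAG.",
]

def chunks_mentioning_all(chunks, terms):
    """Return the chunks that mention every term (case-insensitive substring match)."""
    return [c for c in chunks if all(t.lower() in c.lower() for t in terms)]

# Both chunks are available, yet neither states the investor-to-technology link.
direct_evidence = chunks_mentioning_all(CHUNKS, ["Sequoia", "RAG"])

# The link only appears once the two chunks are joined on the shared entity.
bridge = [c for c in CHUNKS if "OpenAI" in c]
```

Here `direct_evidence` is empty while `bridge` holds both chunks: the answer depends on the shared entity, which is exactly the join a graph makes explicit.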

Common interview trap: candidates suggest that better chunking, higher top-k, or a reranker will fix multi-hop failures. These help with retrieval precision — they don't solve the fundamental problem that cross-document relationships don't exist in embedding space.

What a knowledge graph is

A knowledge graph represents information as entities (nodes) and relationships (edges). An entity is a named thing — a company, person, technology, regulation. A relationship is a typed, directional connection — 'Anthropic uses RAG', 'Sequoia invested_in Anthropic', 'RAG requires vector_search'. Graph RAG builds this structure from your document corpus through entity extraction and relationship parsing, then stores it in a graph database (Neo4j, Amazon Neptune, or a property graph layer over a relational DB).
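One way to picture this structure is as a list of (subject, relation, object) triples with lookup indexes in both directions. The sketch below is an in-memory stand-in, not a graph database client; the `Triple` type and `build_indexes` helper are hypothetical names, and the edges are the illustrative ones from the paragraph above:

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str   # source entity (node)
    relation: str  # typed, directional edge label
    obj: str       # target entity (node)

# Edges taken from the examples in the text (illustrative, not verified facts).
TRIPLES = [
    Triple("Anthropic", "uses", "RAG"),
    Triple("Sequoia", "invested_in", "Anthropic"),
    Triple("RAG", "requires", "vector_search"),
]

def build_indexes(triples):
    """Index edges by source and by target so traversal can run in either direction."""
    outgoing, incoming = {}, {}
    for t in triples:
        outgoing.setdefault((t.subject, t.relation), set()).add(t.obj)
        incoming.setdefault((t.obj, t.relation), set()).add(t.subject)
    return outgoing, incoming

outgoing, incoming = build_indexes(TRIPLES)
```

With these indexes, `outgoing[("Anthropic", "uses")]` answers "what does Anthropic use?" and `incoming[("RAG", "uses")]` answers "who uses RAG?". A production system would delegate both lookups to the graph database.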

Multi-hop traversal

A multi-hop query follows a chain of relationships to reach an answer. For 'which investors backed companies that use RAG?', the traversal is: find RAG node → follow 'uses' edges backward → find company nodes (Anthropic, OpenAI) → follow 'invested_in' edges backward → find investor nodes → deduplicate the results. This is a 2-hop traversal: one hop from technology to company, one from company to investor. An intersection step is only needed when the query names several constraints, such as companies that use both RAG and fine-tuning. Production queries can require 3-5 hops, especially in compliance, supply chain, and organizational hierarchy use cases.
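The two-hop traversal described above can be sketched over a plain edge list. This assumes an in-memory list of tuples rather than a real graph database; the function names are hypothetical, the edges are illustrative, and `ExampleFund` and `ExampleCo` are made-up entities included only to show a non-matching path:

```python
EDGES = [
    ("Anthropic", "uses", "RAG"),
    ("OpenAI", "uses", "RAG"),
    ("Sequoia", "invested_in", "OpenAI"),
    ("Sequoia", "invested_in", "Anthropic"),
    ("ExampleFund", "invested_in", "ExampleCo"),  # hypothetical; ExampleCo does not use RAG
]

def sources(target, relation, edges=EDGES):
    """Follow `relation` edges backward: every node that points at `target`."""
    return {s for s, r, t in edges if r == relation and t == target}

def investors_in_users_of(technology, edges=EDGES):
    """Two-hop traversal that keeps the full path so the answer can be explained."""
    paths = []
    for company in sorted(sources(technology, "uses", edges)):           # hop 1
        for investor in sorted(sources(company, "invested_in", edges)):  # hop 2
            paths.append((investor, company, technology))
    return paths

paths = investors_in_users_of("RAG")
investors = {investor for investor, _, _ in paths}  # deduplicated answer set
```

Returning the paths rather than only the investor names is a deliberate choice: each tuple is the reasoning chain behind one answer. A query that requires two technologies at once would intersect the company sets from two `sources` calls before taking the second hop.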

| Query type | Hops | Example |
| --- | --- | --- |
| Single-hop factual | 1 | 'What does Anthropic build?' → company → products |
| Two-hop relational | 2 | 'Which investors back RAG companies?' → tech → company → investor |
| Three-hop compliance | 3 | 'Which policies apply to regulated enterprise customers using X?' → customer → segment → regulation → policy |
| Cross-domain inference | 4+ | 'What are the second-order competitive risks of Y?' (multiple entity chains) |

The hybrid Graph + Vector architecture

Pure graph traversal has a blind spot: it only knows what was explicitly extracted at graph construction time. Unstructured knowledge — nuance, context, implicit relationships — lives in the raw documents. The production architecture combines both: vector search for high-recall initial retrieval, graph traversal for relationship resolution, LLM synthesis for the final answer.
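A rough sketch of the three stages, under several loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, entity linking is naive substring matching, the graph is an in-memory edge list, and the final LLM call is left out (the function returns the prompt it would send). All names and data here are illustrative:

```python
import math
import re
from collections import Counter

CHUNKS = [
    "Anthropic uses RAG in its products.",
    "Sequoia invested in Anthropic.",
    "Vector search ranks chunks by embedding similarity.",
]
EDGES = [
    ("Anthropic", "uses", "RAG"),
    ("Sequoia", "invested_in", "Anthropic"),
]

def embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def vector_search(query, chunks, k):
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def expand(seeds, edges, hops=2):
    """Collect edges within `hops` steps of the seed entities, in either direction."""
    seen, frontier, facts = set(seeds), set(seeds), []
    for _ in range(hops):
        reached = set()
        for s, r, t in edges:
            if (s in frontier or t in frontier) and (s, r, t) not in facts:
                facts.append((s, r, t))
                reached.update({s, t} - seen)
        seen |= reached
        frontier = reached
    return facts

def build_context(query, k=1, chunks=CHUNKS, edges=EDGES):
    hits = vector_search(query, chunks, k)                 # 1. vector retrieval
    entities = {n for s, _, t in edges for n in (s, t)}
    # Naive entity linking: an entity is a seed if its name appears in a retrieved chunk.
    seeds = {e for e in entities if any(e.lower() in h.lower() for h in hits)}
    facts = expand(seeds, edges)                           # 2. graph traversal
    lines = ["Passages:"] + hits + ["Graph facts:"] + [f"{s} {r} {t}" for s, r, t in facts]
    return "\n".join(lines + [f"Question: {query}"])       # 3. prompt for LLM synthesis

context = build_context("Which investors back companies that use RAG?")
```

With `k=1`, vector search returns only the passage about Anthropic and RAG. The investor relationship still reaches the prompt because graph expansion from the entities in that passage pulls in the `invested_in` edge, which is the division of labor the paragraph above describes.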

Production cost

Graph RAG costs 2-5x more than standard vector RAG — graph construction, entity extraction, and traversal add both upfront and per-query overhead. It is only justified when your query distribution contains a meaningful fraction of multi-hop relational questions. Most customer support bots don't need it. Compliance, supply chain, competitive intelligence, and knowledge management systems often do.
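To make the trade-off concrete, here is a back-of-the-envelope sketch. It rests on assumptions that are not in the text: a router sends only multi-hop queries down the graph path, routing itself is free, the 2-5x figure is read as a per-query cost multiplier, and upfront graph construction cost is ignored. The function name is hypothetical:

```python
def blended_cost_multiplier(multi_hop_fraction, graph_multiplier):
    """Average per-query cost relative to vector-only RAG.

    Assumes only the multi-hop share of traffic takes the graph path,
    and that the graph path costs `graph_multiplier` times a vector query.
    """
    if not 0.0 <= multi_hop_fraction <= 1.0:
        raise ValueError("multi_hop_fraction must be between 0 and 1")
    if graph_multiplier < 1.0:
        raise ValueError("graph_multiplier is expected to be at least 1")
    return (1.0 - multi_hop_fraction) + multi_hop_fraction * graph_multiplier

# 10% multi-hop traffic at the high end of the range (5x): roughly 1.4x overall.
low_share = blended_cost_multiplier(0.10, 5.0)

# 50% multi-hop traffic at the low end of the range (2x): roughly 1.5x overall.
high_share = blended_cost_multiplier(0.50, 2.0)
```

Under these assumptions the per-query premium scales with the multi-hop share of traffic, so the fixed cost of building and maintaining the graph is what a small share has to justify.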

When to use it

| Use it when | Skip it when |
| --- | --- |
| Queries span multiple entity types with named relationships | Queries are single-hop factual or similarity-based |
| Data has explicit relational structure (org charts, supply chains, compliance maps) | Data is unstructured text without clear entity boundaries |
| Explainability matters: you need to show the reasoning path | Speed is critical: graph traversal adds latency vs pure vector |
| The relationship graph is relatively stable (slow-changing domain) | The knowledge base updates in real-time (graph staleness becomes unmanageable) |
