Compare RAG and AI engineering concepts

RAG architecture comparisons

When to retrieve external knowledge, when to adapt weights, and how agents or protocols fit around grounding.

RAG vs fine-tuning — when to use each
RAG changes what the model can read at query time, which suits facts that change often; fine-tuning changes how the model behaves, which suits vocabulary and tone that stay stable. Choose based on whether your failure mode is stale knowledge or wrong style, not on which demo sounds smoother.
RAG vs semantic search — retrieval-only vs grounded generation
Semantic search returns ranked passages by embedding similarity. RAG adds chunking strategy, context assembly, generation, and faithfulness checks — search is one stage, not the product.
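To make the "search is one stage" point concrete, here is a minimal sketch in Python. It assumes a toy bag-of-words `embed` function standing in for a real embedding model; `semantic_search`, `build_prompt`, and `PASSAGES` are hypothetical names used only for illustration. The search step returns ranked passages, and the RAG step still has to assemble them into context for a generator (generation and faithfulness checks are left out).

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def semantic_search(query, passages, k=2):
    """Retrieval only: rank passages by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """RAG adds context assembly on top of search; generation would consume this prompt."""
    hits = semantic_search(query, passages, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(hits))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"

# Hypothetical passages for illustration only.
PASSAGES = [
    "the refund window is 30 days",
    "shipping takes five business days",
    "support is open on weekdays",
]

print(semantic_search("refund window", PASSAGES, k=1))
print(build_prompt("refund window", PASSAGES, k=1))
```

The numbered context labels are one possible design choice: they give a later faithfulness check something to match citations against.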
RAG vs MCP — context retrieval vs tool protocol
RAG is how you ground answers on documents. MCP standardizes how hosts connect models to tools and data sources — it does not define chunking, recall, or faithfulness metrics.
RAG vs AI agents — knowledge grounding vs action planning
Agents plan and execute multi-step workflows with tools. RAG grounds answers in retrieved text, and retrieval evaluation checks whether the right text was retrieved before any step generates output. Agents without retrieval evaluation often hide missing facts behind fluent tool narration.
RAG vs agentic RAG — static retrieval vs orchestrated retrieval loops
Classic RAG runs one retrieval pass (sometimes with rerank) then generates. Agentic RAG lets an agent plan queries, call tools, and iterate retrieval before answering — useful for multi-hop questions but harder to evaluate and observe.
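The difference between one pass and an orchestrated loop can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `retrieve` is a hypothetical keyword-overlap retriever, `toy_planner` is a hard-coded stand-in for an LLM planner, and the two-document corpus is invented for the example. The returned `trace` hints at why the loop is harder to observe: every iteration adds a query and a retrieval result that need logging.

```python
def retrieve(query, corpus, k=1):
    """Hypothetical retriever: rank documents by word overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def classic_rag(question, corpus):
    """Classic RAG: a single retrieval pass, then generation (generation omitted)."""
    return retrieve(question, corpus)

def agentic_rag(question, corpus, plan_next_query, max_steps=3):
    """Agentic RAG: the planner may issue follow-up queries before answering."""
    evidence, trace = [], []
    query = question
    for _ in range(max_steps):
        hits = [d for d in retrieve(query, corpus) if d not in evidence]
        evidence.extend(hits)
        trace.append((query, hits))  # one trace entry per retrieval iteration
        query = plan_next_query(question, evidence)
        if query is None:  # planner decides the evidence is sufficient
            break
    return evidence, trace

def toy_planner(question, evidence):
    """Hard-coded stand-in for an LLM planner (assumption for this sketch)."""
    joined = " ".join(evidence)
    if "dana lee" in joined and "birthplace" not in joined:
        return "dana lee birthplace"
    return None

# Invented two-hop corpus for illustration only.
CORPUS = ["acme founder dana lee", "dana lee birthplace oslo"]
QUESTION = "acme founder birthplace"

print(classic_rag(QUESTION, CORPUS))
print(agentic_rag(QUESTION, CORPUS, toy_planner))
```

With this toy data the single pass finds only the founder document, while the loop issues a second query and also retrieves the birthplace document.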
RAG evaluation vs observability — offline benchmarks vs production traces
Evaluation scores whether retrieval and answers meet benchmarks on fixed datasets. Observability logs what happened on live traffic — which chunks were retrieved, latency, and faithfulness signals. Teams need both: eval catches regressions before release; observability explains failures users actually hit.
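A small sketch of both halves, assuming recall@k as the offline metric and a JSON line per request as the production trace. The names `FAKE_INDEX`, `DATASET`, `toy_retriever`, and `traced` are hypothetical, and the chunk IDs are invented; a real system would plug in its own retriever and log sink.

```python
import json
import time

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant chunk IDs that appear in the top-k retrieved IDs."""
    if not relevant_ids:
        return 0.0
    top = set(retrieved_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)

def evaluate(dataset, retriever, k=3):
    """Offline evaluation: mean recall@k over a fixed labeled dataset."""
    scores = [recall_at_k(retriever(ex["query"]), ex["relevant"], k) for ex in dataset]
    return sum(scores) / len(scores)

def traced(retriever, log):
    """Observability: wrap a retriever so each live call appends a JSON trace line."""
    def wrapper(query):
        start = time.perf_counter()
        ids = retriever(query)
        log.append(json.dumps({
            "query": query,
            "chunk_ids": ids,
            "latency_s": time.perf_counter() - start,
        }))
        return ids
    return wrapper

# Invented retrieval results and labels for illustration only.
FAKE_INDEX = {
    "refund policy": ["c1", "c2", "c9"],
    "warranty length": ["c4", "c5", "c6"],
}
DATASET = [
    {"query": "refund policy", "relevant": ["c1", "c2"]},
    {"query": "warranty length", "relevant": ["c3", "c4"]},
]

def toy_retriever(query):
    return FAKE_INDEX.get(query, [])

print(evaluate(DATASET, toy_retriever, k=3))  # mean of 1.0 and 0.5

log = []
traced(toy_retriever, log)("refund policy")
print(log[0])
```

The same retriever is used in both paths on purpose: the benchmark score gates a release, and the trace lines explain individual failures afterwards.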

Retrieval comparisons

Chunk boundaries, reranking, and what enters the index versus what gets promoted after retrieval.

Chunking vs reranking — index boundaries vs post-retrieval ordering
Chunking determines which text exists in the index at all. Reranking only reorders candidates already retrieved. If required facts never appear in any chunk, reranking cannot recover them.
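The claim can be checked with a toy example. The sketch below assumes a naive fixed-size word chunker and an overlap-based reranker, both hypothetical simplifications; `DOC` and the required terms are invented. When the chunk size splits the fact across a boundary, reordering the chunks does not bring it back.

```python
def chunk(text, size):
    """Naive fixed-size chunker: split on whitespace, `size` words per chunk."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def rerank(candidates, query):
    """Toy reranker: reorder candidates by word overlap with the query. Adds no new text."""
    q = set(query.lower().split())
    return sorted(candidates, key=lambda c: len(q & set(c.lower().split())), reverse=True)

def contains_fact(chunks, required_terms):
    """True if at least one chunk contains every required term."""
    return any(all(t in c.lower().split() for t in required_terms) for c in chunks)

# Invented document and fact for illustration only.
DOC = "the warranty covers battery defects for 24 months after purchase"
FACT = ["battery", "24"]
QUERY = "battery warranty 24 months"

small = chunk(DOC, 4)    # splits "battery" and "24" into different chunks
large = chunk(DOC, 10)   # keeps the whole sentence together

print(contains_fact(rerank(small, QUERY), FACT))
print(contains_fact(rerank(large, QUERY), FACT))
```

Only the chunk size changes between the two runs; the reranker is identical, which isolates the index boundary as the cause.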

Framework comparisons

Orchestration libraries versus document-centric indexing and query workflows.

LangChain vs LlamaIndex — orchestration vs indexing workflow
LangChain emphasizes composable chains, tools, and agent wiring across providers. LlamaIndex emphasizes ingestion, indices, and query interfaces over documents. Neither replaces chunking decisions or recall evaluation.

Infrastructure comparisons

Managed vector search versus self-hosted hybrid database tradeoffs under RAG workloads.

Pinecone vs Weaviate — managed vector infra vs hybrid database tradeoffs
Pinecone optimizes for managed approximate nearest-neighbor search with minimal ops. Weaviate offers schema, modules, and deployment flexibility including self-host. RAG quality still depends on chunking and recall tests on your corpus — not vendor ANN benchmarks alone.
