When to use RAG vs MCP (Model Context Protocol)

Name: Building with MCP and the Claude API
Uploaded: 2026-05-20T06:37:11.105Z
Channel: Anthropic · End-to-end RAG architecture · 0:24
Description: Hey, I'm Michael. I'm an engineer on the API team here at Anthropic. I'm John and I work on the Model Context Protocol team

Short answer

Choose a RAG design when answers must come from your own corpus. Add MCP when you need portable tool wiring across clients, and keep measuring retrieval quality behind each tool.

Canonical expert clip

Chosen for clarity and how directly it answers the question — not for views or hype.

Best expert explanation

"Hey, I'm Michael. I'm an engineer on the API team here at Anthropic. I'm John and I work on the Model Context Protocol team"

Anthropic · End-to-end RAG architecture · 0:24

Why this clip matters

Choosing between RAG and MCP (Model Context Protocol) changes both your eval plan and your ops surface, so weigh the practitioner tradeoffs before committing to either.

Source credibility

Anthropic

Building with MCP and the Claude API

0:24

Practitioner explanation from an indexed engineering video — verify claims against your stack.

Decision rule

Choose RAG when

• Primary UX is Q&A over private documents with citations.
• Success metric is recall of required facts per question set.
• You are still designing chunking, embedding, and faithfulness eval for a corpus.
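The recall metric named in the list above can be made concrete with a small harness. This is a minimal sketch, not a standard evaluation library: the function `required_fact_recall`, the toy corpus, the keyword retriever, and the substring match used to decide whether a fact was "found" are all assumptions made for illustration. A real question set would need a more careful matching rule.

```python
from typing import Callable, Dict, List


def required_fact_recall(
    question_set: Dict[str, List[str]],
    retrieve: Callable[[str], List[str]],
) -> float:
    """Fraction of required facts that appear in the retrieved passages."""
    found = 0
    total = 0
    for question, required_facts in question_set.items():
        passages = " ".join(retrieve(question)).lower()
        for fact in required_facts:
            total += 1
            # Assumption: a fact counts as found if it appears verbatim.
            if fact.lower() in passages:
                found += 1
    return found / total if total else 0.0


# Hypothetical toy corpus and naive keyword retriever.
CORPUS = [
    "Refunds are issued within 14 days of a return request.",
    "Support hours are 9am to 5pm on weekdays.",
]


def toy_retrieve(question: str, k: int = 1) -> List[str]:
    words = set(question.lower().split())
    ranked = sorted(CORPUS, key=lambda p: -len(words & set(p.lower().split())))
    return ranked[:k]


QUESTIONS = {
    "How long do refunds take?": ["14 days"],
    "When is support open on weekdays?": ["9am to 5pm"],
}

recall = required_fact_recall(QUESTIONS, toy_retrieve)
empty_recall = required_fact_recall(QUESTIONS, lambda q: [])
```

Tracking this number per question set before touching prompts or protocol wiring gives a baseline against which later changes can be compared.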

Choose MCP (Model Context Protocol) when

• You ship multiple clients that must call the same tool surface.
• Integrations change often and need a stable protocol boundary.
• Tool wiring is the bottleneck — retrieval quality is already measured.
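To illustrate what a shared tool surface means, the sketch below routes a JSON-RPC style `tools/call` request to a registered retriever function. It is a simplified stand-in written in plain Python, not the official MCP SDK. The registry, the `search_docs` tool, and the toy corpus are hypothetical, and the message shape reflects an assumed reading of the MCP specification that should be verified against the spec itself.

```python
import json
from typing import Callable, Dict, List

# Hypothetical registry: one tool surface that every client calls.
TOOLS: Dict[str, Callable[..., List[str]]] = {}


def tool(name: str):
    """Register a function under a stable tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("search_docs")
def search_docs(query: str, k: int = 3) -> List[str]:
    # Toy retriever; a real tool would query a measured index.
    corpus = [
        "MCP standardizes tool wiring.",
        "RAG retrieves passages from a corpus.",
    ]
    terms = query.lower().split()
    hits = [p for p in corpus if any(t in p.lower() for t in terms)]
    return hits[:k]


def handle_request(raw: str) -> str:
    """Dispatch a tools/call request and wrap the passages as text content."""
    req = json.loads(raw)
    base = {"jsonrpc": "2.0", "id": req.get("id")}
    if req.get("method") != "tools/call":
        return json.dumps({**base, "error": {"code": -32601, "message": "Method not found"}})
    params = req.get("params", {})
    fn = TOOLS.get(params.get("name"))
    if fn is None:
        return json.dumps({**base, "error": {"code": -32602, "message": "Unknown tool"}})
    passages = fn(**params.get("arguments", {}))
    result = {"content": [{"type": "text", "text": p} for p in passages], "isError": False}
    return json.dumps({**base, "result": result})


request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search_docs", "arguments": {"query": "rag"}},
})
response = json.loads(handle_request(request))
```

Note that the response is well-formed whether or not the passages are any good, which is why retrieval still has to be measured behind the tool.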

Production tradeoffs

• How much logic belongs in tools versus the host application.
• Whether document stores should be first-class MCP resources everywhere.

Failure modes

• Tool calls succeed while the underlying index never returns required facts.
• Logging tool JSON but not passages shown to the generator.
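Both failure modes above can be caught by logging the passages that reach the generator alongside the tool result. The following is a minimal sketch: the record fields, the logger name, and the substring check for required facts are illustrative assumptions, not a prescribed schema.

```python
import json
import logging
from typing import Dict, List, Sequence

logger = logging.getLogger("retrieval_trace")  # hypothetical logger name


def log_retrieval_trace(
    question: str,
    tool_name: str,
    tool_result: Dict,
    passages_shown: Sequence[str],
    required_facts: Sequence[str] = (),
) -> Dict:
    """Log what the generator saw, not just whether the tool call succeeded."""
    shown_text = " ".join(passages_shown).lower()
    missing: List[str] = [f for f in required_facts if f.lower() not in shown_text]
    record = {
        "question": question,
        "tool": tool_name,
        "tool_is_error": bool(tool_result.get("isError", False)),
        "passages_shown": list(passages_shown),
        "missing_required_facts": missing,
    }
    logger.info(json.dumps(record))
    return record


# The tool call succeeds, yet the required fact never reaches the generator.
trace = log_retrieval_trace(
    question="What is the warranty period?",
    tool_name="search_docs",
    tool_result={"isError": False},
    passages_shown=["Shipping takes 3 to 5 business days."],
    required_facts=["30 days"],
)
```

A record with `tool_is_error` false and a non-empty `missing_required_facts` list is exactly the silent failure described above.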

Implementation mistakes

• Prioritizing protocol roadmap before baseline recall metrics exist.
• Exposing raw DB tools without retrieval guardrails for document Q&A.
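One way to read "retrieval guardrails" is a thin wrapper between the tool and the store that bounds the query, caps the number of results, and refuses when nothing clears a relevance threshold. The sketch below assumes a search function returning `(text, score)` pairs; the names `guarded_search` and `GuardedResult` and the default thresholds are hypothetical and would need tuning against real data.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GuardedResult:
    passages: List[str]
    refused: bool
    reason: str = ""


def guarded_search(
    query: str,
    search: Callable[[str], List[Tuple[str, float]]],
    max_k: int = 5,
    min_score: float = 0.3,
    max_query_chars: int = 500,
) -> GuardedResult:
    """Wrap a raw search call with basic limits instead of exposing it directly."""
    if not query.strip():
        return GuardedResult([], True, "empty query")
    if len(query) > max_query_chars:
        return GuardedResult([], True, "query too long")
    hits = sorted(search(query), key=lambda h: -h[1])
    kept = [text for text, score in hits if score >= min_score][:max_k]
    if not kept:
        # Refuse explicitly rather than pass weak passages to the generator.
        return GuardedResult([], True, "no passage above min_score")
    return GuardedResult(kept, False)


def fake_search(query: str) -> List[Tuple[str, float]]:
    # Hypothetical backend returning (passage, relevance score) pairs.
    return [("low", 0.1), ("high", 0.9), ("mid", 0.5)]


top_one = guarded_search("warranty", fake_search, max_k=1)
strict = guarded_search("warranty", fake_search, min_score=0.95)
```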

Architecture visual

MCP orchestration with optional RAG retriever tool

Related concepts

• retrieval-augmented generation
• chunking
• embeddings
• reranking
• faithfulness eval
• recall@k

Tradeoffs

• RAG targets missing or wrong context reaching the generator; MCP (Model Context Protocol) targets inconsistent tool wiring across clients. Neither fixes the other's failure mode.
• Stricter faithfulness checks can reduce answer fluency.

When NOT to use

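• Do not add MCP (Model Context Protocol) to fix answers whose required facts are missing from the corpus; a protocol layer cannot return facts the index does not contain.
• Do not treat a successful tool call as evidence of retrieval quality; measure the two separately.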

What experts agree on

Practitioner themes behind this authority page — not a poll or quote list.

• A tool exposed via MCP may still call a vector index built with RAG practices.
• Both appear in agent stacks that fetch context before answering.
• Retrieval quality dominates many production failures; fixing prompts alone rarely fixes wrong or missing chunks.
• Chunking, embedding model choice, and metadata boundaries materially affect what the model can see.
• Promoting the best passages after first-stage retrieval (reranking or hybrid scoring) often matters more than marginal prompt tweaks.
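As an illustration of the hybrid scoring mentioned in the last point, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion. This is one common fusion method chosen for the example, not necessarily what the indexed talks use. The document IDs are placeholders, and the constant `k = 60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each document scores sum(1 / (k + rank)) across lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_ranking = ["a", "b", "c"]  # placeholder first-stage keyword results
vector_ranking = ["b", "c", "a"]   # placeholder first-stage vector results

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Here document `b` ranks first after fusion because it sits near the top of both lists, even though it leads only one of them.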

What experts disagree on

Open engineering debates — compare indexed explanations before you commit to an architecture.

• How much logic belongs in tools versus the host application.
• Whether document stores should be first-class MCP resources everywhere.

Common mistakes

• Treating RAG as a magic prompt wrapper without measuring retrieval recall on real questions.
• Skipping chunking strategy because the context window is large.

Implementation tradeoffs

• Chunk boundaries: Smaller chunks improve precision but fragment context; larger chunks improve local context but dilute relevance signals.
• Reranking: Cross-encoder or LLM rerankers improve top-k quality at higher latency and inference cost.
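The chunk-boundary tradeoff above is easier to reason about with the knobs exposed. The sketch below is a minimal word-based chunker with overlap. Real pipelines often split on tokens, sentences, or document structure instead, so the function name, the word-level split, and the example sizes are assumptions for illustration only.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into word windows of chunk_size, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))

small_chunks = chunk_words(text, chunk_size=4, overlap=1)  # more, narrower chunks
large_chunks = chunk_words(text, chunk_size=8, overlap=2)  # fewer, wider chunks
```

Running a recall evaluation at several `chunk_size` and `overlap` settings is one way to pick a boundary from evidence rather than intuition.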

Themes repeated across indexed engineering talks and practitioner writeups — not a survey, vote count, or attributed quote roundup.
