MCP makes agents useful by connecting them to tools and data. That is also why MCP changes the security model. A tool is not just context. A tool is capability.
The core question is not “do I trust the model?” The sharper question is “what can the model reach when it is wrong or manipulated?”
What MCP adds
Microsoft describes MCP as a standardized interface for connecting LLMs with external data sources and tools. That standardization is valuable. It reduces custom integration work and makes tool ecosystems portable. But standardized access also standardizes attack paths if teams skip security boundaries.
An AI coding agent with MCP tools may read repositories, query systems, browse documents, send messages, or modify records. If those tools are over-permissioned, the model’s mistake becomes an operational action.
Indirect prompt injection
Indirect prompt injection happens when malicious instructions are embedded in external content: a web page, document, issue, email, dependency file, or retrieved record. The model reads the content and treats the attacker’s instruction as relevant context.
MCP makes this worse when tools bring untrusted content into the same context window as trusted instructions. The model cannot naturally enforce your trust boundary. The harness must.
Tool poisoning and rug pulls
Microsoft highlights tool poisoning: malicious instructions hidden in tool metadata, especially descriptions that models use to decide which tool to call. A rug pull is nastier: a tool that looked safe during approval changes behavior or metadata later.
This is why “I approved that MCP server last month” is not enough. Tool definitions and permissions need ongoing verification.
Controls that actually help
- Least privilege per tool: read-only by default, write tools separated.
- Approval gates for external actions and destructive changes.
- Tool metadata review and change detection.
- Secrets isolation: tools should not expose credentials casually.
- Audit logs for tool calls, arguments, outputs, and denied actions.
- Sandboxed execution for untrusted repositories or documents.
- Prompt shielding, delimiters, and explicit trusted/untrusted data boundaries.
Sources and further reading
- Microsoft, Protecting against indirect injection attacks in MCP
- OpenTelemetry, AI Agent Observability
- Dhiraj Das, Testing Cursor, Claude Code, and Codex Workflows Safely

