MCP Server Security Risks for AI Coding Agents

MCP makes agents useful by connecting them to tools and data. That is also why MCP changes the security model. A tool is not just context. A tool is capability.

The core question is not “do I trust the model?” The sharper question is “what can the model reach when it is wrong or manipulated?”

What MCP adds

Microsoft describes MCP as a standardized interface for connecting LLMs with external data sources and tools. That standardization is valuable. It reduces custom integration work and makes tool ecosystems portable. But standardized access also standardizes attack paths if teams skip security boundaries.

An AI coding agent with MCP tools may read repositories, query systems, browse documents, send messages, or modify records. If those tools are over-permissioned, the model’s mistake becomes an operational action.

Indirect prompt injection

Indirect prompt injection happens when malicious instructions are embedded in external content: a web page, document, issue, email, dependency file, or retrieved record. The model reads the content and may follow the attacker’s instruction as if it came from the user or operator.

MCP makes this worse when tools bring untrusted content into the same context window as trusted instructions. The model cannot naturally enforce your trust boundary. The harness must.
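
One way a harness can hold that boundary is to label every piece of context by origin before it reaches the model. The sketch below is a minimal illustration, not a complete defense: `Segment`, `wrap_untrusted`, and `build_context` are hypothetical names, and the delimiter format is an assumption. It wraps untrusted text in markers carrying a random nonce and reports whether the assembled context is tainted, so later steps can tighten what the agent is allowed to do.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    text: str
    trusted: bool   # True only for operator- or user-authored instructions
    source: str     # where the text came from (illustrative label)


def wrap_untrusted(seg: Segment, nonce: str) -> str:
    # Remove the nonce from the content so it cannot forge a closing marker.
    body = seg.text.replace(nonce, "")
    return (
        f"<untrusted-{nonce} source={seg.source!r}>\n"
        f"{body}\n"
        f"</untrusted-{nonce}>"
    )


def build_context(
    segments: list[Segment], nonce: Optional[str] = None
) -> tuple[str, bool]:
    """Return the assembled prompt and whether it contains untrusted content."""
    nonce = nonce or secrets.token_hex(8)
    parts = [s.text if s.trusted else wrap_untrusted(s, nonce) for s in segments]
    tainted = any(not s.trusted for s in segments)
    return "\n\n".join(parts), tainted
```

Delimiters only help the model tell data from instructions; they do not stop it from obeying injected text. The `tainted` flag is the part the harness can act on, for example by requiring approval for any write or external action once untrusted content is in the window.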

Tool poisoning and rug pulls

Microsoft highlights tool poisoning: malicious instructions hidden in tool metadata, especially descriptions that models use to decide which tool to call. A rug pull is nastier: a tool that looked safe during approval changes behavior or metadata later.

This is why “I approved that MCP server last month” is not enough. Tool definitions and permissions need ongoing verification.
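
A simple form of ongoing verification is to fingerprint each tool definition at approval time and compare on every later session. The sketch below assumes each tool definition is available as a JSON-serializable dict with a `name` key, roughly as an MCP server lists its tools; `fingerprint`, `pin_tools`, and `detect_changes` are hypothetical helper names, and where the pins are stored is left out.

```python
from __future__ import annotations

import hashlib
import json


def fingerprint(tool: dict) -> str:
    # Canonical JSON so key order and whitespace do not change the hash.
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: list[dict]) -> dict[str, str]:
    """Record a fingerprint per tool name at approval time."""
    return {t["name"]: fingerprint(t) for t in tools}


def detect_changes(pinned: dict[str, str], current: list[dict]) -> dict[str, list[str]]:
    """Compare the tools a server lists now against the approved pins."""
    now = pin_tools(current)
    return {
        "added": sorted(set(now) - set(pinned)),
        "removed": sorted(set(pinned) - set(now)),
        "changed": sorted(n for n in now if n in pinned and now[n] != pinned[n]),
    }
```

Any non-empty `added` or `changed` list would send the tool back through review before the agent may call it. This catches metadata rug pulls such as an edited description; it does not catch a server that keeps its metadata stable and changes behavior, which is where audit logs and sandboxing come in.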

Controls that actually help

Least privilege per tool: read-only by default, write tools separated.
Approval gates for external actions and destructive changes.
Tool metadata review and change detection.
Secrets isolation: tools should not return credentials in their output or take them as model-supplied arguments.
Audit logs for tool calls, arguments, outputs, and denied actions.
Sandboxed execution for untrusted repositories or documents.
Prompt shielding, delimiters, and explicit trusted/untrusted data boundaries.
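
Several of these controls can live in one choke point that every tool call passes through. The sketch below is one possible shape under stated assumptions: `ToolPolicy`, `Gate`, `redact`, and `SENSITIVE_KEYS` are hypothetical names, the approver is any callable (a human prompt or a rule), and the audit log is an in-memory list standing in for real storage. Tools default to read-only, unknown tools are denied, write and external tools need approval, and both allowed and denied calls are logged with sensitive arguments redacted.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, NoReturn

SENSITIVE_KEYS = {"token", "password", "api_key"}  # illustrative, not exhaustive


def redact(args: dict) -> dict:
    return {k: "[redacted]" if k in SENSITIVE_KEYS else v for k, v in args.items()}


@dataclass(frozen=True)
class ToolPolicy:
    writes: bool = False     # tool can modify state
    external: bool = False   # tool acts outside the workspace (send, post, deploy)


@dataclass
class Gate:
    policies: dict[str, ToolPolicy]          # allowlist: anything missing is denied
    approve: Callable[[str, dict], bool]     # human or rule-based approver
    audit: list[dict] = field(default_factory=list)

    def call(self, name: str, args: dict, impl: Callable[..., object]) -> object:
        policy = self.policies.get(name)
        if policy is None:
            self._deny(name, args, "tool not on allowlist")
        if (policy.writes or policy.external) and not self.approve(name, args):
            self._deny(name, args, "approval refused")
        result = impl(**args)
        self.audit.append({
            "tool": name,
            "args": redact(args),
            "decision": "allowed",
            "output": repr(result)[:200],  # truncated to keep the log bounded
        })
        return result

    def _deny(self, name: str, args: dict, reason: str) -> NoReturn:
        self.audit.append({
            "tool": name,
            "args": redact(args),
            "decision": "denied",
            "reason": reason,
        })
        raise PermissionError(f"{name}: {reason}")
```

A read-only tool registered as `ToolPolicy()` runs without a prompt, while one registered as `ToolPolicy(external=True)` waits on `approve`. Keeping the decision in the harness rather than in the prompt means a manipulated model can request an action but cannot grant it to itself.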

Security rule

Any agent with private data access, untrusted input, and external action capability needs serious controls. That combination is where incidents breed.
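
The rule is easy to check mechanically once each tool is tagged with what it can do. The sketch below assumes such tags are assigned by hand during review; `ToolCapability` and `needs_serious_controls` are hypothetical names, and the three flags mirror the three conditions in the rule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCapability:
    name: str
    reads_private_data: bool = False
    ingests_untrusted_input: bool = False
    acts_externally: bool = False


def needs_serious_controls(tools: list[ToolCapability]) -> bool:
    """True when the agent's combined tool set covers all three conditions."""
    return (
        any(t.reads_private_data for t in tools)
        and any(t.ingests_untrusted_input for t in tools)
        and any(t.acts_externally for t in tools)
    )
```

The check runs over the whole tool set rather than per tool, because the risky combination usually comes from several individually reasonable tools enabled in the same session.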

Sources and further reading

Microsoft, Protecting against indirect injection attacks in MCP
OpenTelemetry, AI Agent Observability
Dhiraj Das, Testing Cursor, Claude Code, and Codex Workflows Safely

About the Author

Dhiraj Das | Automation Consultant | 10+ years building automation systems that expose failures, reduce flakiness, and make complex workflows repeatable. He now applies that discipline independently to AI-agent validation, run replay, LLM testing, and postmortems.

Creator of many open-source tools solving what traditional automation can't: waitless (flaky tests), sb-stealth-wrapper (bot detection), selenium-teleport (state persistence), selenium-chatbot-test (AI chatbot testing), lumos-shadowdom (Shadow DOM), and visual-guard (visual regression).
