MCP Gateway — Who Controls Your AI Agent's Tool Calls?

MCP has crossed 97 million monthly downloads and become the de facto standard, but there is no control layer governing which tools agents call and how often. The MCP Gateway pattern addresses this gap.

One of my Claude Code sessions is connected to 7 MCP servers: GitHub, Notion, Google Calendar, Gmail, Chrome DevTools, NotebookLM, and Telegram. This agent can read my emails, create calendar events, edit Notion pages, and open Chrome tabs.

Who’s watching all of this?

Nobody. At least not in my local setup.

MCP Succeeded. The Security Layer Hasn’t

The growth of MCP (Model Context Protocol) is staggering. Combined Python + TypeScript SDK monthly downloads have surpassed 97 million, with Anthropic, OpenAI, Google, Microsoft, and Amazon all supporting it. Created by Anthropic in late 2024 and donated to the Linux Foundation’s AAIF in December 2025, it has become the de facto standard for “how AI agents call external tools.”

The problem is that this protocol focuses on connectivity, not control.

When you create an MCP server, you define tools, and clients call those tools. Authentication? OAuth 2.1 made it into the spec. But policy-level concerns like “how many times can this agent call this tool per day” or “tools returning sensitive data must not be called without approval” aren’t part of the MCP protocol itself. That’s left to the implementer.

That’s where the MCP Gateway concept comes in.

What Is an MCP Gateway?

Think API Gateway. Just like putting a reverse proxy in front of your backends with Kong or AWS API Gateway, you put a proxy in front of your MCP servers.

Agent → MCP Gateway → MCP Servers

What the Gateway does:

  • Authentication/Authorization: Which agents can access which tools
  • Rate Limiting: Throttle tool call frequency
  • Audit Logging: Record who called what tool and when
  • Policy Enforcement: Certain tools require human approval before execution
  • Traffic Routing: Forward requests to the appropriate MCP server

I tested this in my local environment with a simple setup — a Node.js MCP proxy sitting between Claude Code and the actual MCP servers.

// Simplest MCP Gateway skeleton
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";

const gateway = new Server({ name: "mcp-gateway", version: "0.1.0" }, {
  capabilities: { tools: {} }
});

// Policy engine — allow/deny calls here
const policy: Record<string, { rateLimit: number; requireApproval: boolean }> = {
  "gmail_read_message": { rateLimit: 10, requireApproval: false },
  "gmail_create_draft": { rateLimit: 5, requireApproval: true },
  "gcal_delete_event": { rateLimit: 2, requireApproval: true },
  "notion-update-page": { rateLimit: 20, requireApproval: false },
};

const callCount: Record<string, number> = {};

gateway.setRequestHandler(CallToolRequestSchema, async (request) => {
  const toolName = request.params.name;
  const rule = policy[toolName];
  
  // Rate limit check
  callCount[toolName] = (callCount[toolName] || 0) + 1;
  if (rule && callCount[toolName] > rule.rateLimit) {
    return {
      content: [{ type: "text", text: `Rate limit exceeded for ${toolName}` }],
      isError: true,
    };
  }
  
  // Block tools requiring approval
  if (rule?.requireApproval) {
    console.error(`[GATEWAY] Approval required for: ${toolName}`);
    // In practice, send an approval request via Slack/Telegram here
  }
  
  // Audit log
  console.error(`[AUDIT] ${new Date().toISOString()} | ${toolName} | args: ${JSON.stringify(request.params.arguments)}`);
  
  // Forward to actual MCP server (omitted here)
  return await forwardToUpstream(toolName, request.params.arguments);
});

// Wire the gateway to stdio so the client (Claude Code) can actually reach it
const transport = new StdioServerTransport();
await gateway.connect(transport);

Is this production-ready? Honestly, not yet. But the core idea comes through clearly enough. Every agent tool call should pass through a single point, and that single point needs to enforce policies.

Running It Revealed What’s Missing

I plugged the code above into Claude Code and ran it. Short answer — it doesn’t work as-is.

The first problem is tool list synchronization. For the Gateway to intercept CallToolRequest, it first needs to tell the client (Claude Code) “here are the tools I have.” The code above has no listTools handler. You need to connect to upstream MCP servers, fetch their tool lists, and forward them to the client.

import { ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

gateway.setRequestHandler(ListToolsRequestSchema, async () => {
  const upstreamTools = await fetchToolsFromUpstream();
  return { tools: upstreamTools };
});

This gets things working, but when you have multiple upstream servers, tool names can collide. In my environment, Gmail and Google Calendar both expose generic names like list, so I had to add namespacing; a rough sketch follows below.
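
Here's roughly what that namespacing could look like. The server aliases and launch commands are placeholders, and I'm assuming the TypeScript SDK's Client API (connect, listTools, callTool); treat it as a direction, not a drop-in implementation. It also fills in the forwardToUpstream routing the earlier skeleton left out.

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Hypothetical upstream definitions; the launch commands are placeholders
const upstreams: Record<string, { command: string; args: string[] }> = {
  gmail: { command: "npx", args: ["-y", "your-gmail-mcp-server"] },
  gcal: { command: "npx", args: ["-y", "your-gcal-mcp-server"] },
};

const clients: Record<string, Client> = {};

async function getClient(alias: string): Promise<Client> {
  if (!clients[alias]) {
    const client = new Client(
      { name: `gateway-${alias}`, version: "0.1.0" },
      { capabilities: {} }
    );
    await client.connect(new StdioClientTransport(upstreams[alias]));
    clients[alias] = client;
  }
  return clients[alias];
}

// Fetch every upstream tool list and prefix each tool with its server alias,
// so Gmail's "list" becomes "gmail__list" and Calendar's becomes "gcal__list"
async function fetchToolsFromUpstream() {
  const lists = await Promise.all(
    Object.keys(upstreams).map(async (alias) => {
      const { tools } = await (await getClient(alias)).listTools();
      return tools.map((t) => ({ ...t, name: `${alias}__${t.name}` }));
    })
  );
  return lists.flat();
}

// forwardToUpstream strips the prefix again and routes the call to the right client
async function forwardToUpstream(namespacedName: string, args: unknown) {
  const [alias, ...rest] = namespacedName.split("__");
  const client = await getClient(alias);
  return client.callTool({
    name: rest.join("__"),
    arguments: args as Record<string, unknown>,
  });
}

The policy table would then key on the namespaced names as well.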

The second problem is rate limit lifetime. The callCount in the code above lives in memory. Restart the process and the count resets to zero. Claude Code spawns a new MCP server per session, so the limit resets every time you start a new session. A “10 per day” policy simply can’t be enforced this way.

Third, printing requireApproval to console.error was completely useless. Nobody’s watching stderr in real time. Actually getting approval means sending a request to an external channel (Telegram, Slack) and blocking until a response comes back — and implementing async waiting in stdio-based MCP is quite involved.

I thought about what solves all three at once, and at least for the first two, the answer is simple: write audit logs to SQLite instead of stderr or in-memory state.

Audit Logs in SQLite

Storing audit logs in a SQLite table instead of piping them through console.error also solves the rate limit lifetime problem. The DB persists across process restarts.

import Database from "better-sqlite3";

const db = new Database("mcp-audit.db");

db.exec(`
  CREATE TABLE IF NOT EXISTS audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT DEFAULT (datetime('now')),
    tool_name TEXT NOT NULL,
    args TEXT,
    result_status TEXT DEFAULT 'ok',
    latency_ms INTEGER,
    blocked INTEGER DEFAULT 0,
    block_reason TEXT
  )
`);

const insertLog = db.prepare(`
  INSERT INTO audit_log (tool_name, args, result_status, latency_ms, blocked, block_reason)
  VALUES (?, ?, ?, ?, ?, ?)
`);

const countToday = db.prepare(`
  SELECT COUNT(*) as cnt FROM audit_log
  WHERE tool_name = ? AND timestamp > datetime('now', '-1 day') AND blocked = 0
`);

Replace the in-memory callCount dictionary with the countToday query, and you can accurately track “how many times Gmail read was called today” across sessions. The Gateway handler becomes:

gateway.setRequestHandler(CallToolRequestSchema, async (request) => {
  const toolName = request.params.name;
  const rule = policy[toolName];
  const start = Date.now();

  if (rule) {
    const { cnt } = countToday.get(toolName) as { cnt: number };
    if (cnt >= rule.rateLimit) {
      insertLog.run(toolName,
        JSON.stringify(request.params.arguments),
        "blocked", 0, 1, "rate_limit");
      return {
        content: [{ type: "text",
          text: `Rate limit exceeded: ${toolName} (${cnt}/${rule.rateLimit} today)` }],
        isError: true,
      };
    }
  }

  const result = await forwardToUpstream(
    toolName, request.params.arguments);
  const latency = Date.now() - start;

  insertLog.run(toolName,
    JSON.stringify(request.params.arguments),
    "ok", latency, 0, null);
  return result;
});

What You Can Do With the Accumulated Logs

After running this for a few days, mcp-audit.db had enough data to be useful. More useful than I expected.

Tool call frequency — see which tools get called most at a glance.

SELECT tool_name, COUNT(*) as calls, ROUND(AVG(latency_ms)) as avg_ms
FROM audit_log WHERE blocked = 0
GROUP BY tool_name ORDER BY calls DESC LIMIT 10;

In my case, notion-search was the runaway #1. The agent has this pattern of searching Notion before doing anything else. Seeing this made me think caching Notion search results might be worthwhile.

Block rate — what percentage of calls hit the rate limit.

SELECT tool_name,
  SUM(blocked) as blocked_calls,
  COUNT(*) as total,
  ROUND(100.0 * SUM(blocked) / COUNT(*), 1) as block_rate
FROM audit_log GROUP BY tool_name HAVING SUM(blocked) > 0;

A high block rate means one of two things: the limit is too tight, or the agent has an inefficient pattern of repeatedly calling the same tool. If it’s the latter, fix the prompt.

Hourly patterns — when is your agent most active.

SELECT strftime('%H', timestamp, 'localtime') as hour, COUNT(*) as calls
FROM audit_log GROUP BY hour ORDER BY hour;

Unsurprising result — calls cluster around 11am-12pm when my cron jobs run. In a team environment, this data could inform MCP server load balancing decisions.

The real value of this data is that it becomes evidence for policy tuning. Instead of guessing whether “10 per day for Gmail read” is enough, you can look at actual usage patterns. In my case, gmail_read_message averaged 3 calls per day, so a limit of 10 was plenty. Meanwhile, notion-search was hitting nearly 40 per day, making the limit of 20 insufficient — I bumped it to 30.
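
To make that tuning less eyeball-driven, a small helper query against the same mcp-audit.db schema can compute per-tool averages directly. This is my own sketch, not part of the gateway above, and the 7-day window is an arbitrary choice.

import Database from "better-sqlite3";

const db = new Database("mcp-audit.db");

// Average and peak daily call counts per tool over the last 7 days,
// as a starting point for setting each tool's rateLimit
const usage = db.prepare(`
  SELECT tool_name,
         ROUND(SUM(daily) / 7.0, 1) AS avg_per_day,
         MAX(daily) AS peak_day
  FROM (
    SELECT tool_name, date(timestamp) AS day, COUNT(*) AS daily
    FROM audit_log
    WHERE blocked = 0 AND timestamp > datetime('now', '-7 day')
    GROUP BY tool_name, day
  )
  GROUP BY tool_name
  ORDER BY avg_per_day DESC
`).all();

console.table(usage);

That turns "is 10 per day enough for Gmail read?" into a lookup rather than a guess.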

When You Actually Need This

“Our team doesn’t use MCP that much yet” — that excuse is expiring.

Here’s a case I actually experienced: while editing a Notion page through the Notion MCP in Claude Code, I accidentally touched another team’s page. The agent picked a page with a similar title from the search results, and I hit the approve button without thinking. No data was lost, but it was embarrassing.

When this happens to one developer locally, it’s just awkward. But when 50 people on a team are using agents, each connected to 5-10 MCP servers? With no audit logs? No way to trace who called what?

The real reason enterprises need MCP Gateway isn’t security — it’s visibility. You need to see what your agents are doing.

Solutions Already Emerging

Open-source and commercial projects are already appearing under the MCP Gateway name. From what I’ve found, there are two main approaches.

1. Proxy approach — A reverse proxy between agents and MCP servers. Same architecture as existing API Gateways. Simple to configure with the advantage of reusing existing infrastructure.

2. Sidecar approach — Attach a policy engine to each MCP server. Identical to the sidecar pattern in service meshes (Istio, Linkerd). Enables finer-grained control but increases operational complexity.

For small teams, I think the proxy approach is more than enough. The sidecar route makes sense when you have 20+ MCP servers and teams needing different policies — at that scale, you probably already have dedicated platform engineers.

But This Is a Transitional Solution

Here’s where we need to think critically.

The fact that MCP Gateway is needed means the MCP protocol itself is missing a governance layer. We put API Gateways on top of HTTP not because HTTP lacks authentication, but because we need business logic and traffic management. Similarly, MCP will likely get protocol-level extensions for defining policies.

When that happens, today’s Gateways become legacy.

Personally, I expect something like a policy extension to be added to the MCP spec within 6 months. Given the active governance discussions since the donation to Linux Foundation, the direction seems set. But running agents with zero controls for those 6 months is risky, so the Gateway serves as a bridge solution to fill that gap.

One more thing — introducing a Gateway slows down agent response times. Adding a proxy hop is an obvious trade-off. In my local testing, each tool call added about 50-100ms of overhead. In most cases that’s imperceptible, but when an LLM calls tools 20-30 times in a single task, that’s 1-2 extra seconds total, which can affect user experience.

What’s Still Unsolved

Logging to SQLite and tuning policies based on data — that’s doable solo. But the requireApproval part — actually getting human approval — is still not properly implemented.

What I want to try next is Telegram bot integration. When a tool call with requireApproval: true comes in, the Gateway sends an approval request via Telegram and holds the request until the user taps “OK.” The idea is simple, but implementing async waiting in stdio-based MCP requires restructuring. Right now it’s a synchronous flow — request in, response out immediately.
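
For reference, here's the shape of the approval gate I have in mind. It's a sketch, not the Telegram integration itself: waitForApproval and resolveApproval are hypothetical helpers, and whatever listens to the Telegram callback (or a Slack webhook) would be responsible for calling resolveApproval.

// Pending approvals keyed by a request id; the stored resolver completes the Promise
const pendingApprovals = new Map<string, (approved: boolean) => void>();

function waitForApproval(requestId: string, timeoutMs = 120_000): Promise<boolean> {
  return new Promise((resolve) => {
    pendingApprovals.set(requestId, resolve);
    // Deny by default if nobody responds in time
    setTimeout(() => {
      if (pendingApprovals.delete(requestId)) resolve(false);
    }, timeoutMs);
  });
}

// The external channel (e.g. a Telegram bot callback handler) calls this when the user taps OK
function resolveApproval(requestId: string, approved: boolean) {
  const resolve = pendingApprovals.get(requestId);
  if (resolve) {
    pendingApprovals.delete(requestId);
    resolve(approved);
  }
}

// Inside the CallToolRequest handler, the blocking part would look roughly like:
//   if (rule?.requireApproval) {
//     const approved = await waitForApproval(crypto.randomUUID());
//     if (!approved) {
//       return { content: [{ type: "text", text: `Approval denied: ${toolName}` }], isError: true };
//     }
//   }

As far as I can tell, the SDK keeps the request pending as long as the async handler hasn't returned, but a client may time out a call that waits minutes for a human, which is exactly the restructuring problem mentioned above.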

And fundamentally, this only makes sense at the individual developer’s local environment level. For team use, you’d need authentication for the Gateway itself, multi-tenancy, a policy management UI — at that point, you should be using a product, not building one.

When giving AI agents access to tools, “what they can’t do” matters as much as “what they can do.” MCP Gateway is the most pragmatic starting point for the latter, and just adding SQLite makes “what my agent is actually doing” visible. From there, policies can be driven by data.
