MCP Code Execution in Practice: Improving Claude Code Project Structure
Learn how to apply Anthropic's MCP Code Execution patterns to real projects with practical examples of directory structure improvements and security configurations.
Overview
In the previous post about Anthropic’s Code Execution with MCP, we explored the theoretical foundations of this paradigm-shifting approach to AI tool integration. That post covered how the traditional sequential tool calling pattern results in token explosion, increased latency, and context pollution, while Code Execution achieves 98.7% token reduction and 60% execution time improvement.
This follow-up post focuses on practical implementation. We’ll examine the actual changes made to a Claude Code project’s .claude/ directory structure, demonstrating how to apply MCP Code Execution patterns in real-world scenarios.
Structure Improvements Overview
Based on the Code Execution with MCP research, three new directories were added to the .claude/ configuration:
.claude/
├── agents/ # 17 specialized agents
├── skills/ # 4 modular capabilities (auto-discovery)
├── commands/ # 7 user workflows
├── tools/ # NEW: MCP Tool Wrapper (Code Execution pattern)
├── patterns/ # NEW: Code Execution implementation patterns
├── security/ # NEW: Security guidelines (sandbox, input validation)
├── guidelines/ # Documentation
└── settings.local.json
New Directory Purposes
| Directory | Purpose | Key Files |
|---|---|---|
tools/ | MCP Tool Wrapper pattern documentation | README.md |
patterns/ | Implementation patterns | code-execution.md, progressive-loading.md |
security/ | Security configurations | sandbox-config.md, input-validation.md |
Additionally, research documentation was created at research/anthropic-code-execution-with-mcp/ containing the README, key concepts, and improvement log.
The tools/ Directory: MCP Tool Wrapper Pattern
The tools/ directory implements three core concepts from Anthropic’s Code Execution pattern.
1. Filesystem-based Tool Discovery
Tools are organized by file structure for automatic discovery:
tools/
├── database/
│ ├── query.ts
│ └── update.ts
├── api/
│ └── fetch.ts
└── file/
├── read.ts
└── write.ts
This structure allows the AI to understand available tools through the filesystem hierarchy rather than loading all tool descriptions into context upfront.
2. Progressive Loading Pattern
The most impactful optimization: loading only the tools you need.
// Traditional: All 100 tools loaded (40,000 tokens)
const tools = {
database: { description: "...", params: {...} }, // 500 tokens
api: { description: "...", params: {...} }, // 400 tokens
file: { description: "...", params: {...} }, // 300 tokens
// ... 97 more tools
};
// Progressive Loading: Only 3 tools (1,200 tokens)
import { query } from './tools/database'; // 500 tokens
import { fetch } from './tools/api'; // 400 tokens
import { write } from './tools/file'; // 300 tokens
Result: 95% context reduction
The actual reduction depends on the ratio of used tools to total tools:
| Total Tools | Used Tools | Traditional Tokens | Progressive Tokens | Reduction |
|---|---|---|---|---|
| 10 | 3 | 4,000 | 1,200 | 70% |
| 50 | 5 | 20,000 | 2,000 | 90% |
| 100 | 3 | 40,000 | 1,200 | 97% |
3. Tool Wrapper Pattern
Each tool includes standardized metadata for consistent interfaces:
// tools/custom/my-tool.ts
import { z } from 'zod';
export const myTool = {
name: 'custom.my-tool',
description: 'My custom tool description',
parameters: z.object({
input: z.string().describe('Input parameter'),
options: z.object({
flag: z.boolean().default(false)
}).optional()
}),
async execute({ input, options }) {
// Input validation
if (!input) {
throw new Error('Input is required');
}
// Business logic
const result = await processInput(input, options);
// Return summary only (not full data)
return {
success: true,
summary: `Processed ${result.count} items`
};
}
};
Key principles:
- Zod schema validation for type-safe parameters
- Summary return instead of full data (critical for token reduction)
- Consistent interface across all tools
The patterns/ Directory: Implementation Guides
Code Execution Pattern
The code-execution.md document explains the fundamental shift from sequential tool calling to code-based orchestration.
flowchart LR
A[User Request] --> B[Model Analysis]
subgraph Traditional["Traditional Approach (150K tokens, 45s)"]
B --> C[Tool 1]
C --> D[Result → Context]
D --> E[Model Analysis]
E --> F[Tool 2]
F --> G[Result → Context]
G --> H[...]
end
subgraph CodeExec["Code Execution (2K tokens, 15s)"]
B --> I[Code Generation]
I --> J[Sandbox Execution]
J --> K[Summary Return]
end
Code Generation Example
Instead of calling tools sequentially, the AI generates code that orchestrates multiple tools:
// Model-generated code
import { query } from './tools/database';
import { updateUser } from './tools/api';
let successCount = 0;
// Local loop (executes without model calls)
for (const record of await query("SELECT * FROM users LIMIT 100")) {
if (record.status === 'active') {
const result = await updateUser(record.id, {
last_checked: new Date()
});
if (!result.error) {
successCount++;
}
}
}
// Return summary only
return `Updated ${successCount} active users`;
Why this is efficient:
- 100 database reads + 15 API updates = 2 model calls (code gen + summary)
- All loops and conditionals execute locally in the sandbox
- Intermediate results never enter the model context
Progressive Loading Pattern
The progressive-loading.md provides detailed guidance on implementing modular tool loading:
// tools/database/index.ts - Module entry point
export { query } from './query';
export { update } from './update';
// tools/index.ts - Root entry point
export * as database from './database';
export * as api from './api';
// Usage: Import only what you need
import { query } from './tools/database';
// OR with namespace
import { database } from './tools';
const result = await database.query.execute({ sql: '...' });
Best practices documented:
- Module separation by functionality
- Lazy loading with dynamic imports
- Tree-shaking support through explicit exports
- Type information separation for minimal overhead
The security/ Directory: Protection Configurations
Security is critical when AI generates and executes code. The security directory provides two essential guides.
Sandbox Configuration
The sandbox-config.md details the four layers of sandbox security:
1. Process Isolation
const sandbox = createSandbox({
runtime: 'node',
isolation: 'bubblewrap', // Linux
// isolation: 'seatbelt', // macOS
});
2. Filesystem Limits
filesystem: {
readOnly: [
'/tools', // Tool definitions
'/node_modules' // Dependencies
],
readWrite: [
'/tmp', // Temporary files
'/workspace' // Working directory
],
deny: [
'~', // Home directory
'/etc', // System config
'/.env' // Environment variables
]
}
3. Network Control
network: {
allowedHosts: [
'api.anthropic.com',
'mcp.company.com'
],
allowedPorts: [443, 80],
denyOutbound: false,
denyInbound: true
}
4. Resource Limits
resources: {
timeout: 30000, // 30-second max execution
memory: '512MB', // Memory cap
cpu: 1, // CPU core limit
maxFiles: 100, // Open file limit
maxProcesses: 10 // Subprocess limit
}
Project-Specific Sandbox
For blog automation workflows:
const blogSandbox = createSandbox({
runtime: 'node',
timeout: 60000, // 1 minute (includes image generation)
memory: '1GB',
filesystem: {
readOnly: [
'.claude/tools',
'.claude/skills',
'src/content/blog',
'src/assets/blog'
],
readWrite: [
'/tmp',
'src/content/blog', // Post creation
'src/assets/blog', // Image storage
'post-metadata.json',
'recommendations.json'
]
},
network: {
allowedHosts: [
'api.brave.com', // Brave Search
'generativelanguage.googleapis.com', // Gemini API
'analyticsdata.googleapis.com' // GA4
]
}
});
Input Validation
The input-validation.md addresses a critical finding from Anthropic’s security research: 43% of AI-generated code contains command injection vulnerabilities.
Common Vulnerability Types
- Command Injection (43%)
// Vulnerable
const result = await exec(`cat ${userInput}`);
// Safe
const allowedFiles = ['data.csv', 'report.txt'];
if (!allowedFiles.includes(userInput)) {
throw new Error('Invalid file');
}
await readFile(userInput);
- SQL Injection
// Vulnerable
const query = `SELECT * FROM users WHERE id = ${userId}`;
// Safe
const query = 'SELECT * FROM users WHERE id = ?';
const result = await db.query(query, [userId]);
- Path Traversal
// Vulnerable
const path = `./uploads/${filename}`;
// Safe
const safeName = path.basename(filename);
const fullPath = path.join('./uploads', safeName);
if (!fullPath.startsWith('./uploads/')) {
throw new Error('Invalid path');
}
Zod Schema Validation
The recommended approach using Zod for type-safe validation:
import { z } from 'zod';
const QueryParams = z.object({
sql: z.string()
.min(1, 'Query cannot be empty')
.max(1000, 'Query too long')
.regex(/^SELECT/i, 'Only SELECT allowed')
.refine(
sql => !sql.includes(';'),
'Multiple statements not allowed'
),
limit: z.number()
.int()
.min(1)
.max(1000)
.default(100)
});
export async function query(params: unknown) {
const { sql, limit } = QueryParams.parse(params);
// Safe to execute
}
Project-Specific Validations
For blog automation:
// Slug validation
const SlugSchema = z.string()
.min(1)
.max(100)
.regex(/^[a-z0-9-]+$/, 'Slug must be lowercase alphanumeric with hyphens')
.refine(
s => !s.startsWith('-') && !s.endsWith('-'),
'Slug cannot start or end with hyphen'
);
// Date validation
const PubDateSchema = z.string()
.regex(/^\d{4}-\d{2}-\d{2}$/, 'Date must be YYYY-MM-DD format')
.refine(date => {
const parsed = new Date(date);
return !isNaN(parsed.getTime());
}, 'Invalid date');
// File path validation
const BlogPostPathSchema = z.string()
.refine(p => {
const normalized = path.normalize(p);
const basePath = 'src/content/blog';
const fullPath = path.join(basePath, normalized);
return fullPath.startsWith(basePath);
}, 'Invalid blog post path');
Practical Application Results
Token Reduction Summary
| Metric | Before | After | Improvement |
|---|---|---|---|
| Tool descriptions | 40,000 tokens | 2,000 tokens | 95% reduction |
| Workflow execution | 90,000 tokens | 18,000 tokens | 80% reduction |
| API costs | $7.50 | $0.10 | 75x savings |
| Execution time | 45 seconds | 15 seconds | 67% faster |
Expected Improvements for This Project
With 7 MCP servers averaging 10 tools each:
- Total tools: 70
- Average tools used per workflow: 5
- Expected reduction: 93%
Combined with existing optimizations:
- Metadata-first architecture: 60-70% token savings
- Incremental processing: 79% savings (on unchanged content)
- Caching strategy: 58% savings
Overall projected efficiency: 85-95% token reduction
Security Posture Improvement
| Risk Area | Before | After | Mitigation |
|---|---|---|---|
| Command injection | High (43%) | Low | Input validation, whitelist |
| Unauthorized file access | Medium | Low | Sandbox filesystem limits |
| Network exfiltration | Medium | Low | Network allowlist |
| Resource exhaustion | Medium | Low | CPU/memory/timeout limits |
Future Plans
Short-term (1〜2 weeks)
-
Tool Wrapper Conversion: Convert existing Python scripts in
skills/blog-writing/scripts/to TypeScript tool wrappers with standardized interfaces -
Sandbox Integration: Implement actual sandbox environment for AI-generated code execution
Medium-term (1〜2 months)
-
Performance Benchmarks: Measure before/after metrics for:
- Token consumption per workflow
- Execution time
- API costs
-
Additional Patterns: Document and implement:
- State persistence across executions
- Error recovery patterns
- Agent-to-agent communication optimization
Long-term Vision
-
Full Code Execution Adoption: Migrate complex workflows like
/write-postto generate orchestration code instead of sequential tool calls -
Security Hardening: Implement audit logging, rate limiting, and automated vulnerability scanning
Conclusion
Applying Anthropic’s MCP Code Execution patterns to a real project involves more than understanding the theory; it requires concrete structural changes and security considerations.
The three new directories added to .claude/:
- tools/: Implements Progressive Loading for 95% context reduction
- patterns/: Documents Code Execution for 98.7% token savings
- security/: Addresses the 43% vulnerability rate with proper sandboxing and validation
These changes transform the project’s architecture from traditional sequential tool calling to a more efficient code-based orchestration model. The combination of Progressive Loading, sandbox isolation, and input validation creates a system that is both more efficient and more secure.
For the theoretical foundations behind these patterns, refer to the original post on Code Execution with MCP.
Related Resources:
Was this helpful?
Your support helps me create better content. Buy me a coffee! ☕