How We Automated Model Context Protocol Generation with Claude Code and Daytona
The Challenge of API Integration at Scale
At Coherence, we're building the infrastructure layer that enables products to add intelligent, agentic chat capabilities to their applications. Our goal is to get a chat interface live in your existing application, with access to all your current backend endpoints and APIs, in less than an hour. There are a lot of tricky bits here, such as the chat interface UX, multi-modal and streaming support, first- and third-party API integrations, backend authentication and agent behavior, and easy-to-use client-side SDKs. Today we're going to dive into one interesting core challenge: every customer has unique APIs, data models, and business logic that our AI agents need to understand and interact with.
Enter the Model Context Protocol (MCP) - Anthropic's open standard for connecting AI assistants to external systems. While MCP provides an elegant abstraction for tool integration, manually writing MCP servers for each customer's API would be a Sisyphean task. We needed automation.
This post details our technical approach to automatically generating production-ready MCP servers using Claude Code CLI and Daytona's secure sandbox environments. If you're building AI infrastructure, integrating LLMs with existing systems, or curious about practical applications of AI-powered code generation, this is for you.
What is Coherence, and Why MCP Matters
Coherence is an API-first platform that allows any application to add sophisticated agentic chat capabilities without compromising on security, performance, or agent intelligence. Think of it as the missing layer between your application's APIs and state-of-the-art language models.
The Architecture at 30,000 Feet
```
┌─────────────────┐     ┌──────────────────┐     ┌────────────────────┐
│  Your Frontend  │ ──▶ │  Coherence SDK   │ ──▶ │ Coherence Backend  │
└─────────────────┘     └──────────────────┘     └─────────┬──────────┘
                                                           │
                                                           ▼
                                              ┌────────────────────────┐
                                              │    LangGraph Agent     │
                                              │     with MCP Tools     │
                                              └────────────┬───────────┘
                                                           │
                                                           ▼
                                              ┌────────────────────────┐
                                              │  Generated MCP Server  │
                                              │  (Your API Interface)  │
                                              └────────────┬───────────┘
                                                           │
                                                           ▼
                                              ┌────────────────────────┐
                                              │       Your APIs        │
                                              └────────────────────────┘
```
The critical piece here is the MCP server - it's what allows our agents to understand and interact with your specific APIs, maintaining proper authentication, handling rate limits, and translating between the agent's intent and your API's expectations.
Why We Chose MCP
- Standardization: Instead of building custom integrations for each tool, MCP provides a consistent interface
- Security: MCP servers run in isolated environments with clear boundaries
- Flexibility: Supports everything from simple REST calls to complex stateful operations
- Community: Growing ecosystem of tools and best practices
The MCP Generation Challenge
Here's what we're up against. A typical customer might have:
- 50+ API endpoints across multiple services
- Complex authentication schemes (OAuth, API keys, JWTs, custom headers)
- Business logic that requires specific parameter validation
- Rate limiting and retry requirements
- Custom error handling
Manually writing an MCP server for this would take days and require deep understanding of both the customer's API and MCP's patterns. We needed to automate this.
Enter Daytona: Secure, Ephemeral Development Environments
Our first breakthrough came from partnering with Daytona. For those unfamiliar, Daytona provides standardized, secure development environments that can be spun up programmatically. Think of it as Docker containers on remote servers, specifically designed for agentic workflows.
Why Daytona Was Critical
- Security Isolation: Each MCP generation runs in a completely isolated environment
- Reproducibility: Consistent environment across all generations
- Resource Management: Automatic cleanup and resource limits
- API Access: Programmatic control over environment lifecycle
Here's how we integrate with Daytona:
```python
# Simplified example of our Daytona integration
async def create_mcp_generation_environment():
    workspace = await daytona_client.create_workspace(
        image="our-mcp-generator:latest",
        resources={
            "cpu": "2",
            "memory": "4Gi",
            "timeout": "30m",
        },
        env_vars={
            "CLAUDE_CODE_PATH": "/usr/local/bin/claude-code",
            "OUTPUT_DIR": "/workspace/generated",
        },
    )
    return workspace
```
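Because sandboxes are billed resources, we also make sure every workspace gets torn down, even when generation fails. Here's a sketch of what a cleanup wrapper could look like; the `create_workspace`/`delete_workspace` calls mirror the illustrative client above, not Daytona's actual SDK:

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def mcp_generation_workspace(client, **config):
    # Create the sandbox, hand it to the caller, and guarantee teardown
    # even if generation raises, so no orphaned workspaces accrue.
    # (Client method names are illustrative, not a real SDK signature.)
    workspace = await client.create_workspace(**config)
    try:
        yield workspace
    finally:
        await client.delete_workspace(workspace)
```

With this wrapper, the generation pipeline can simply do `async with mcp_generation_workspace(client, image=...) as ws:` and rely on the `finally` block for cleanup.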
Claude Code CLI: The Brain of Our Operation
The second key piece is Anthropic's Claude Code CLI. While many know Claude as a chat interface, Claude Code is a powerful command-line tool designed specifically for software development tasks.
Why Claude Code CLI?
- Context Window Management: Handles large codebases intelligently
- Tool Use: Native support for file operations, code analysis, and more
- Deterministic Outputs: Consistent code generation patterns
- Error Recovery: Sophisticated retry and correction mechanisms
Our Integration Approach
We use Claude Code in a headless mode within our Daytona environments:
```bash
# Example of how we invoke Claude Code
claude-code -p "OUR PROMPT" --max-iterations 5
```
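From Python, that headless invocation can be wrapped with the standard library. The sketch below mirrors the CLI call above; the timeout is an assumed safety net on our side, not a documented CLI behavior:

```python
import subprocess

def build_claude_command(prompt, max_iterations=5, binary="claude-code"):
    # Mirror the headless invocation shown above; returned as an argv
    # list so the prompt never passes through a shell.
    return [binary, "-p", prompt, "--max-iterations", str(max_iterations)]

def run_generation(prompt, timeout_seconds=1800):
    # Run Claude Code and capture its output. The timeout is an assumed
    # guard against runaway generations, not a CLI feature.
    return subprocess.run(
        build_claude_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
```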
The Technical Deep Dive: MCP Generation Workflow
Now let's get into the meat of how this works. Our MCP generation pipeline consists of several stages:
1. API Specification Analysis
First, we analyze the customer's API documentation. The job here is to extract information about the endpoints available, the parameters supported, and the data returned. This can be a structured spec such as an OpenAPI document, or other inputs such as plain-text descriptions or code snippets from the server. The Claude Code agent is able to work with a wide range of inputs, but the old truth about "garbage in, garbage out" always applies!
```python
def analyze_api_spec(spec_data):
    # Extract endpoints, parameters, auth requirements
    endpoints = extract_endpoints(spec_data)
    auth_scheme = detect_auth_pattern(spec_data)

    # Build a semantic understanding of the API
    api_context = {
        "endpoints": endpoints,
        "auth": auth_scheme,
        "patterns": detect_common_patterns(endpoints),
        "relationships": infer_resource_relationships(endpoints),
    }
    return api_context
```
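For structured specs, a helper like `detect_auth_pattern` can be as simple as walking the OpenAPI `securitySchemes` section. This is a minimal sketch; the return shape is illustrative, not our production format:

```python
def detect_auth_pattern(spec):
    # Infer the auth scheme from an OpenAPI-style spec dict.
    # Return shape is invented for illustration.
    schemes = spec.get("components", {}).get("securitySchemes", {})
    for name, scheme in schemes.items():
        if scheme.get("type") == "http" and scheme.get("scheme") == "bearer":
            return {"kind": "bearer", "scheme_name": name}
        if scheme.get("type") == "apiKey":
            return {"kind": "api_key", "in": scheme.get("in"), "param": scheme.get("name")}
        if scheme.get("type") == "oauth2":
            return {"kind": "oauth2", "flows": list(scheme.get("flows", {}))}
    return {"kind": "none"}
```

Unstructured inputs (prose docs, code snippets) skip this fast path and go straight to the LLM for extraction.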
2. Prompt Engineering for MCP Generation
This is where the magic happens. We've developed a sophisticated prompting framework that guides Claude Code to generate optimal MCP servers. Without revealing our secret sauce, here's the high-level approach:
```python
def build_generation_prompt(api_context):
    # Framework generates prompts with:
    # - API context and patterns
    # - MCP best practices
    # - Error handling requirements
    # - Performance optimizations
    # - Security constraints
    prompt = PromptTemplate(
        system_context=MCP_BEST_PRACTICES,
        api_details=api_context,
        constraints=SECURITY_REQUIREMENTS,
        examples=relevant_examples(api_context),
    )
    return prompt.render()
```
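Our real prompt framework is proprietary, but the composition idea can be sketched with nothing more than the standard library. The template text and field names below are invented for illustration:

```python
from string import Template

# Invented, heavily simplified template; the production framework
# composes many more components (examples, error-handling rules, etc.).
GENERATION_TEMPLATE = Template(
    "You are generating an MCP server.\n"
    "Best practices:\n$practices\n"
    "Endpoints to expose:\n$endpoints\n"
    "Security constraints:\n$constraints\n"
)

def render_generation_prompt(api_context, practices, constraints):
    return GENERATION_TEMPLATE.substitute(
        practices=practices,
        endpoints="\n".join(f"- {e}" for e in api_context["endpoints"]),
        constraints=constraints,
    )
```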
3. Iterative Generation and Validation
Claude Code doesn't just generate code - it iterates and improves:
```python
async def generate_mcp_server(workspace, prompt):
    # Initial generation
    await workspace.run_claude_code(prompt)

    # Validation loop
    for iteration in range(MAX_ITERATIONS):
        validation_result = await validate_generated_code(workspace)
        if validation_result.is_valid:
            break
        # Self-correction
        correction_prompt = build_correction_prompt(validation_result.errors)
        await workspace.run_claude_code(correction_prompt)

    return await workspace.get_generated_files()
```
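`validate_generated_code` combines several checks. The cheapest one, "does the generated file even parse", can be sketched with the standard `ast` module; the "at least one function" rule is an invented stand-in for our real structural checks:

```python
import ast

def validate_python_source(source):
    # First-pass validation of a generated server file. Returns a list
    # of error strings; an empty list means the cheap checks passed.
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error at line {exc.lineno}: {exc.msg}"]
    has_functions = any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        for node in ast.walk(tree)
    )
    return [] if has_functions else ["no tool functions defined"]
```

Errors from this pass feed directly into `build_correction_prompt`, giving Claude Code concrete line numbers to fix.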
4. Testing and Verification
Every generated MCP server goes through rigorous testing:
```python
async def test_mcp_server(server_path, test_cases):
    # Spin up the MCP server
    server_process = await start_mcp_server(server_path)

    # Run test cases
    results = []
    for test in test_cases:
        result = await execute_mcp_command(
            server_process,
            test.tool_name,
            test.parameters,
        )
        results.append(validate_response(result, test.expected))

    return TestReport(results)
```
Real-World Example: E-commerce API Integration
Let's walk through a concrete example. Imagine we're integrating with an e-commerce platform's API:
### Input: OpenAPI Specification
```yaml
openapi: 3.0.0
paths:
  /products:
    get:
      parameters:
        - name: category
          in: query
          schema:
            type: string
        - name: limit
          in: query
          schema:
            type: integer
  /orders:
    post:
      security:
        - bearerAuth: []
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Order'
```
Generated MCP Server (Simplified)
In the example below, the Coherence SDK handles all the hard parts of passing authentication information in real time to the Coherence backend, the LangGraph agent, the MCP servers, and then your backend. You don't need to manage these security-critical transfers, and your users can chat with the same permissions and login they already have. It "just works!"
```python
# Auto-generated by Coherence MCP Generator
import asyncio
from mcp import MCPServer, Tool, ToolResult

class EcommerceMCPServer(MCPServer):
    def __init__(self, api_base_url, auth_token):
        super().__init__()
        self.api_base_url = api_base_url
        self.auth_token = auth_token
        # Register tools
        self.register_tool(self.search_products)
        self.register_tool(self.create_order)

    @Tool(
        name="search_products",
        description="Search for products in the catalog",
        parameters={
            "category": {"type": "string", "description": "Product category"},
            "limit": {"type": "integer", "description": "Max results", "default": 10},
        },
    )
    async def search_products(self, category=None, limit=10):
        params = {"limit": limit}
        if category:
            params["category"] = category
        response = await self.http_client.get(
            f"{self.api_base_url}/products",
            params=params,
        )
        return ToolResult(
            success=True,
            data=response.json(),
        )

    @Tool(
        name="create_order",
        description="Create a new order",
        parameters={
            "items": {"type": "array", "description": "Order items"},
            "shipping_address": {"type": "object", "description": "Shipping details"},
        },
    )
    async def create_order(self, items, shipping_address):
        response = await self.http_client.post(
            f"{self.api_base_url}/orders",
            json={"items": items, "shipping_address": shipping_address},
            headers={"Authorization": f"Bearer {self.auth_token}"},
        )
        return ToolResult(
            success=True,
            data=response.json(),
        )
```
Performance and Scale Considerations
Generating MCP servers at scale requires careful optimization:
1. Caching and Reuse
```python
# We cache common patterns and components
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_auth_handler(auth_type, config):
    # Returns a cached auth handler implementation.
    # Note: config must be hashable (e.g. a tuple or frozen dataclass)
    # for lru_cache to work.
    pass
```
2. Parallel Generation
```python
# Generate multiple tools in parallel
async def generate_tools_parallel(tool_specs):
    tasks = [generate_single_tool(spec) for spec in tool_specs]
    return await asyncio.gather(*tasks)
```
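In practice, unbounded `gather` over a large customer spec can overwhelm the sandbox, so concurrency is worth capping. A bounded variant with a semaphore (the default limit of 4 is an assumed tuning knob, not a measured value):

```python
import asyncio

async def generate_tools_bounded(tool_specs, generate_one, limit=4):
    # Same fan-out as above, but a semaphore caps how many generations
    # run at once so one large spec can't exhaust workspace resources.
    semaphore = asyncio.Semaphore(limit)

    async def guarded(spec):
        async with semaphore:
            return await generate_one(spec)

    return await asyncio.gather(*(guarded(spec) for spec in tool_specs))
```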
3. Resource Management
- Daytona workspaces are auto-scaled
- Generation timeout limits prevent runaway processes
- Automatic cleanup of failed generations and resource archiving for cost management
Lessons Learned and Best Practices
After generating our first batches of MCP servers, here's what we've learned:
1. Prompt Engineering is Everything
The quality of generated code is directly proportional to prompt quality. We maintain a library of prompt components that we compose for different scenarios.
2. Validation is Non-Negotiable
Every generated server must pass:
- Static type checking
- Security scanning
- Functional tests
- Performance benchmarks
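Ordering matters here: cheap checks should fail fast before expensive ones run. A minimal sketch of such a gate, where the check names and the `(ok, detail)` protocol are invented for illustration:

```python
def run_validation_gate(source, checks):
    # Run validation checks in order, stopping at the first failure.
    # `checks` is a list of (name, fn) pairs; each fn returns (ok, detail).
    for name, check in checks:
        ok, detail = check(source)
        if not ok:
            return {"failed": name, "detail": detail}
    return {"failed": None}
```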
3. Human-in-the-Loop for Edge Cases
While automation handles 90% of cases, complex APIs benefit from human review. We've built tooling to make this review process efficient. We've also built a UI that lets Coherence users view and edit their MCP code directly and deploy new versions whenever they want.
4. Version Control and Rollback
Generated servers are version-controlled with clear rollback procedures. This is critical when APIs change. In the Coherence UI you can see a list of previous versions, their timestamps, status, and other info. You can generate new versions and roll back at any time.
The Future: Where We're Heading
1. Self-Improving Generation
We're building systems where generated MCP servers learn from usage patterns and self-optimize.
2. Multi-Modal MCP
Beyond text inputs, we're integrating input and output of images, video, voice, and more. We're also adding the ability to modify the DOM in real time with UI automation.
3. Open Source Contributions
We're working with Anthropic to contribute improvements back to the MCP ecosystem.
The Power of Composable AI Infrastructure
The combination of Daytona's secure environments and Claude Code's generation capabilities has allowed us to solve what seemed like an intractable problem: making any API instantly accessible to AI agents.
This approach - using AI to build AI infrastructure - represents a new paradigm in software development. We're not just writing code; we're building systems that write code, with all the quality and security guarantees of human-written software.
If you're building in the AI space, I encourage you to think about similar multiplicative approaches. What manual processes in your workflow could be automated with the right combination of tools?
Try It Yourself
Interested in adding intelligent chat to your application? Check out [Coherence](https://withcoherence.com) - we handle all the complexity described above, so you can focus on building great products.
Want to experiment with MCP? Start with [Anthropic's MCP documentation](https://modelcontextprotocol.org) and try building a simple server manually first.
Building agentic computing or development environments? [Daytona](https://daytona.io) is revolutionizing how we think about containerized infrastructure.
---
*Have questions or want to discuss MCP generation? Find me on [Twitter](https://twitter.com/zacharyzaro) or [HN](https://news.ycombinator.com/user?id=zoomzoom). If you're solving similar problems, I'd love to hear your approach!*