Analyzing mcp-builder: A Complete Guide to MCP Server Development
Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. This comprehensive analysis covers the mcp-builder skill's 4-phase workflow, Python implementation patterns, and practical evaluation strategies.
📚 Source Information
ℹ️ This article was automatically imported and translated using Claude AI.
mcp-builder is a Claude skill that provides comprehensive guidance for creating high-quality MCP (Model Context Protocol) servers. This skill enables LLMs to interact with external services through well-designed tools, supporting both Python (FastMCP) and Node/TypeScript (MCP SDK) implementations.
This is a production-ready skill from the Anthropic skills repository, designed to guide developers through the complete MCP server development lifecycle from planning to evaluation.
Overview
What is mcp-builder?
Based on the description: Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
Core Purpose
The mcp-builder skill aims to:
- Guide developers through agent-centric MCP server design
- Provide comprehensive implementation patterns for Python and TypeScript
- Establish evaluation-driven development practices
- Create tools optimized for LLM context limitations
- Enable systematic external service integration
Target Audience
This skill is designed for:
- Developers building MCP servers for external API integration
- Teams creating reusable tools for Claude and other LLMs
- Engineers implementing Model Context Protocol specifications
- Anyone interested in LLM-external service communication patterns
Skill Anatomy
Directory Structure
SKILL.md Structure
Every skill begins with metadata in YAML frontmatter:
---
name: mcp-builder
description: "Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK)."
license: Complete terms in LICENSE.txt
---
Key Components
Scripts
Scripts provide deterministic, reusable code that Claude can execute. mcp-builder bundles two Python scripts:
- connections.py: Abstracts MCP connection handling across transport protocols (stdio, SSE, HTTP)
- evaluation.py: Evaluation harness for testing how effectively Claude can use an MCP server
Technical Deep Dive
4-Phase MCP Server Development Workflow
The skill follows a structured workflow with four major phases:
Phase 1: Deep Research and Planning - Understand agent-centric design principles, study MCP protocol and API documentation, create comprehensive implementation plan
Phase 2: Implementation - Set up project structure, implement core infrastructure, build tools systematically following language-specific best practices
Phase 3: Review and Refine - Code quality review, test and build verification, quality checklist validation
Phase 4: Create Evaluations - Design comprehensive evaluation scenarios, create 10 complex questions, generate evaluation XML
How It Works
mcp-builder demonstrates sophisticated skill design with progressive disclosure, extensive reference documentation, and practical tooling.
Trigger Detection: Claude identifies when this skill should be used based on MCP server development queries and the detailed description
Context Loading: SKILL.md content (13KB, 329 lines) loads into Claude's context window with comprehensive workflow documentation
Resource Access: Reference documentation loaded on-demand during implementation phases for Python, TypeScript, and evaluation patterns
Execution: Claude follows the 4-phase process, using bundled scripts for connection handling and evaluation as needed
Phase 1: Deep Research and Planning
This phase establishes the foundation for agent-centric MCP server design:
1.1 Understand Agent-Centric Design Principles
Build for Workflows
Don't simply wrap API endpoints—build thoughtful, high-impact workflow tools that consolidate related operations and enable complete tasks
Optimize for Limited Context
Agents have constrained context windows—return high-signal information, provide concise/detailed options, and treat context budget as a scarce resource
Design Actionable Error Messages
Error messages should guide agents toward correct usage patterns with specific next steps and educational feedback
Follow Natural Task Subdivisions
Tool names should reflect human task thinking with consistent prefixes for discoverability around natural workflows
Use Evaluation-Driven Development
Create realistic evaluation scenarios early and let agent feedback drive tool improvements through rapid prototyping and iteration
1.3-1.6 Research and Planning Activities
The skill guides developers through:
- MCP Protocol Documentation: Fetch from https://modelcontextprotocol.io/llms-full.txt
- Framework Documentation: Load SDK-specific guides for Python or TypeScript
- API Documentation: Exhaustively study target service API documentation
- Implementation Plan: Create detailed plans for tool selection, shared utilities, input/output design, and error handling
Phase 2: Implementation
This phase focuses on systematic MCP server construction:
2.1 Set Up Project Structure
Python:
- Create a single .py file or organize into modules
- Use the MCP Python SDK for tool registration
- Define Pydantic models for input validation
Node/TypeScript:
- Create a proper project structure with package.json and tsconfig.json
- Use the MCP TypeScript SDK
- Define Zod schemas for input validation
2.3 Implement Tools Systematically
Each tool requires careful design of input schemas, comprehensive documentation, and proper error handling
For each tool, developers should:
- Define Input Schema: Use Pydantic (Python) or Zod (TypeScript) with proper constraints and examples
- Write Comprehensive Docstrings: Include one-line summary, detailed explanation, parameter types with examples, return schema, usage examples, and error handling
- Implement Tool Logic: Use shared utilities, follow async/await patterns, support multiple response formats, respect pagination, and check character limits
- Add Tool Annotations: Include readOnlyHint, destructiveHint, idempotentHint, and openWorldHint as appropriate
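The "multiple response formats" and concise/detailed ideas from the list above can be illustrated with a small, dependency-free sketch. The sample records, field names, and the format_issues function are hypothetical and only stand in for whatever a real tool would return; this is one possible shape, not the skill's prescribed implementation.

```python
import json

# Hypothetical sample records; field names are illustrative, not from a real API.
SAMPLE_ISSUES = [
    {"id": 101, "title": "Login fails on mobile", "status": "open",
     "assignee": "dana", "body": "Steps to reproduce: ..."},
    {"id": 102, "title": "Add dark mode", "status": "closed",
     "assignee": "lee", "body": "Requested by several users."},
]


def format_issues(issues, response_format="markdown", detail="concise"):
    """Render issues as JSON or Markdown at a concise or detailed level."""
    if response_format not in ("json", "markdown"):
        raise ValueError("response_format must be 'json' or 'markdown'")
    if detail not in ("concise", "detailed"):
        raise ValueError("detail must be 'concise' or 'detailed'")

    # Concise output keeps only the high-signal fields.
    if detail == "concise":
        keys = ("id", "title", "status")
    else:
        keys = ("id", "title", "status", "assignee", "body")
    trimmed = [{key: issue[key] for key in keys} for issue in issues]

    if response_format == "json":
        return json.dumps(trimmed, indent=2)

    lines = []
    for issue in trimmed:
        lines.append(f"- #{issue['id']} {issue['title']} ({issue['status']})")
        if detail == "detailed":
            lines.append(f"  assignee: {issue['assignee']}")
            lines.append(f"  {issue['body']}")
    return "\n".join(lines)
```

Defaulting to concise Markdown keeps routine calls cheap in context, while an agent that needs every field can opt in to the detailed JSON form.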
Phase 3: Review and Refine
MCP servers are long-running processes that wait for requests over stdio or SSE/HTTP. Running one directly in your main process (for example, python server.py) will block indefinitely while it waits for input.
3.2 Test and Build - Critical Considerations
Safe Testing Approaches:
- Use the evaluation harness (recommended)
- Run the server in tmux to keep it outside the main process
- Use timeout when testing:
timeout 5s python server.py
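The same timeout idea can be sketched in Python for use inside a test script. The helper name starts_and_stays_up and the SLEEPER command are assumptions made for illustration; SLEEPER simply simulates a long-running server so the sketch runs without a real MCP server on disk.

```python
import subprocess
import sys

# Stand-in for a real server command such as [sys.executable, "server.py"].
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def starts_and_stays_up(command, seconds=0.5):
    """Return True if the process is still running after `seconds`.

    A long-running server is expected to outlive the timeout; an early
    exit usually means it crashed during startup.
    """
    with subprocess.Popen(
        command,
        stdin=subprocess.PIPE,          # keep stdin open so a stdio server does not see EOF
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ) as proc:
        try:
            proc.wait(timeout=seconds)
            return False                # exited on its own before the timeout
        except subprocess.TimeoutExpired:
            proc.kill()                 # still alive: stop it so the caller is not blocked
            return True
```

This only checks that the process survives startup. It does not exercise any tools, which is what the evaluation harness is for.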
Python Verification:
python -m py_compile your_server.py
TypeScript Build:
npm run build # Verify dist/index.js is created
Phase 4: Create Evaluations
The most sophisticated aspect of mcp-builder is its evaluation framework:
4.2 Create 10 Evaluation Questions
Each question must meet six strict requirements:
- Independent: Not dependent on other questions
- Read-only: Only non-destructive operations required
- Complex: Requiring multiple tool calls and deep exploration
- Realistic: Based on real use cases humans would care about
- Verifiable: Single, clear answer that can be verified by string comparison
- Stable: Answer won't change over time
4.4 Output Format
Evaluations create XML files with this structure:
<evaluation>
<qa_pair>
<question>Find discussions about AI model launches with animal codenames...</question>
<answer>3</answer>
</qa_pair>
<!-- More qa_pairs... -->
</evaluation>
Script Analysis
connections.py - Multi-Transport MCP Connection Handler
This 152-line Python module provides sophisticated connection abstraction:
from abc import ABC

class MCPConnection(ABC):
    """Base class for MCP server connections."""

    async def __aenter__(self):
        """Initialize MCP server connection."""
        # Handles AsyncExitStack, session initialization, and error cleanup
Key Features:
- Abstract base class pattern for consistent interface
- Support for three transport protocols: stdio, SSE, HTTP
- Async context manager for resource cleanup
- Automatic session initialization and error handling
- Factory function create_connection() for transport-agnostic instantiation
Transport Classes:
- MCPConnectionStdio: Standard input/output for local servers
- MCPConnectionSSE: Server-Sent Events for streaming connections
- MCPConnectionHTTP: Streamable HTTP for web-based MCP servers
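The pattern described above (abstract base class, async context manager, and a factory keyed by transport name) can be sketched with the standard library alone. The classes below are simplified stand-ins written for this article, not the contents of connections.py: the "sessions" are plain strings, and the class names and constructor parameters are assumptions.

```python
import asyncio
from abc import ABC, abstractmethod
from contextlib import AsyncExitStack, asynccontextmanager


class SketchConnection(ABC):
    """Simplified stand-in for a transport-agnostic connection base class."""

    def __init__(self):
        self.session = None
        self._stack = None

    @abstractmethod
    def _open(self):
        """Return an async context manager that yields a session object."""

    async def __aenter__(self):
        self._stack = AsyncExitStack()
        await self._stack.__aenter__()
        try:
            self.session = await self._stack.enter_async_context(self._open())
        except BaseException:
            # Unwind anything already opened before re-raising.
            await self._stack.aclose()
            raise
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self._stack.aclose()
        self.session = None


class SketchStdio(SketchConnection):
    def __init__(self, command, args=None):
        super().__init__()
        self.command = command
        self.args = list(args or [])

    @asynccontextmanager
    async def _open(self):
        # Placeholder: a real implementation would spawn the server process here.
        yield " ".join(["stdio:" + self.command, *self.args])


class SketchHttp(SketchConnection):
    def __init__(self, url, headers=None):
        super().__init__()
        self.url = url
        self.headers = dict(headers or {})

    @asynccontextmanager
    async def _open(self):
        # Placeholder: a real implementation would open an HTTP session here.
        yield "http:" + self.url


def create_connection(transport, **kwargs):
    """Factory: map a transport name to a connection class."""
    transports = {"stdio": SketchStdio, "http": SketchHttp}
    if transport not in transports:
        raise ValueError(f"unknown transport {transport!r}; expected one of {sorted(transports)}")
    return transports[transport](**kwargs)
```

Callers then write `async with create_connection("stdio", command="python", args=["server.py"]) as conn:` and never need to know which transport class sits behind the session.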
evaluation.py - Comprehensive MCP Server Evaluation Harness
This 374-line module provides end-to-end evaluation capabilities:
The evaluation harness tests whether LLMs can effectively use MCP servers to answer realistic, complex questions.
Core Components:
- Evaluation Prompt: Sophisticated system prompt requiring:
  - Tool usage with step-by-step summaries
  - Constructive feedback on tool design
  - Properly formatted XML responses
- XML Parsing: Robust parsing of evaluation files with error handling
- Agent Loop: Async interaction with Claude API and MCP server tools
- Metrics Collection: Comprehensive tracking of:
  - Accuracy rates
  - Task durations
  - Tool call counts and performance
  - Feedback quality
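The XML parsing and accuracy pieces listed above can be sketched as follows. This is a minimal illustration built on the qa_pair structure shown earlier, not the harness's actual code: the function names, the sample questions, and the exact-match rule after whitespace stripping are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical evaluation file content following the <qa_pair> structure.
SAMPLE_XML = """
<evaluation>
  <qa_pair>
    <question>How many open issues mention timeouts?</question>
    <answer>3</answer>
  </qa_pair>
  <qa_pair>
    <question>Which user closed the oldest ticket?</question>
    <answer>dana</answer>
  </qa_pair>
</evaluation>
"""


def parse_evaluation(xml_text):
    """Return a list of {"question", "answer"} dicts from evaluation XML."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for qa in root.iter("qa_pair"):
        question = qa.findtext("question")
        answer = qa.findtext("answer")
        if question is None or answer is None:
            continue  # skip malformed pairs instead of failing the whole run
        pairs.append({"question": question.strip(), "answer": answer.strip()})
    return pairs


def accuracy(pairs, responses):
    """Fraction of responses that exactly match the expected answer."""
    if not pairs:
        return 0.0
    correct = sum(
        1 for pair, got in zip(pairs, responses) if got.strip() == pair["answer"]
    )
    return correct / len(pairs)
```

Exact string comparison is what makes the "Verifiable" and "Stable" requirements matter: an answer that can be phrased several ways, or that drifts over time, would be scored as wrong.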
Usage Example:
# Evaluate a local stdio MCP server
python evaluation.py -t stdio -c python -a my_server.py eval.xml
# Evaluate an SSE MCP server with authentication
python evaluation.py -t sse -u https://example.com/mcp \
-H "Authorization: Bearer token" eval.xml
Usage Examples
Basic Usage - Creating an MCP Server
mcp-builder guides you through the complete MCP server development process
Start with Research: Study MCP protocol documentation and your target API
Design Agent-Centric Tools: Focus on workflows, not just API endpoint mapping
Implement Systematically: Follow language-specific best practices with proper validation
Evaluate Thoroughly: Create 10+ complex evaluation scenarios and test with the harness
Advanced Scenario - Complex API Integration
When integrating complex APIs, mcp-builder emphasizes consolidation and workflow thinking
Example: Calendar Integration
Instead of separate tools:
- ❌ check_availability, create_event, send_invitation
Create workflow-oriented tools:
- ✅ schedule_meeting: Checks availability, creates event, and sends invitations in one operation
This approach:
- Reduces context window usage
- Minimizes agent decision points
- Provides better error handling
- Enables complete task accomplishment
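A rough sketch of the consolidated schedule_meeting tool might look like this. FakeCalendar is a hypothetical in-memory stand-in for a calendar API, and the return shape is an assumption; the point is only that one call covers the availability check, event creation, and invitations.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCalendar:
    """In-memory stand-in for a calendar API (hypothetical)."""
    busy: dict = field(default_factory=dict)        # attendee -> set of booked slots
    events: list = field(default_factory=list)
    invitations: list = field(default_factory=list)

    def is_free(self, attendee, slot):
        return slot not in self.busy.get(attendee, set())

    def create_event(self, title, slot, attendees):
        event_id = len(self.events) + 1
        self.events.append({"id": event_id, "title": title, "slot": slot})
        for person in attendees:
            self.busy.setdefault(person, set()).add(slot)
        return event_id

    def send_invitation(self, event_id, attendee):
        self.invitations.append((event_id, attendee))


def schedule_meeting(calendar, title, slot, attendees):
    """Check availability, create the event, and invite everyone in one call."""
    conflicts = [a for a in attendees if not calendar.is_free(a, slot)]
    if conflicts:
        # Actionable error: say who is busy and what to try next.
        return {
            "ok": False,
            "error": f"Busy at {slot}: {', '.join(conflicts)}. "
                     "Try a different slot or drop the busy attendees.",
        }
    event_id = calendar.create_event(title, slot, attendees)
    for attendee in attendees:
        calendar.send_invitation(event_id, attendee)
    return {"ok": True, "event_id": event_id, "invited": len(attendees)}
```

Because the conflict check happens before anything is written, a failed call leaves the calendar untouched and the agent can simply retry with another slot.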
Best Practices
Based on the design of mcp-builder, here are key principles for MCP server development:
Agent-Centric Design Principles
Context Budget Awareness
Treat the agent's context window as a scarce resource. Return only essential information and provide concise/detailed options
Workflow Consolidation
Combine related operations into single tools that enable complete tasks rather than exposing raw API endpoints
Educational Error Messages
Make errors actionable with specific guidance: "Try using filter='active_only' to reduce results"
Progressive Disclosure
Provide summary information by default with options for detailed exploration when needed
Idempotent Operations
Design tools that can be safely retried without side effects when appropriate
Implementation Best Practices
Input Validation: Use Pydantic (Python) or Zod (TypeScript) with proper constraints, examples, and descriptive field documentation
Comprehensive Documentation: Every tool needs one-line summary, detailed explanation, parameter types with examples, return schema, usage examples, and error handling guidance
Async Patterns: Use async/await for all I/O operations to prevent blocking
Error Handling: Implement graceful failure modes with clear, LLM-friendly error messages that prompt further action
Response Formatting: Support both JSON and Markdown formats with configurable detail levels
Common Pitfalls
Common mistakes developers make when building MCP servers
API-First Design (Wrong Approach)
Symptom: Tools map 1:1 to API endpoints
Problem: Forces agents to understand API structure and make multiple calls for simple workflows
Solution: Design tools around workflows and tasks, not API endpoints
Excessive Information Return
Symptom: Tools return complete API responses with all fields
Problem: Wastes context window on irrelevant data
Solution: Return high-signal information only, provide detailed/concise options
Poor Error Messages
Symptom: "Error 404: Not Found"
Problem: Agents don't know how to recover
Solution: "Resource not found. Try using list_resources() to see available options"
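One way to apply this is to translate raw status codes into guidance before the message reaches the agent. The sketch below is illustrative only: actionable_error is a hypothetical helper, list_resources() is the example tool name from the text above, and the wording of each hint is an assumption.

```python
def actionable_error(status_code, resource="Resource"):
    """Map an HTTP status code to a message that suggests a next step."""
    hints = {
        401: "Authentication failed. Check that the API token is set and has not expired.",
        403: f"Access to {resource} is denied. Ask for a token with the required permissions.",
        404: f"{resource} not found. Try list_resources() to see available options.",
        429: "Rate limit reached. Wait before retrying, or request fewer items per call.",
    }
    default = (
        f"Request failed with status {status_code}. "
        "Retry once; if it fails again, report the status code."
    )
    return hints.get(status_code, default)
```

Even the fallback message names a concrete next action, so the agent is never left with a bare status code.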
Missing Tool Annotations
Symptom: Tools lack readOnlyHint, destructiveHint, etc.
Problem: Claude cannot optimize tool usage patterns
Solution: Add appropriate annotations to all tools
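As a rough illustration, the four hints can be kept in one place and sanity-checked before they are attached to tools. make_annotations, its default values, and the two tool names are hypothetical; how the resulting dictionary is passed to a given SDK is not shown here and depends on that SDK.

```python
def make_annotations(read_only=False, destructive=False,
                     idempotent=False, open_world=True):
    """Build an annotations mapping and reject contradictory combinations."""
    if read_only and destructive:
        raise ValueError("a read-only tool cannot also be destructive")
    return {
        "readOnlyHint": read_only,
        "destructiveHint": destructive,
        "idempotentHint": idempotent,
        "openWorldHint": open_world,
    }


# Hypothetical tools: a search that changes nothing, and a delete that is
# destructive but safe to repeat (deleting twice leaves the same state).
TOOL_ANNOTATIONS = {
    "search_issues": make_annotations(read_only=True, idempotent=True),
    "delete_issue": make_annotations(destructive=True, idempotent=True),
}
```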
Insufficient Testing
Symptom: No evaluation harness, manual testing only
Problem: Cannot measure tool effectiveness or iterate based on feedback
Solution: Create comprehensive evaluations using mcp-builder's evaluation.py
Integration with Other Skills
mcp-builder works well with:
- skill-creator - For creating new skills based on MCP patterns
- skill-article-writer - For documenting MCP server implementations
- api-contract-manager - For API specification management
- testing-frameworks - For comprehensive MCP server testing
Real-World Applications
Use Case 1: Enterprise API Integration
Scenario: A company wants to enable Claude to interact with their internal CRM system
mcp-builder Workflow:
- Research: Study CRM API documentation and identify common workflows (create lead, update contact, query opportunities)
- Design: Create workflow-oriented tools like qualify_lead (combines data enrichment, scoring, and CRM updates)
- Implement: Build Python MCP server with Pydantic validation and comprehensive error handling
- Evaluate: Create 10+ scenarios testing lead qualification, opportunity tracking, and reporting capabilities
Outcome: Agents can accomplish complex CRM tasks through natural conversation without understanding API details
Use Case 2: Data Analysis Platform
Scenario: Building an MCP server for a business intelligence platform
mcp-builder Workflow:
- Research: Understand data schemas, query capabilities, and common analysis patterns
- Design: Tools like analyze_trend (combines data extraction, statistical analysis, and visualization)
- Implement: Async Python server handling large dataset queries with pagination and truncation
- Evaluate: Test complex analytical questions requiring multi-step data processing
Outcome: Claude can perform sophisticated data analysis through conversational interfaces
Use Case 3: DevOps Automation
Scenario: Creating MCP server for infrastructure management
mcp-builder Workflow:
- Research: Study cloud provider APIs and common DevOps workflows
- Design: Safety-focused tools with confirmation prompts for destructive operations
- Implement: TypeScript server with strict validation and comprehensive logging
- Evaluate: Create tests for deployment, scaling, and monitoring scenarios
Outcome: Safe, auditable infrastructure management through natural language
Troubleshooting
MCP Server Hanging on Startup
Symptom: Server starts but process hangs indefinitely
Cause: MCP servers are long-running processes waiting for requests
Solution: Run in tmux or use timeout for testing:
timeout 5s python server.py
Evaluation Harness Connection Failures
Symptom: evaluation.py cannot connect to MCP server
Cause: Transport protocol mismatch or authentication issues
Solution:
- Verify correct transport type (stdio/SSE/HTTP)
- Check authentication credentials for remote servers
- Ensure server is running before starting evaluation
Tool Call Failures in Evaluation
Symptom: Tools execute but return errors
Cause: Input validation failures or API changes
Solution:
- Review tool input schemas for proper validation
- Check that API endpoints are accessible and unchanged
- Verify error handling provides actionable feedback
Poor Evaluation Scores
Symptom: Low accuracy on evaluation questions
Cause: Tools not designed for agent workflows or poor documentation
Solution:
- Review agent-centric design principles
- Improve tool descriptions with better examples
- Consolidate related operations into workflow tools
- Add comprehensive error handling and guidance
Context Overflow Issues
Symptom: Tools return too much data, exceeding context limits
Cause: No pagination or truncation strategy
Solution:
- Implement pagination parameters
- Add character limit checks
- Provide concise/detailed response options
- Use a 25,000-token ceiling as guidance
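The pagination and limit checks above could be sketched like this. The names paginate, truncate, and CHARACTER_LIMIT are assumptions, and the constant reuses the 25,000 figure as a character count purely for illustration, since counting tokens would require a tokenizer.

```python
CHARACTER_LIMIT = 25_000  # assumed budget, measured in characters for simplicity


def paginate(items, offset=0, limit=20):
    """Return one page of items plus the metadata needed to fetch the next."""
    page = items[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(items)
    return {
        "items": page,
        "total": len(items),
        "offset": offset,
        "has_more": has_more,
        "next_offset": next_offset if has_more else None,
    }


def truncate(text, limit=CHARACTER_LIMIT):
    """Cut text to the limit and tell the agent how to get the rest."""
    if len(text) <= limit:
        return text
    notice = "\n[Truncated. Use a smaller limit or add filters to see the rest.]"
    return text[: max(0, limit - len(notice))] + notice
```

Returning has_more and next_offset lets the agent decide whether another call is worth the context it would cost, and the truncation notice tells it what to change instead of silently dropping data.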
Next Steps
To use mcp-builder effectively:
- Clone the repository: git clone https://github.com/anthropics/skills
- Study the skill: Read SKILL.md thoroughly to understand the 4-phase process
- Examine the scripts: Review connections.py and evaluation.py for implementation patterns
- Start building: Choose a target API and begin Phase 1 research
- Follow the workflow: Complete all 4 phases systematically
- Evaluate thoroughly: Create and run comprehensive evaluations
- Iterate based on feedback: Use evaluation results to improve tool design
Related Resources
- Anthropic Skills Repository: github.com/anthropics/skills
- Model Context Protocol: modelcontextprotocol.io
- Python MCP SDK: github.com/modelcontextprotocol/python-sdk
- TypeScript MCP SDK: github.com/modelcontextprotocol/typescript-sdk
- MCP Builder Skill: /development/analyzing-mcp-builder
Conclusion
mcp-builder demonstrates exceptional Claude skill design through:
- ✅ Comprehensive Workflow: 4-phase process covering research, implementation, review, and evaluation
- ✅ Agent-Centric Design: Principles focused on LLM context limitations and workflow optimization
- ✅ Practical Tooling: Production-ready Python scripts for connection handling and evaluation
- ✅ Language Support: Detailed guidance for both Python and TypeScript implementations
- ✅ Quality Assurance: Systematic evaluation framework with rigorous testing requirements
- ✅ Progressive Disclosure: Structured SKILL.md with clear phases and actionable steps
The key insights from this skill can transform how developers approach MCP server development, leading to tools that truly enable LLMs to accomplish complex tasks through natural interaction with external services.
Summary
This comprehensive analysis covered:
- ✅ Skill structure and anatomy (4-phase workflow)
- ✅ Agent-centric design principles and best practices
- ✅ Python implementation patterns with Pydantic validation
- ✅ TypeScript implementation patterns with Zod schemas
- ✅ Sophisticated connection handling across transport protocols
- ✅ Comprehensive evaluation framework and harness
- ✅ Real-world applications and use cases
- ✅ Troubleshooting guide for common issues
- ✅ Integration strategies with related skills
Next Steps
Ready to build your first MCP server?
- Study the complete SKILL.md: Understand all 4 phases in detail
- Choose a target API: Start with a well-documented external service
- Follow Phase 1: Conduct thorough research and create an implementation plan
- Implement systematically: Use language-specific best practices
- Evaluate thoroughly: Create 10+ complex scenarios and test with evaluation.py
- Iterate and improve: Refine based on evaluation feedback
- Share with the community: Contribute your MCP server to the ecosystem
ℹ️ Source Information
Original Skill: mcp-builder
- Source: Anthropic Skills Repository
- Author: Anthropic
- Accessed: 2025-11-17
- License: See LICENSE.txt for full terms
This article was generated based on comprehensive analysis of the mcp-builder skill structure, scripts, and documentation patterns.
Appendix
Directory Structure Details
The mcp-builder skill includes:
Required Files:
- SKILL.md: 13KB, 329 lines of comprehensive MCP server development guidance
Scripts (2 files, 526 lines total):
- connections.py: 152 lines - Multi-transport MCP connection handling
- evaluation.py: 374 lines - Complete evaluation harness with metrics
References (not bundled, loaded on-demand):
- MCP Best Practices
- Python Implementation Guide
- TypeScript Implementation Guide
- Evaluation Guide
Complete Script Inventory
connections.py features:
- Abstract base class MCPConnection with async context management
- Three transport implementations (stdio, SSE, HTTP)
- Factory pattern for transport-agnostic instantiation
- Automatic session initialization and cleanup
- Comprehensive error handling
evaluation.py features:
- XML evaluation file parsing
- Anthropic Claude API integration
- Asynchronous agent loop with tool execution
- Comprehensive metrics collection (accuracy, duration, tool calls)
- Multi-format evaluation reports
- Support for all transport protocols
MCP Protocol Fundamentals
MCP (Model Context Protocol) enables LLMs to interact with external services through:
- Tools: Functions that LLMs can call with structured input
- Resources: Data that can be loaded into context
- Prompts: Reusable prompt templates
- Transport: Communication mechanism (stdio, SSE, HTTP)
Quality Metrics from Evaluation
The evaluation harness tracks:
- Accuracy: Correct answers / total questions
- Performance: Average task duration and tool call efficiency
- Tool Usage: Number and duration of tool calls per task
- Feedback Quality: Agent-reported issues and improvement suggestions
Typical benchmarks:
- Good: >80% accuracy, <30s average duration
- Excellent: >90% accuracy, <20s average duration
- Outstanding: >95% accuracy, <15s average duration
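The accuracy and average-duration figures above can be computed from per-question results with a few lines of Python. The result record shape (correct, duration, tool_calls) and the summarize function are assumptions for illustration, not the harness's actual report format.

```python
from statistics import mean


def summarize(results):
    """Aggregate per-question results into headline metrics."""
    if not results:
        return {"accuracy": 0.0, "avg_duration": 0.0, "avg_tool_calls": 0.0}
    return {
        # Correct answers divided by total questions.
        "accuracy": sum(1 for r in results if r["correct"]) / len(results),
        # Mean wall-clock seconds per question.
        "avg_duration": mean(r["duration"] for r in results),
        # Mean number of tool calls per question.
        "avg_tool_calls": mean(r["tool_calls"] for r in results),
    }
```

Tracking tool calls alongside accuracy helps distinguish a server that is accurate but needs many round trips from one whose tools let the agent finish in a few calls.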