
MCP Is a Crutch: Why CLI Tools Are the Future of AI Agent Tooling

10 min read
Tags: mcpx, mcp, cli, token-efficiency, claude-code, ai-agents

I connected 3 MCP servers to my Claude Code session. Before I typed a single prompt, 30,000 tokens were gone. Just schemas. Just tool definitions. Serialized JSON descriptions of every function, every parameter, every type — dumped into the context window at startup.

With 10 MCP servers? Over 100,000 tokens consumed before the conversation even starts.

That's not a tooling strategy. That's a tax.

The Schema Bloat Crisis

MCP (Model Context Protocol) was supposed to solve tool integration for AI agents. And it does — at a cost nobody talks about. Every MCP server pre-exports its complete tool catalog into the model's context window. The model needs to "see" all tool definitions upfront to know what's available.

Context Window Cost: MCP Servers at Startup

| Startup cost | Native MCP (3 servers) | MCPX (same 3 servers) |
| --- | --- | --- |
| Server 1 schema injection | 12k | 0 |
| Server 2 schema injection | 10k | 0 |
| Server 3 schema injection | 8k | 0 |
| Per-turn overhead (30 tools) | 3.6k | 0 |
| Total tokens | 33.6k | 0 |

With native MCP, 33.6k tokens are spent (at Sonnet pricing) before any work begins. With MCPX, zero upfront. Token savings: 100%.

Zero tokens upfront. That's not a typo. MCPX loads nothing into the context window at startup. Tools are discovered on demand, called through Bash, and return only what was asked for.

The OpenClaw Take: CLI Is the Natural Interface

Peter Steinberger, CEO of OpenClaw (formerly ClawBot), came back from three years away from coding and built an agent tool grounded in a simple thesis:

> MCP is a crutch. CLI excels where engineers SSH into servers, diagnose failures — it is the natural interface for agentic systems.
>
> — Peter Steinberger, CEO, OpenClaw

His argument isn't abstract. It's structural:

MCP lacks composability. You can't pipe MCP tool outputs. You can't chain them. A task like "find all cities over 25 degrees, then filter by population" requires multiple individual MCP calls, each serialized back into the context. CLI does this in one line with pipes.
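That kind of one-liner can be sketched with stand-in data, since no real tool output appears in this post (the city names and numbers below are made up):

```shell
# Stand-in data: city,temp_celsius,population (hypothetical values)
# Filter cities above 25 degrees, then take the most populous one.
printf 'Berlin,27,3600000\nOslo,18,700000\nMadrid,31,3300000\n' \
  | awk -F, '$2 > 25' \
  | sort -t, -k3,3nr \
  | head -n 1
# → Berlin,27,3600000
```

Every stage filters in the shell before anything reaches the model; with MCP, each intermediate result would round-trip through the context window instead.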

MCP returns everything. A weather API call through MCP dumps temperatures, wind speeds, humidity, UV index, and 50 more fields. You needed the rain status. The model processes all of it. Tokens wasted, attention diluted.

CLI is what developers already use. Models are trained on shell interactions. They understand Bash. They know how to compose commands, filter output, and work within terminal constraints. MCP is a foreign protocol bolted on top.

What Is MCPX

MCPX is a CLI proxy I built that transforms MCP servers into standard command-line tools. Instead of injecting schemas into the context window, MCPX exposes each MCP tool as a Bash command that AI agents can discover and call on demand.


The key insight: MCP tools belong in the terminal. The agent doesn't need a JSON schema to know what mcpx serena search_for_pattern --substring_pattern "UserAuth" does. It reads the command, understands the intent, and processes the output — just like any other CLI tool.

How Discovery Works

Instead of pre-loading tool definitions, agents discover tools lazily:

On-Demand Tool Discovery
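As a hedged sketch of that flow: mcpx's real help output isn't reproduced in this post, so a tiny fake `mcpx` shell function stands in for the binary below. Only the interaction pattern is the point.

```shell
# Fake `mcpx` so the discovery pattern is runnable; real output will differ.
mcpx() {
  case "$1 $2" in
    "serena --help") echo "tools: search_for_pattern, find_symbol" ;;
    *)               echo "servers: serena" ;;
  esac
}

mcpx --help          # step 1: list registered servers
mcpx serena --help   # step 2: list that server's tools
```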

Each --help call costs a handful of tokens. The agent only discovers what it needs, when it needs it. Compare this to MCP, which front-loads every tool definition for every server on every single turn.

The Numbers Don't Lie

Research from multiple independent benchmarks paints a consistent picture:

| Metric | Native MCP | CLI approach | Improvement |
| --- | --- | --- | --- |
| Upfront token cost (30 tools) | ~30,000 | 0 | 100% |
| Per-turn overhead | ~3,600 | 0 | 100% |
| Token reduction over 15 turns | baseline | 96-99% fewer | 96-99% |
| Task completion score | baseline | +28% higher | 28% |
| GitHub MCP (93 tools) startup | ~55,000 | 0 | 100% |
Tip: The CLI approach doesn't just save tokens — it achieves a 28% higher task completion score with roughly the same total token count. Tokens are spent on actual problem-solving instead of schema processing.

The enterprise reality is even more dramatic. Twenty MCP servers exposing twenty tools each means four hundred tool definitions serialized into the context window. Every. Single. Turn.

MCPX Architecture

MCPX is built in Go with zero runtime dependencies. A single binary that supports:

MCPX Design Principles

1. Zero Context Overhead. No schemas injected at startup. Tools discovered on demand via --help flags. The model context stays clean for actual work.

2. Daemon Mode. Heavy MCP servers stay resident between calls via Unix sockets. Sub-millisecond startup time for subsequent invocations. No cold-start penalty.

3. Unix Composability. Standard stdin/stdout pipes. Chain commands naturally. Use printf for complex JSON args. Works with every shell tool in existence.

4. Secure by Default. Secrets resolve at runtime from OS keychains — never on disk or in logs. Strict variable parsing prevents shell injection. Direct process execution.
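The printf tip in the composability principle looks like this in practice. Since the exact mcpx flag that would consume the JSON isn't documented here, only the argument-building step is shown:

```shell
# Build a JSON argument safely with printf: no shell-quoting surprises,
# and the values are interpolated in one place.
json=$(printf '{"substring_pattern":"%s","max_results":%d}' "UserAuth" 5)
echo "$json"
# → {"substring_pattern":"UserAuth","max_results":5}
```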

Configuration

MCPX uses a two-level YAML configuration — global at ~/.mcpx/config.yml and project-level at .mcpx/config.yml:

```yaml
servers:
  serena:
    command: serena
    args: ['--workspace', '${GIT_ROOT}']
    daemon: true
    env:
      API_KEY: '${keychain:serena-api-key}'
```

Dynamic variables resolve git metadata, environment variables, and OS keychain secrets automatically. The agent never sees credentials — they're resolved at the process level.
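A rough sketch of how such variables could resolve at the process level. This is illustrative, not mcpx's actual implementation, and the keychain line assumes macOS:

```shell
# ${GIT_ROOT}: repository root, falling back to the current directory
GIT_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || pwd)

# ${keychain:serena-api-key}: on macOS this could use the `security` CLI;
# the secret lands in the child process's environment, never in the context.
# API_KEY=$(security find-generic-password -s serena-api-key -w)

echo "$GIT_ROOT"
```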

Tool Composition

This is where CLI fundamentally wins over MCP. With MCPX, tools compose through standard Unix patterns:

Unix Composability in Action
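To fill in the idea with a concrete shape: suppose a weather tool returns a JSON blob (the fields below are made up stand-ins, not a real mcpx response). The agent pipes the output and keeps only the field it asked about:

```shell
# Stand-in for the output of something like a weather-forecast tool call
response='{"temp":27,"rain":false,"humidity":40,"uv_index":6}'

# Only the rain status survives the pipe; the rest never reaches the model
echo "$response" | grep -o '"rain":[a-z]*'
# → "rain":false
```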

Try doing this with MCP. Each call goes through the model, back to the server, serialized into JSON, deserialized, and injected into context. MCPX keeps data in the shell where it belongs — the model only sees what it explicitly requests.

The Scalability Crisis MCP Can't Solve

Here's the uncomfortable truth about MCP at scale:

MCP at Enterprise Scale

Native MCP: 20 servers × 20 tools each = 400 tool definitions serialized into context every turn. The context window is consumed by tool definitions before the user's question even begins. Models choke on schema processing instead of doing actual work.

MCPX: 20 servers registered = 0 tokens upfront. The agent discovers tools on demand. Each server call costs only the tokens for the command and its response. Context stays lean, the model stays focused.

Anthropic themselves recognized this problem. Their Tool Search feature loads a search index instead of every schema, dropping token usage by ~85%. But that's a band-aid on a fundamental architectural issue: MCP's design assumes pre-loading is the only way to give models tool awareness.

CLI doesn't make that assumption. The model already knows how to run commands and read --help output. There's nothing to pre-load.

The Broader Shift

The industry is moving. Enterprise agents are shifting from prompt-based tool invocation to code execution-driven control flow. MCP is being treated as a collection of local SDKs rather than remote tools described in text. Data passes by reference through variables, with only small summaries surfacing back to the model.

MCPX sits at the leading edge of this shift. It's a bridge between AI agents and CLI-based tool access — keeping the power of MCP servers while eliminating the token overhead that makes them impractical at scale.

Info: MCPX is MIT licensed and available now. It works with any MCP server and any AI agent that can run Bash commands. The migration from native MCP to MCPX takes minutes — same servers, same tools, zero token overhead.

Getting Started

Install MCPX

The Bottom Line

MCP solved a real problem: giving AI agents access to external tools. But its architecture — pre-load everything, serialize every schema, consume tokens on every turn — doesn't scale. Not for enterprise use cases with dozens of servers. Not for long coding sessions where context is precious. Not for agents that need to be fast and focused.

CLI is the answer. It's what developers already use. It's what models already understand. It's composable, efficient, and scales to any number of tools without touching the context window.

MCPX makes the transition seamless. Same MCP servers, same tool capabilities, zero token overhead. The protocol becomes invisible — your tools become CLI commands, and your agent's context stays clean for the work that actually matters.

Stop paying the schema tax. Your context window will thank you.

Resources