MCP vs CLI: The Token Tax

For local agentic workflows (one developer, one machine, iterating fast), CLI tools composed with unix pipes are leaner on tokens, faster to iterate on, and more expressive than MCP. The model already knows the unix toolchain. You don’t need to implement, schema-describe, and pay tokens for capabilities that jq and grep give you for free. MCP’s strengths are organizational. If you need centralized auth, instant distribution, black-boxed implementations, credential isolation, long-running processes, or statefulness, use MCP. If you’re building local automation and talking to your own machine, you’re paying a token tax for infrastructure you don’t need.

Every time someone tells me I should wrap my tools in MCP servers, I run the numbers, and I usually arrive at the same conclusion: for local work, CLI wins. Not always, but more often than people think.

The Case Against MCP (For Local Work)

You can’t chain. MCP tools are isolated function calls. You call one, get a result, then call another. There’s no tool_a | tool_b | tool_c. Every intermediate result passes through the LLM’s context window, costing tokens and adding latency. Unix pipes stream data between processes without the LLM ever seeing the intermediate state.
Tool registration bloat. Every MCP tool publishes a full JSON schema (name, description, every parameter with type and description), and that schema lives in the system prompt for the entire session. Ten tools with rich parameter descriptions can easily be 3k+ tokens of overhead you pay on every single message. MCP search proxies help by lazily loading tool definitions, but they add their own latency and complexity.
Iteration is slow. Change an MCP tool? Restart the server, reconnect, hope the client picks up the new schema. Change a CLI script? Save the file. That’s it. The next invocation picks up the change because there’s no persistent process caching the old version. Pretty much native hot reload.
Server framework overhead. MCP requires understanding and running a server framework. CLI tools are stdin, stdout, and files. Pretty much every language has easy-to-use argument parsing in its standard library. The barrier to writing a new tool is a few lines of code, not a server scaffold.
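
To make that last point concrete, here is a minimal sketch of what a tool like list-followers could look like in Python. The hard-coded FOLLOWERS data and the main function are stand-ins invented for illustration; a real tool would fetch from whatever API you’re wrapping.

```python
#!/usr/bin/env python3
"""Hypothetical list-followers tool: takes a username, prints followers as JSON."""
import argparse
import json
import sys

# Stand-in data (assumption). A real tool would call an API here.
FOLLOWERS = {
    "hank": [
        {"username": "alexchen", "display_name": "Alex Chen", "country": "GB",
         "age": 41, "followers": 12767, "joined": "2021-03-15"},
        {"username": "haydenwilliams", "display_name": "Hayden Williams", "country": "DE",
         "age": 48, "followers": 1879, "joined": "2020-12-28"},
    ],
}


def main(argv):
    """Parse arguments and write the followers of the given user to stdout."""
    parser = argparse.ArgumentParser(prog="list-followers")
    parser.add_argument("user", help="account whose followers to list")
    args = parser.parse_args(argv)
    json.dump(FOLLOWERS.get(args.user, []), sys.stdout, indent=2)
    sys.stdout.write("\n")
    return 0


if __name__ == "__main__":
    # Demo call with explicit arguments; a real script would pass sys.argv[1:].
    main(["hank"])
```

No server, no schema, no registration. Anything downstream (jq, grep, another script) only needs to read stdout.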

The Token Tax

The scenario is simple: we call a tool that returns your followers list. Each follower has a username, display name, country, age, follower count, and join date. There are 20 users in total (we are rockin that 2016 vintage “big data” here). All examples were token-counted with the Anthropic API.

Step 1: Format

Both tools return the same data. The difference is that MCP returns JSON because that’s the protocol. A CLI tool can return whatever is cheapest for the LLM to read.

Here’s the JSON that MCP returns (truncated to 2 of 20):

[
  {
    "username": "alexchen",
    "display_name": "Alex Chen",
    "country": "GB",
    "age": 41,
    "followers": 12767,
    "joined": "2021-03-15"
  },
  {
    "username": "jordanmueller",
    "display_name": "Jordan Mueller",
    "country": "US",
    "age": 29,
    "followers": 18626,
    "joined": "2023-02-23"
  },
  ...
]

Here’s the same data from a CLI tool:

username          display_name       country  age  followers  joined
alexchen          Alex Chen          GB       41   12767      2021-03-15
jordanmueller     Jordan Mueller     US       29   18626      2023-02-23
samsilva          Sam Silva          BR       51   22246      2019-08-11
taylortanaka      Taylor Tanaka      FR       43   1447       2022-09-01
morgankim         Morgan Kim         FR       45   6726       2023-09-03
caseypatel        Casey Patel        BR       47   17045      2021-10-15
rileywright       Riley Wright       GB       21   11765      2020-04-23
jamiegarcia       Jamie Garcia       JP       28   24544      2025-10-06
averyschmidt      Avery Schmidt      GB       50   21686      2020-01-04
quinntakahashi    Quinn Takahashi    KR       19   5659       2025-11-27
drewsantos        Drew Santos        US       44   2709       2024-11-03
blakemartin       Blake Martin       US       27   10201      2018-07-11
sagelee           Sage Lee           IN       26   17200      2024-04-26
charliejohnson    Charlie Johnson    US       21   21035      2021-08-09
reesepark         Reese Park         US       52   13290      2020-06-23
haydenwilliams    Hayden Williams    DE       48   1879       2020-12-28
finleynakamura    Finley Nakamura    DE       27   19974      2024-02-04
rowanoliveira     Rowan Oliveira     DE       25   4757       2022-03-06
emerybrown        Emery Brown        CA       44   4818       2021-09-11
dakotasingh       Dakota Singh       JP       49   7714       2022-02-28

Format	Tokens
JSON	1,357
Compact text	487

Same signal, but with 64% fewer tokens just from dropping the repeated key names, quotes, braces, and indentation. While the LLM has no free will and chews through whatever we give it, and Anthropic is certainly more than happy to see you burn extra tokens, as the third wheel and a cheap date I want to keep it tight.
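
The 64% figure falls straight out of the two counts in the table. As a rough sketch, here is one way a tool could turn JSON records into that kind of compact table, plus the arithmetic. to_table is a hypothetical helper, not the formatter that produced the output above, so its column spacing will differ.

```python
import json


def to_table(records):
    """Render a list of flat dicts as whitespace-aligned columns with one header row."""
    if not records:
        return ""
    columns = list(records[0])
    rows = [columns] + [[str(r[c]) for c in columns] for r in records]
    widths = [max(len(row[i]) for row in rows) for i in range(len(columns))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


# First two records from the JSON example above.
records = json.loads("""
[
  {"username": "alexchen", "display_name": "Alex Chen", "country": "GB",
   "age": 41, "followers": 12767, "joined": "2021-03-15"},
  {"username": "jordanmueller", "display_name": "Jordan Mueller", "country": "US",
   "age": 29, "followers": 18626, "joined": "2023-02-23"}
]
""")
table = to_table(records)
print(table)

# Token counts as reported in the table above.
json_tokens, text_tokens = 1357, 487
savings = 1 - text_tokens / json_tokens
print(f"{savings:.0%} fewer tokens")
```

The key names appear once in the header instead of once per record, which is where most of the saving comes from.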

Step 2: Filtering

“But that’s not fair!” you cry out, “if you needed it to be machine parsable too you would have to use something like JSON”.

First of all, no: TSV, CSV, and TOON are all machine parsable just the same.

Second, we can level the field. Say you ask the model to “show me my followers from Germany.” Both tools output JSON this time, so it’s apples to apples on the format. The difference is that the CLI can pipe through jq before the result enters context.

# CLI
list-followers hank | jq '[.[] | select(.country=="DE")]'

The LLM sees:

[
  {
    "username": "haydenwilliams",
    "display_name": "Hayden Williams",
    "country": "DE",
    "age": 48,
    "followers": 1879,
    "joined": "2020-12-28"
  },
  {
    "username": "finleynakamura",
    "display_name": "Finley Nakamura",
    "country": "DE",
    "age": 27,
    "followers": 19974,
    "joined": "2024-02-04"
  },
  {
    "username": "rowanoliveira",
    "display_name": "Rowan Oliveira",
    "country": "DE",
    "age": 25,
    "followers": 4757,
    "joined": "2022-03-06"
  }
]

Approach	Tokens
MCP (all 20 users returned)	1,357
CLI + jq (3 users returned)	223

With MCP, the full payload is dumped into the context window and the LLM has to scan all 20 records to find the 3 it cares about. This burns credits and can absolutely blow out your context window (pretend it’s a real query returning 20k tokens). With CLI, jq does the filtering in the shell and only the 3 relevant records enter context.

The tool implementation logic is identical in both cases: both return all followers as JSON. The CLI version gets filtering for free from jq.

Filtering by country == “DE” is not a semantic operation. It’s string matching. Axiom 1: LLMs should only do what only LLMs can do, and scanning 20 JSON objects for a field value is not one of those things.
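
To underline how un-semantic this is, here is the same filter as a few lines of Python, along with the saving implied by the two token counts. select_country is a hypothetical stand-in for the jq expression, and the sample records are trimmed to two fields to keep the sketch short.

```python
import json


def select_country(payload: str, country: str) -> str:
    """Keep only records whose country field equals `country` (plain string match)."""
    records = json.loads(payload)
    return json.dumps([r for r in records if r["country"] == country], indent=2)


# Trimmed sample: one non-matching record plus the three DE followers.
payload = json.dumps([
    {"username": "alexchen", "country": "GB"},
    {"username": "haydenwilliams", "country": "DE"},
    {"username": "finleynakamura", "country": "DE"},
    {"username": "rowanoliveira", "country": "DE"},
])
filtered = select_country(payload, "DE")
print(filtered)

# Token counts as reported in the table above.
mcp_tokens, cli_tokens = 1357, 223
reduction = 1 - cli_tokens / mcp_tokens
print(f"{reduction:.0%} fewer tokens enter context")
```

An equality check in a list comprehension. Nothing here needs a language model.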

Step 3: The Schema Tax

“Ok fine, I’ll just add filtering to my MCP tool.”

You can. But now you’re paying for it twice: once to implement it, and again in schema tokens on every turn.

Now say you want “how many followers per country?” With CLI you can answer this a few different ways:

# jq aggregation
list-followers hank | jq 'group_by(.country) | map({country: .[0].country, count: length}) | sort_by(-.count)'

# or just classic unix pipes
list-followers hank | jq -r '.[].country' | sort | uniq -c | sort -rn

# or if you just want one country
list-followers hank | jq -r '.[].country' | grep -c DE

That last one returns 3. A single character. The LLM asked a counting question and got a number back instead of 1,357 tokens of JSON to reason over.

The model already knows jq, grep, sort, uniq, wc. It can compose them however it wants across every CLI tool you write, with zero implementation work from you.
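
If you want to sanity-check what those pipelines return, here is the same aggregation spelled out in Python over the country column from the table earlier. It is only a cross-check under that sample data; in practice the shell already does this for you.

```python
from collections import Counter

# Country column of the 20 followers in the table above, in row order.
countries = ("GB US BR FR FR BR GB JP GB KR "
             "US US IN US US DE DE DE CA JP").split()

counts = Counter(countries)

# Rough equivalent of: sort | uniq -c | sort -rn
# (the order among countries with equal counts may differ from the shell's)
for country, n in counts.most_common():
    print(f"{n:>4} {country}")

# Rough equivalent of: grep -c DE
print(counts["DE"])
```

The last line prints 3, matching the grep -c result.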

For MCP to match this, the tool author has to implement filtering logic in code and add parameters to the tool schema. Let’s be generous: instead of a dedicated param per column, use a single filter param that accepts expressions like country=DE, plus limit, fields, and group_by. Even then, the schema goes from 98 tokens (basic) to 299 tokens. That’s 201 extra tokens in the schema, paid on every single turn whether the tool is used or not. Plus the tool author has to write and maintain a filter expression parser.
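
For a feel of what that growth looks like, here is a rough sketch of the two tool definitions as Python dicts. The name / description / inputSchema layout follows the general shape of MCP tool definitions, but the tool name, the user parameter, and every description string are invented for illustration. These are not the schemas behind the 98 and 299 token counts, so the sketch compares character lengths only.

```python
import json

USER_PARAM = {"type": "string", "description": "Account to list followers for."}

# Hypothetical basic definition: one required parameter.
basic_tool = {
    "name": "list_followers",
    "description": "List the followers of a user.",
    "inputSchema": {
        "type": "object",
        "properties": {"user": USER_PARAM},
        "required": ["user"],
    },
}

# Hypothetical expanded definition: filter, limit, fields, and group_by added.
expanded_tool = {
    "name": "list_followers",
    "description": "List the followers of a user, with optional filtering and grouping.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "user": USER_PARAM,
            "filter": {"type": "string",
                       "description": "Filter expression such as country=DE."},
            "limit": {"type": "integer",
                      "description": "Maximum number of records to return."},
            "fields": {"type": "array", "items": {"type": "string"},
                       "description": "Columns to include in the output."},
            "group_by": {"type": "string",
                         "description": "Column to group by and count."},
        },
        "required": ["user"],
    },
}

for label, tool in (("basic", basic_tool), ("expanded", expanded_tool)):
    params = list(tool["inputSchema"]["properties"])
    print(f"{label}: {len(params)} params, {len(json.dumps(tool))} characters")
```

Every one of those description strings rides along in context on every turn, used or not.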

And that’s one tool. With ten tools following the same pattern, MCP is paying ~2,000 extra tokens per turn in schema overhead alone, plus all the implementation and maintenance burden. CLI still just has jq.
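
The arithmetic is simple enough to write down. This sketch uses the two schema token counts reported above; the 50-turn session length is an assumption picked for illustration, and it ignores any provider-side caching that might change what you are actually billed.

```python
BASIC_SCHEMA_TOKENS = 98      # basic schema, as reported above
EXPANDED_SCHEMA_TOKENS = 299  # schema with filter, limit, fields, group_by


def extra_schema_tokens(tools: int, turns: int) -> int:
    """Extra tokens spent on the expanded schemas across a whole session."""
    per_tool = EXPANDED_SCHEMA_TOKENS - BASIC_SCHEMA_TOKENS
    return per_tool * tools * turns


print(extra_schema_tokens(tools=1, turns=1))    # 201
print(extra_schema_tokens(tools=10, turns=1))   # 2010
print(extra_schema_tokens(tools=10, turns=50))  # 100500, assuming a 50-turn session
```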

When MCP Wins

MCP does have real advantages that local CLI tools can’t replicate.

Centralized management and RBAC. In an organization, MCP servers can enforce authentication, authorization, and audit logging centrally. Who can call which tools, with what permissions, managed in one place. CLI tools can auth however they want, but each tool manages that independently.
Central distribution. MCP tools live on a server. Update the tool once and every connected client gets the new version and schema on their next connection. That’s just how the protocol works. CLI tools have to be distributed and versioned out to every machine that uses them.
Implementation as a black box. The MCP client only sees the tool’s schema and its outputs, never the implementation. For organizations that care about trade secrets or security-sensitive logic, that matters. A CLI tool is code you’re distributing to people who can read it directly.
Credential isolation. MCP servers hold credentials server-side, and because the implementation is a black box, the user can only do what the server allows. With CLI tools, you’re distributing credentials to the end user’s machine. Those credentials can be used by other tools, the CLI tool itself can be modified, and you can’t guarantee the credentials won’t be used in ways you didn’t anticipate. You end up having to manage permissions much more carefully at the RBAC level because the credential boundary is the user’s entire environment, not just your tool.
Long-running processes. If a tool kicks off a build or a data pipeline that takes minutes, an MCP server can handle that on persistent infrastructure. A local CLI tool ties up the user’s shell and machine resources for the duration.
Effortless statefulness. CLI tools can be stateful, but they have to persist state between invocations since every run is a fresh process. MCP servers hold state in memory for free: connection pools, caches, session state, WebSocket connections. It’s not that CLI can’t do it. It’s just a different level of effort and thought.
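
To show what that extra effort looks like on the CLI side, here is a minimal sketch of a file-backed cache that survives between invocations. The cached helper, the cache file name, and the fetch_followers stub are all hypothetical; a long-lived server process could keep the same value in a plain in-memory dict instead.

```python
import json
import tempfile
import time
from pathlib import Path


def cached(path: Path, ttl_seconds: float, fetch):
    """Return the value stored at `path` if it is fresh, else call fetch() and store it."""
    try:
        entry = json.loads(path.read_text())
        if time.time() - entry["saved_at"] < ttl_seconds:
            return entry["value"]
    except (FileNotFoundError, json.JSONDecodeError, KeyError):
        pass  # missing or corrupt cache: fall through and refetch
    value = fetch()
    path.write_text(json.dumps({"saved_at": time.time(), "value": value}))
    return value


calls = []


def fetch_followers():
    """Stand-in for an expensive API call; records each invocation."""
    calls.append(1)
    return ["alexchen", "jordanmueller"]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "followers.json"
    first = cached(cache_file, 60, fetch_followers)   # cache miss: calls fetch
    second = cached(cache_file, 60, fetch_followers)  # cache hit: reads the file

print(first == second, len(calls))  # prints: True 1
```

It works, but you now own a file format, an expiry policy, and the corrupt-file case. That is the effort gap.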