What is it?

Hiii!

cost-xray is a local tool for inspecting AI coding-agent requests.

It currently supports:

Claude Code
Codex

After installing it, you keep using your agent normally. No new workflow. No special prompt dance. No "please debug yourself in JSON" ritual.

You run your agent, do some coding, then open:

cx

And suddenly the black box starts looking a lot less black.

The home view groups captured sessions by agent, project, and session. You can quickly see turns, tokens, cache hit rate, and estimated cost.

The Problem It Solves

Most usage tools can tell you:

This session cost $X.

That is useful, but it is also a little like checking your restaurant bill and only seeing:

Food: $84

Cool. But what food?

cost-xray goes one level deeper. It asks:

What filled the context window?
Which parts were fresh input?
Which parts were cache reads?
Which parts were cache writes?
Which tool outputs came back into the next request?
Which MCP schemas were sitting there doing cardio in the prompt prefix?

Why This Is Different From Log-Based Usage Tools

Many usage tools read local transcripts or logs. That is useful, but it sees the conversation after the agent has already run.

The expensive and interesting part is often assembled right before the model call:

system prompts
injected tool schemas
MCP server schemas
prior messages
previous tool results
cached prefix reads and writes
thinking or reasoning blocks

Those pieces may never appear clearly in a normal chat transcript.

cost-xray looks at the actual request traffic instead. That means it can show not only the total bill, but the sources inside the request that contributed to it.

Quick Install

The repo provides a one-line installer:

curl -fsSL https://raw.githubusercontent.com/tigerless-labs/cost-xray/master/install.sh | bash

During install, it asks which agent you want to capture:

Claude Code
Codex
both

After that, open a new terminal and use your agent normally:

codex

or:

claude

Then open the dashboard:

cx

You can also check or control the background capture service:

cx status
cx stop
cx start
cx restart

Demo Flow

For a small demo, I would run a normal coding-agent task:

codex

Then ask it something realistic, like:

Inspect this repo and add a small CLI flag with tests.

After a few turns, open:

cx

From there, drill down:

agent -> project -> session -> category -> MCP server -> tool -> per-turn call

The useful part is that the dashboard does not stop at "this session used N tokens." It can show whether the spend came from fresh input, cache reads, cache writes, or output.

The detail view shows context-window occupancy and cumulative per-source cost. This is where the tool starts to feel less like a billing dashboard and more like an x-ray of the agent runtime.

Things You Can Discover

Here are a few patterns cost-xray is designed to expose.

1. Tool Schemas That Are Always Present

Coding agents often inject tool definitions into every request.

That is useful when the tool is needed. But if a large schema is always present and almost never called, it still takes up context-window space.

cost-xray can help identify that unused overhead.

2. MCP Servers That Add Prefix Weight

MCP is powerful, but every configured server can bring tool schema into the prompt prefix.

If a server contributes thousands of tokens and is never actually used, cost-xray makes that visible.

3. Huge Tool Results

Sometimes the expensive part is not the prompt. It is a giant tool output that got pulled back into the next request.

For example:

Read -> large file output -> repeated in later context

When that happens, cost-xray lets you trace cost down to the tool call and its output, not just the overall session.

4. Cache Behavior

Prompt caching can make stable prefixes cheaper, but cached tokens still occupy context-window space.

cost-xray separates fresh input, cache read, cache write, and output cost. That makes it easier to understand when caching is helping the bill but not helping the window.

How It Works

At a high level, cost-xray has three stages:

coding agent -> local capture proxy -> raw store -> materializer -> TUI

The capture layer uses mitmproxy as a local hop. It records redacted request and response data under ~/.cost-xray/.

Then a separate materializer process tokenizes, prices, and classifies the captured traffic. This matters because the proxy itself stays lightweight. The analysis happens off the request path, so it does not need to slow down the agent call.

The repo describes the architecture as:

capture raw, derive at read time

That means old sessions can be re-analyzed as the attribution logic improves.

Supported Agents

At the time of this demo, the repo lists support for:

Agent	Capture approach
Claude Code	reverse proxy mode
Codex	forward proxy plus scoped local CA

The interesting design choice is that agent-specific behavior lives behind adapters in cost_xray/adapters/.

So the shared logic can focus on:

canonical events
token classification
pricing
cache boundaries
drill-down views

Adding another agent becomes mostly a matter of writing a new adapter for that agent's wire format.

What I Like About This Project

The best part is that cost-xray treats AI coding agents like systems worth observing.

That feels right.

As these tools get more capable, the invisible runtime around them gets more complex. A single turn can include policies, tool schemas, prior state, command outputs, MCP definitions, cache behavior, and generated reasoning.

If we want to optimize agent workflows, we need to inspect more than the final chat transcript.

cost-xray gives developers a way to see the hidden budget:

what fills the window
what costs money
what is cached
what is unused
what tool outputs keep coming back

What I Would Try Next

I would use cost-xray for a week of normal agent work, then look for repeated patterns:

Which MCP servers are configured but rarely called?
Which tools return too much output?
Which projects have unusually large prefixes?
Which sessions spend heavily on cache writes?
Which prompts or workflows cause context-window bloat?

That could turn into a practical cleanup pass for an AI development setup.

Code

The project is open source here:

tigerless-labs / cost-xray

See what Claude Code and Codex actually send to the API — and what each part costs.

cost-xray

See what your AI coding agent actually sends to the API — and what each part costs.

Most usage tools read local logs. That shows the total cost of a call or session, but it misses the request-time context assembled before the model is invoked: system prompts, tool schemas, MCP blocks, tool results, cache reads/writes, and previous thinking blocks.

cost-xray captures the actual local API traffic for Claude Code and Codex, then attributes tokens and dollars back to the sources inside the request. It shows not just how much a turn cost, but why it cost that much.

Home — every session by agent · project · session, with token and dollar totals