Google just quietly made AI voice agents significantly better, and most people haven’t noticed yet.
Here’s what happened and why it matters if you’re building voice AI:
Google contributed a gRPC transport package to the Model Context Protocol (MCP), the open standard that lets AI agents talk to external tools and services.
On the surface that sounds like a boring infrastructure change.
It isn’t.
The problem it solves
Every time your voice agent calls a tool (checking a calendar, looking up a customer record, fetching live data), it's making an MCP request.
With the current default (JSON-RPC over HTTP), that costs ~9ms per call. In a chat interface, nobody notices. In a voice conversation where your agent makes 4–5 tool calls per turn, those milliseconds stack up into something the human ear does notice: a pause that feels unnatural.
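Back-of-the-envelope, using the per-call figures above (the 5-calls-per-turn count is an assumption from the "4–5 tool calls" range):

```python
# Illustrative latency math: transport overhead across one conversational turn.
# Per-call figures from the post: ~9 ms for JSON-RPC/HTTP, ~0.5 ms for gRPC.
JSON_RPC_MS = 9.0
GRPC_MS = 0.5
CALLS_PER_TURN = 5  # assumed: the post says 4-5 tool calls per turn

http_overhead = JSON_RPC_MS * CALLS_PER_TURN  # transport cost per turn, HTTP
grpc_overhead = GRPC_MS * CALLS_PER_TURN      # transport cost per turn, gRPC

print(f"JSON-RPC/HTTP: {http_overhead:.1f} ms of transport overhead per turn")
print(f"gRPC:          {grpc_overhead:.1f} ms of transport overhead per turn")
print(f"Saved:         {http_overhead - grpc_overhead:.1f} ms per turn")
```

Tens of milliseconds per turn, every turn, is exactly the range where a pause starts to feel unnatural.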
gRPC changes the math entirely.
Instead of opening a new HTTP connection for every tool call, gRPC holds one persistent bidirectional stream open for the entire session. Messages flow as Protocol Buffers (binary, typed, compact) instead of JSON text.
The result: ~0.5ms per tool call. Roughly 17x faster.
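To make "binary, typed, compact" concrete, here's a rough size comparison. This is not the actual protobuf wire format or the real MCP envelope, just a hand-packed binary layout standing in for it:

```python
import json
import struct

# The same (simplified, hypothetical) tool-call message two ways:
# as a JSON-RPC-style text envelope, and hand-packed into fixed binary fields.
call_id, tool_id, arg = 42, 7, 20260301  # e.g. a date argument as an int

json_payload = json.dumps(
    {"jsonrpc": "2.0", "id": call_id,
     "method": "tools/call",
     "params": {"tool": tool_id, "date": arg}}
).encode("utf-8")

# Three unsigned 32-bit integers: call id, tool id, argument.
binary_payload = struct.pack("<III", call_id, tool_id, arg)

print(len(json_payload), "bytes as JSON text")
print(len(binary_payload), "bytes hand-packed")
```

The JSON version spends most of its bytes on field names and punctuation; the typed binary layout carries only the values. Protobuf adds a little framing on top of this, but the compactness story is the same.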
What this actually means for voice AI companies
The bottleneck in a voice agent isn't the LLM anymore; it's the round trips: STT → LLM → tool calls → TTS. Every step adds to the time between the user finishing their sentence and hearing a response.
Shaving ~8.5ms per tool call might sound trivial. But in a high-frequency agentic loop (multiple tools, real-time data, parallel calls) it compounds. The difference between a voice agent that feels alive and one that feels laggy is often measured in tens of milliseconds.
Companies like Spotify already validated this internally. Their engineers described it as reducing the work needed to build MCP servers while gaining the familiarity and structure their teams already had with gRPC.
What stays the same
This is the part worth emphasising: MCP's semantic layer (how tools are described, how prompts work, how agents discover capabilities) is completely untouched.
JSON-RPC over HTTP remains the default. gRPC is a first-class option, not a replacement.
The MCP SDK now supports pluggable transports. You choose the right one for your scale.
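A pluggable-transport design usually boils down to a small interface the client codes against. This sketch is purely illustrative (all class and method names here are hypothetical, not the MCP SDK's actual API):

```python
from typing import Protocol


class Transport(Protocol):
    """Hypothetical transport interface: the client only needs send()."""
    def send(self, method: str, params: dict) -> dict: ...


class JsonRpcHttpTransport:
    """Stand-in for the default transport: a fresh HTTP exchange per call."""
    def send(self, method: str, params: dict) -> dict:
        # Real code would POST a JSON-RPC envelope and parse the response.
        return {"transport": "http", "method": method, "params": params}


class GrpcTransport:
    """Stand-in for the gRPC transport: one persistent stream per session."""
    def __init__(self) -> None:
        self.stream_open = True  # opened once, reused for every call
    def send(self, method: str, params: dict) -> dict:
        # Real code would write a protobuf message onto the open stream.
        return {"transport": "grpc", "method": method, "params": params}


class McpClient:
    """The agent-facing client is transport-agnostic: inject either one."""
    def __init__(self, transport: Transport) -> None:
        self.transport = transport
    def call_tool(self, name: str, args: dict) -> dict:
        return self.transport.send("tools/call", {"name": name, "arguments": args})


# Swapping transports changes the wire, not the semantics.
client = McpClient(GrpcTransport())
result = client.call_tool("check_calendar", {"date": "2026-03-01"})
print(result["transport"])
```

The point of the pattern: everything above the transport (tool names, arguments, discovery) stays identical whether you're on HTTP or gRPC.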
The bigger picture
We’re moving into an era where voice agents aren’t demos; they’re infrastructure. They’re handling customer calls, running internal workflows, and operating in real-time environments where latency isn’t just a metric, it’s the user experience.
The tooling needs to match that ambition.
Google contributing gRPC transport to MCP is a signal that the industry is taking the infrastructure layer of AI agents seriously. Not just the models. Not just the prompts. The wire.
If you’re building voice AI and still defaulting to HTTP for your tool calls, it’s worth paying attention to what just became possible.
Curious what latency improvements others are seeing in their voice agent pipelines? Drop a comment below. 👇