Fixing AI Observability: How I Added GenAI Semantic Support for RAG Embedding Spans in Mastra

OpenTelemetry has become the de facto standard for instrumenting and observing distributed systems.

But when you start building AI applications, traditional traces aren't enough.

You don't just want to know that a request happened.

You want to know:

Which model generated the output?
Which provider was used?
How many tokens were consumed?
What embedding model processed the documents?
How much did the operation cost?

These questions become even more important when building Retrieval-Augmented Generation (RAG) systems.

Recently, while contributing to Mastra, I discovered an observability gap in how RAG embedding operations were traced.

This led me to open a pull request that introduced proper OpenTelemetry GenAI semantic mappings for RAG_EMBEDDING spans.

The Problem

Mastra already exported rich metadata for several AI operations.

However, RAG embedding spans were missing standardized GenAI semantic attributes.

As a result, observability tools could see that an embedding operation occurred, but they had no standard way to read:

Model information
Provider information
Token usage
Embedding-specific metadata

Without standardized semantic conventions, dashboards and tracing systems lose valuable context.

This becomes a bigger issue in production environments where teams need visibility into AI workloads.

Understanding RAG Embedding Spans

A typical RAG pipeline looks like this:

Documents
    ↓
Chunking
    ↓
Embedding Model
    ↓
Vector Database
    ↓
Similarity Search
    ↓
LLM Generation

The embedding stage is critical.

Every document chunk gets transformed into a vector representation.

If observability data from this stage is incomplete, debugging performance issues becomes significantly harder.
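To make the embedding stage concrete, here is a minimal sketch of the kind of data it can report. Everything in it is hypothetical: `fakeEmbed` stands in for a real embedding model, and `embedChunks` and `EmbeddingStageRecord` are illustrative names, not part of Mastra.

```typescript
// Hypothetical stand-in for a real embedding model: maps text to a tiny vector.
function fakeEmbed(text: string): number[] {
  return [text.length, text.split(" ").length];
}

// Illustrative record of what the embedding stage did (not a Mastra type).
interface EmbeddingStageRecord {
  chunkCount: number;
  totalChars: number;
  durationMs: number;
}

// Embeds every chunk and returns the vectors together with a small record
// that a tracing layer could later attach to a span.
function embedChunks(
  chunks: string[],
  embed: (text: string) => number[],
): { vectors: number[][]; record: EmbeddingStageRecord } {
  const start = Date.now();
  const vectors = chunks.map((chunk) => embed(chunk));
  const record: EmbeddingStageRecord = {
    chunkCount: chunks.length,
    totalChars: chunks.reduce((sum, chunk) => sum + chunk.length, 0),
    durationMs: Date.now() - start,
  };
  return { vectors, record };
}

const stageResult = embedChunks(["hello world", "rag"], fakeEmbed);
console.log(stageResult.record.chunkCount, stageResult.record.totalChars);
```

If a record like this never reaches the tracing backend, a slow indexing run shows up only as one long span with no hint of how many chunks or characters were involved.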

Why OpenTelemetry Semantic Conventions Matter

OpenTelemetry doesn't just define traces.

It also defines semantic conventions.

These conventions create a common language for telemetry data.

Instead of every framework inventing custom field names, everyone follows the same standard.

For GenAI workloads, this means tools can recognize attributes such as the following without custom configuration:

gen_ai.system
gen_ai.request.model
gen_ai.response.model
gen_ai.usage.input_tokens
gen_ai.usage.output_tokens

Standardization enables better interoperability across platforms and observability vendors.
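As a rough illustration, this is what those keys might look like on a single embedding span. The values are made up, and `gen_ai.operation.name` is an extra key from the GenAI conventions that is not in the list above. An embedding call typically reports input tokens only, so the sketch leaves out `gen_ai.usage.output_tokens`. Key names can change between revisions of the conventions, so check the version your exporter targets.

```typescript
// A plain attribute bag; real spans carry something similar.
type GenAiAttributes = Record<string, string | number>;

// Example values are invented for illustration.
const embeddingSpanAttributes: GenAiAttributes = {
  "gen_ai.operation.name": "embeddings", // assumption: not listed in the article
  "gen_ai.system": "openai",
  "gen_ai.request.model": "text-embedding-3-small",
  "gen_ai.response.model": "text-embedding-3-small",
  "gen_ai.usage.input_tokens": 512,
};

// Any consumer that knows the convention can read the same keys,
// no matter which framework produced the span.
function describeSpan(attrs: GenAiAttributes): string {
  const system = attrs["gen_ai.system"];
  const model = attrs["gen_ai.request.model"];
  const tokens = attrs["gen_ai.usage.input_tokens"];
  return `${system}/${model}: ${tokens} input tokens`;
}

console.log(describeSpan(embeddingSpanAttributes));
// prints "openai/text-embedding-3-small: 512 input tokens"
```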

The Fix

The goal was straightforward:

Map RAG embedding telemetry data to OpenTelemetry's GenAI semantic conventions.

The implementation included:

Exporting embedding model metadata
Exporting provider information
Mapping token usage metrics
Aligning span attributes with OpenTelemetry standards
Preserving compatibility with existing tracing infrastructure

This allows downstream observability systems to understand embedding operations without requiring custom integrations.
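The mapping idea can be sketched in a few lines. This is not the code from the pull request: `RagEmbeddingSpan` is a hypothetical shape invented for the example, and Mastra's real span types and exporter look different. The sketch only shows the general pattern of copying known fields onto standard keys while leaving existing attributes untouched.

```typescript
type SpanAttributes = Record<string, string | number>;

// Hypothetical internal span shape; NOT Mastra's real type.
interface RagEmbeddingSpan {
  name: string;
  provider?: string;
  model?: string;
  inputTokens?: number;
  attributes: SpanAttributes; // custom attributes that already exist
}

function toGenAiAttributes(span: RagEmbeddingSpan): SpanAttributes {
  // Start from the existing attributes so current consumers keep working.
  const out: SpanAttributes = { ...span.attributes };
  out["gen_ai.operation.name"] = "embeddings";
  // Only emit a standard key when the underlying value is actually known.
  if (span.provider !== undefined) out["gen_ai.system"] = span.provider;
  if (span.model !== undefined) out["gen_ai.request.model"] = span.model;
  if (span.inputTokens !== undefined) {
    out["gen_ai.usage.input_tokens"] = span.inputTokens;
  }
  return out;
}

const mappedExample = toGenAiAttributes({
  name: "rag embedding",
  provider: "openai",
  model: "text-embedding-3-small",
  inputTokens: 42,
  attributes: { "custom.batch_size": 8 },
});
console.log(mappedExample);
```

Skipping unknown fields, rather than writing empty strings or zeros, keeps dashboards from counting placeholder values as real models or real token usage.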

Why This Matters for AI Engineers

As AI applications become more complex, observability becomes a first-class requirement.

Production AI systems need answers to questions like:

Which embedding model is causing latency spikes?
Which provider generates the highest cost?
How many tokens are consumed during indexing?
Which retrieval operations are failing?

Without standardized telemetry, these questions become difficult to answer.

With proper semantic conventions, observability tools can group, filter, and chart spans by model, provider, and token usage without custom parsing.
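Here is a small sketch of the kind of query this enables, written as plain TypeScript over already-exported spans. `ExportedSpan` and `summarizeByModel` are hypothetical names for the example; a real backend would run the equivalent aggregation in its own query language.

```typescript
// Hypothetical, simplified view of a span after export.
interface ExportedSpan {
  durationMs: number;
  attributes: Record<string, string | number>;
}

interface ModelSummary {
  spanCount: number;
  totalInputTokens: number;
  avgDurationMs: number;
}

// Groups spans by the standard model key and sums tokens and latency.
function summarizeByModel(spans: ExportedSpan[]): Record<string, ModelSummary> {
  const totals: Record<string, { count: number; tokens: number; ms: number } | undefined> = {};
  for (const span of spans) {
    const model = span.attributes["gen_ai.request.model"];
    if (typeof model !== "string") continue; // span lacks the standard key
    const tokens = span.attributes["gen_ai.usage.input_tokens"];
    const entry = totals[model] ?? { count: 0, tokens: 0, ms: 0 };
    entry.count += 1;
    entry.tokens += typeof tokens === "number" ? tokens : 0;
    entry.ms += span.durationMs;
    totals[model] = entry;
  }
  const summary: Record<string, ModelSummary> = {};
  for (const model of Object.keys(totals)) {
    const entry = totals[model];
    if (!entry) continue;
    summary[model] = {
      spanCount: entry.count,
      totalInputTokens: entry.tokens,
      avgDurationMs: entry.ms / entry.count,
    };
  }
  return summary;
}

console.log(
  summarizeByModel([
    { durationMs: 100, attributes: { "gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 10 } },
    { durationMs: 300, attributes: { "gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 20 } },
    { durationMs: 50, attributes: { "custom.model": "model-b" } }, // ignored: non-standard key
  ]),
);
```

The third span in the usage example is dropped because it names its model under a custom key, which is the situation embedding spans are in when the standard attributes are missing.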

Lessons From Open Source

One thing I enjoy about open source is that small improvements often have larger impacts than expected.

This wasn't a flashy feature.

Users won't notice it immediately.

But maintainers, platform engineers, and teams operating AI workloads will benefit from more accurate telemetry and better visibility into their systems.

These kinds of contributions taught me an important lesson:

Not every valuable contribution adds new functionality.

Sometimes the most impactful improvements make existing systems easier to understand, monitor, and operate.

Final Thoughts

AI infrastructure is evolving rapidly.

Frameworks, observability platforms, and standards are all maturing at the same time.

Contributing to these ecosystems provides a unique opportunity to learn how modern AI systems work under the hood.

For me, this contribution was another reminder that reading unfamiliar codebases often leads to discovering interesting problems.

And occasionally, solving one of those problems helps improve the developer experience for everyone else.

If you're contributing to AI infrastructure projects, don't overlook observability.

The best AI systems aren't just intelligent.

They're observable too.

GitHub: https://github.com/Akash504-ai
Open Source Contributor | Backend Engineering | AI Systems | OSS
