OpenTelemetry has become the de facto standard for observing modern systems.
But when you start building AI applications, traditional traces aren't enough.
You don't just want to know that a request happened.
You want to know:
- Which model generated the output?
- Which provider was used?
- How many tokens were consumed?
- What embedding model processed the documents?
- How much did the operation cost?
These questions become even more important when building Retrieval-Augmented Generation (RAG) systems.
Recently, while contributing to Mastra, I discovered an observability gap involving RAG embedding operations.
This led me to open a pull request that introduced proper OpenTelemetry GenAI semantic mappings for RAG_EMBEDDING spans.
The Problem
Mastra already exported rich metadata for several AI operations.
However, RAG embedding spans were missing standardized GenAI semantic attributes.
As a result, observability tools could see that an embedding operation occurred, but they had no standard way to read:
- Model information
- Provider information
- Token usage
- Embedding-specific metadata
Without standardized semantic conventions, dashboards and tracing systems lose valuable context.
This becomes a bigger issue in production environments where teams need visibility into AI workloads.
Understanding RAG Embedding Spans
A typical RAG pipeline looks like this:
Documents
↓
Chunking
↓
Embedding Model
↓
Vector Database
↓
Similarity Search
↓
LLM Generation
The embedding stage is critical.
Every document chunk gets transformed into a vector representation.
If observability data from this stage is incomplete, debugging performance issues becomes significantly harder.
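To make the embedding stage concrete, here is a minimal sketch of chunking plus an embedding step that records the data an observability backend would want. Everything in it is an assumption for illustration: `chunkText`, `fakeEmbed`, `embedWithTelemetry`, and the `EmbeddingSpanRecord` shape are hypothetical names, not Mastra or OpenTelemetry APIs, and the word count is only a rough stand-in for real token counts.

```typescript
// Hypothetical record describing one embedding call; not a Mastra or OpenTelemetry type.
interface EmbeddingSpanRecord {
  spanName: string;
  chunkCount: number;
  inputTokens: number;
  durationMs: number;
}

// Split text into fixed-size chunks by character count (a deliberately naive chunker).
function chunkText(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// Count whitespace-separated words; a placeholder for a real tokenizer.
function countWords(chunk: string): number {
  return chunk.split(/\s+/).filter((word) => word.length > 0).length;
}

// Stand-in embedder: a real pipeline would call an embedding model here.
function fakeEmbed(chunk: string): number[] {
  return [chunk.length, countWords(chunk)];
}

// Embed every chunk and capture the facts a tracing backend would need.
function embedWithTelemetry(chunks: string[]): { vectors: number[][]; span: EmbeddingSpanRecord } {
  const start = Date.now();
  const vectors = chunks.map(fakeEmbed);
  const inputTokens = chunks.reduce((sum, chunk) => sum + countWords(chunk), 0);
  return {
    vectors,
    span: {
      spanName: "rag_embedding",
      chunkCount: chunks.length,
      inputTokens,
      durationMs: Date.now() - start,
    },
  };
}

const demoChunks = chunkText("alpha beta gamma delta", 11);
const demoResult = embedWithTelemetry(demoChunks);
console.log(demoResult.span);
```

If the span record were missing the chunk count or token total, a slow indexing run would show up as a long span with no clue about how much work it actually did.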
Why OpenTelemetry Semantic Conventions Matter
OpenTelemetry doesn't just define traces.
It also defines semantic conventions.
These conventions create a common language for telemetry data.
Instead of every framework inventing custom field names, everyone follows the same standard.
For GenAI workloads, this means tools can automatically understand attributes such as:
gen_ai.system (superseded by gen_ai.provider.name in newer releases of the conventions)
gen_ai.request.model
gen_ai.response.model
gen_ai.usage.input_tokens
gen_ai.usage.output_tokens
Standardization enables better interoperability across platforms and observability vendors.
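A small sketch can show why the shared names matter. The reader function below only knows the standardized attribute names listed above, the way a vendor-neutral dashboard would. The `embedder.*` keys, the `describeSpan` helper, and the sample values are assumptions made up for this example, not names taken from Mastra.

```typescript
type SpanAttributes = Record<string, string | number>;

// Framework-specific keys (hypothetical) that a generic backend would not recognize.
const customAttributes: SpanAttributes = {
  "embedder.modelName": "text-embedding-3-small",
  "embedder.vendor": "openai",
  "embedder.tokens": 128,
};

// The same facts expressed with the GenAI attribute names listed above.
const standardAttributes: SpanAttributes = {
  "gen_ai.system": "openai",
  "gen_ai.request.model": "text-embedding-3-small",
  "gen_ai.usage.input_tokens": 128,
};

// A vendor-neutral reader that only knows the standardized keys.
function describeSpan(attrs: SpanAttributes): string {
  const model = attrs["gen_ai.request.model"];
  const tokens = attrs["gen_ai.usage.input_tokens"];
  if (model === undefined || tokens === undefined) {
    return "unknown GenAI operation";
  }
  return `${model} consumed ${tokens} input tokens`;
}

console.log(describeSpan(customAttributes));
console.log(describeSpan(standardAttributes));
```

Both objects carry the same information, but only the second one is legible to a tool that has never heard of the framework that produced it.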
The Fix
The goal was straightforward:
Map RAG embedding telemetry data to OpenTelemetry's GenAI semantic conventions.
The implementation included:
- Exporting embedding model metadata
- Exporting provider information
- Mapping token usage metrics
- Aligning span attributes with OpenTelemetry standards
- Preserving compatibility with existing tracing infrastructure
This allows downstream observability systems to understand embedding operations without requiring custom integrations.
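The mapping idea can be sketched roughly as follows. This is not the code from the pull request: `RagEmbeddingMetadata`, `mapRagEmbeddingAttributes`, and the `mastra.span.type` key are hypothetical names chosen for illustration. The `gen_ai.operation.name` attribute with the value `embeddings` comes from the GenAI conventions, although it is not in the list earlier in this post.

```typescript
// Hypothetical shape of the metadata a RAG embedding span might carry internally.
interface RagEmbeddingMetadata {
  model?: string;
  provider?: string;
  inputTokens?: number;
}

type OtelAttributes = Record<string, string | number>;

// Translate internal metadata into GenAI semantic attributes.
// Existing attributes are copied first so nothing already exported is lost.
function mapRagEmbeddingAttributes(
  meta: RagEmbeddingMetadata,
  existing: OtelAttributes = {},
): OtelAttributes {
  const attrs: OtelAttributes = { ...existing, "gen_ai.operation.name": "embeddings" };
  // Only emit an attribute when the value is actually known.
  if (meta.provider !== undefined) {
    attrs["gen_ai.system"] = meta.provider;
  }
  if (meta.model !== undefined) {
    attrs["gen_ai.request.model"] = meta.model;
  }
  if (meta.inputTokens !== undefined) {
    attrs["gen_ai.usage.input_tokens"] = meta.inputTokens;
  }
  return attrs;
}

const mappedAttributes = mapRagEmbeddingAttributes(
  { model: "text-embedding-3-small", provider: "openai", inputTokens: 42 },
  { "mastra.span.type": "rag_embedding" }, // hypothetical pre-existing attribute
);
const sparseAttributes = mapRagEmbeddingAttributes({});
console.log(mappedAttributes, sparseAttributes);
```

Skipping unknown values rather than writing placeholders keeps dashboards from treating a missing token count as zero, and spreading the existing attributes first is one simple way to stay compatible with whatever was already being exported.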
Why This Matters for AI Engineers
As AI applications become more complex, observability becomes a first-class requirement.
Production AI systems need answers to questions like:
- Which embedding model is causing latency spikes?
- Which provider incurs the highest cost?
- How many tokens are consumed during indexing?
- Which retrieval operations are failing?
Without standardized telemetry, these questions become difficult to answer.
With proper semantic conventions, observability tools can surface these insights automatically.
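As a rough illustration of what becomes possible once spans share attribute names, the sketch below groups exported spans by `gen_ai.request.model` and reports span count, total input tokens, and average duration per model. The `ExportedSpan` shape, the `summarizeByModel` helper, and the sample data are assumptions for this example; a real backend would run an equivalent query over its own storage.

```typescript
// Hypothetical, simplified view of a span after export.
interface ExportedSpan {
  durationMs: number;
  attributes: Record<string, string | number>;
}

interface ModelStats {
  spanCount: number;
  totalInputTokens: number;
  avgDurationMs: number;
}

// Group spans by model name using only standardized attribute keys.
function summarizeByModel(spans: ExportedSpan[]): Record<string, ModelStats> {
  const totals: Record<string, { count: number; tokens: number; duration: number }> = {};
  for (const span of spans) {
    const rawModel = span.attributes["gen_ai.request.model"];
    const model = rawModel === undefined ? "unknown" : String(rawModel);
    const rawTokens = span.attributes["gen_ai.usage.input_tokens"];
    const tokens = typeof rawTokens === "number" ? rawTokens : 0;
    const previous = totals[model];
    totals[model] = {
      count: (previous ? previous.count : 0) + 1,
      tokens: (previous ? previous.tokens : 0) + tokens,
      duration: (previous ? previous.duration : 0) + span.durationMs,
    };
  }
  const stats: Record<string, ModelStats> = {};
  for (const model of Object.keys(totals)) {
    const t = totals[model];
    stats[model] = {
      spanCount: t.count,
      totalInputTokens: t.tokens,
      avgDurationMs: t.duration / t.count,
    };
  }
  return stats;
}

// Made-up sample spans with placeholder model names.
const sampleSpans: ExportedSpan[] = [
  { durationMs: 100, attributes: { "gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 10 } },
  { durationMs: 300, attributes: { "gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 30 } },
  { durationMs: 50, attributes: { "gen_ai.request.model": "model-b", "gen_ai.usage.input_tokens": 5 } },
];
const modelSummary = summarizeByModel(sampleSpans);
console.log(modelSummary);
```

Cost is not computed here because it depends on provider pricing that the spans do not carry; token totals per model are the input such a calculation would start from.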
Lessons From Open Source
One thing I enjoy about open source is that small improvements often have larger impacts than expected.
This wasn't a flashy feature.
Users won't notice it immediately.
But maintainers, platform engineers, and teams operating AI workloads will benefit from more accurate telemetry and better visibility into their systems.
These kinds of contributions taught me an important lesson:
Not every valuable contribution adds new functionality.
Sometimes the most impactful improvements make existing systems easier to understand, monitor, and operate.
Final Thoughts
AI infrastructure is evolving rapidly.
Frameworks, observability platforms, and standards are all maturing at the same time.
Contributing to these ecosystems provides a unique opportunity to learn how modern AI systems work under the hood.
For me, this contribution was another reminder that reading unfamiliar codebases often leads to discovering interesting problems.
And occasionally, solving one of those problems helps improve the developer experience for everyone else.
If you're contributing to AI infrastructure projects, don't overlook observability.
The best AI systems aren't just intelligent.
They're observable too.
GitHub: https://github.com/Akash504-ai
Open Source Contributor | Backend Engineering | AI Systems | OSS