We are going to build a lightweight data analyst agent that reads a CSV file, reasons about its contents, and writes Python code to generate charts and summary statistics. This is useful for developers and analysts who need to turn raw data into visual insights without writing boilerplate pandas or matplotlib code by hand.
What you'll need
- Python 3.10 or newer
- The required packages, installed with: pip install openai pandas matplotlib
- An Oxlo.ai API key from https://portal.oxlo.ai
Step 1: Set up the Oxlo.ai client
First, import the OpenAI SDK and instantiate the client pointing at Oxlo.ai. Because Oxlo.ai is fully OpenAI API compatible, this requires only a base URL change.
from openai import OpenAI
import os
client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ.get("OXLO_API_KEY"),
)
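Because os.environ.get returns None when the variable is unset, a missing key may only surface later as a less specific error. The following is a minimal sketch of an early check. The helper name require_api_key is hypothetical and not part of the OpenAI SDK.

```python
import os


def require_api_key(var_name="OXLO_API_KEY", env=None):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for this tutorial, not part of the OpenAI SDK.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before creating the client."
        )
    return key
```

With this in place, the client could be created with api_key=require_api_key() instead of reading the environment directly.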
Step 2: Craft the system prompt
The system prompt constrains the model to return only executable Python. Keeping the response format strict makes parsing reliable.
SYSTEM_PROMPT = """You are a Python data analyst.
The user will provide a CSV file path, a preview of the data, and a question.
Write only valid Python code that:
1. Reads the CSV into a pandas DataFrame using the variable csv_path
2. Performs the analysis or visualization requested
3. Saves any plot to 'output_chart.png' with tight layout
4. Prints key numeric results to stdout
Do not include markdown fences, explanations, or import statements for pandas or matplotlib. These are already available in the execution environment.
"""
Step 3: Load data and query the model
I will create a sample sales CSV so the tutorial is self-contained. Then I will build a helper that sends the file path, schema preview, and user question to the model.
import pandas as pd
# Create a self-contained sample dataset
sample_csv = "sales.csv"
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "revenue": [12000, 15000, 13000, 17000, 16000, 19000],
    "costs": [8000, 9500, 8200, 11000, 10000, 11500],
})
df.to_csv(sample_csv, index=False)
def build_prompt(csv_path, question, df):
    # Describe the data to the model: column names, dtypes, and the first three rows
    schema = f"Columns: {list(df.columns)}\nTypes:\n{df.dtypes}\nPreview:\n{df.head(3).to_string()}"
    return f"CSV path: {csv_path}\n{schema}\nQuestion: {question}"
user_message = build_prompt(sample_csv, "Plot revenue vs costs as a grouped bar chart and print total profit.", df)
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ],
)
generated_code = response.choices[0].message.content
print(generated_code)
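The helper above takes a DataFrame that is already loaded. For files too large to load just for a preview, one possible variation reads only the header and the first few rows with the standard library's csv module. This is a sketch, not part of the tutorial's agent: preview_csv is a hypothetical name, and unlike build_prompt it does not report column types.

```python
import csv
from itertools import islice


def preview_csv(csv_path, n_rows=3):
    """Build a short text preview from the header and the first n_rows rows."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # stop reading after n_rows
    lines = [f"Columns: {header}", "Preview:"]
    lines += [", ".join(row) for row in rows]
    return "\n".join(lines)
```

The returned string could stand in for the schema portion of the user message.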
Step 4: Execute the generated visualization code
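The model returns Python source as a string, which may still contain accidental markdown fences. I strip those and run the code with exec in a namespace that already includes pandas and matplotlib. Note that exec provides no sandboxing: the generated code runs with the same privileges as your script, so only run it on data and prompts you trust.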
import matplotlib.pyplot as plt
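def run_generated_code(code_string, csv_path):
    # Remove markdown fences the model may have added despite the system prompt
    cleaned = code_string.replace("```python", "").replace("```", "").strip()
    # Names the generated code can rely on without importing or defining them
    namespace = {
        "pd": pd,
        "plt": plt,
        "csv_path": csv_path,
    }
    exec(cleaned, namespace)
    return namespace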
run_generated_code(generated_code, sample_csv)
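Chained string replacement removes fence markers wherever they occur, leaves fragments behind for other language tags (a "py" tag, for example), and keeps any prose the model adds around the fence. A regular expression that extracts the body of the first fenced block is one alternative. The sketch below assumes a reply contains at most one block of interest, and strip_fences is a hypothetical name.

```python
import re

# Built from single backticks so the marker never appears literally in this file.
FENCE = "`" * 3

# Opening fence, optional language tag, newline, body (captured), closing fence.
FENCE_RE = re.compile(FENCE + r"[ \t]*[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def strip_fences(text):
    """Return the body of the first fenced block, or the text itself if unfenced."""
    match = FENCE_RE.search(text)
    body = match.group(1) if match else text
    return body.strip()
```

Unfenced replies pass through unchanged apart from whitespace trimming, so the helper can be applied to every response.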
Step 5: Wrap it in a reusable agent class
Now I will package the prompt builder, API call, and executor into a reusable class. Oxlo.ai's flat request-based pricing keeps the cost predictable even when you pass long data previews or iterate across multiple questions.
class DataVizAgent:
    def __init__(self, api_key, model="deepseek-v3.2"):
        self.client = OpenAI(
            base_url="https://api.oxlo.ai/v1",
            api_key=api_key,
        )
        self.model = model
        self.system_prompt = SYSTEM_PROMPT

    def analyze(self, csv_path, question):
        # Build the user message from the file path, schema preview, and question
        df = pd.read_csv(csv_path)
        schema = f"Columns: {list(df.columns)}\nTypes:\n{df.dtypes}\nPreview:\n{df.head(3).to_string()}"
        user_message = f"CSV path: {csv_path}\n{schema}\nQuestion: {question}"
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": user_message},
            ],
        )
        code = response.choices[0].message.content
        # Remove markdown fences the model may have added despite the system prompt
        cleaned = code.replace("```python", "").replace("```", "").strip()
        # Close figures left over from earlier calls so charts do not overlap
        plt.close("all")
        namespace = {"pd": pd, "plt": plt, "csv_path": csv_path}
        exec(cleaned, namespace)
        return namespace
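As written, analyze lets the generated code print straight to stdout, and any exception in that code propagates to the caller. If you would rather collect the output and the error text, for logging or for feeding back to the model, the standard library's contextlib.redirect_stdout can capture it. This is a sketch with a hypothetical helper name, run_captured. It does not add any isolation.

```python
import contextlib
import io
import traceback


def run_captured(code, namespace=None):
    """Execute code and return (stdout_text, error_text_or_None).

    exec still runs with the full privileges of this process;
    capturing output is not a sandbox.
    """
    namespace = {} if namespace is None else namespace
    buffer = io.StringIO()
    error = None
    with contextlib.redirect_stdout(buffer):
        try:
            exec(code, namespace)
        except Exception:
            # Keep the traceback as text instead of letting it propagate.
            error = traceback.format_exc()
    return buffer.getvalue(), error
```

Inside the class, the exec call could be swapped for run_captured(cleaned, namespace), with the two return values handed back alongside the namespace.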
Run it
Instantiate the agent and run two different questions against the same CSV. The flat per-request pricing on Oxlo.ai means you can experiment without watching token counters. See https://oxlo.ai/pricing for plan details.
agent = DataVizAgent(api_key=os.environ["OXLO_API_KEY"])
# Run 1
agent.analyze("sales.csv", "Plot revenue vs costs as a grouped bar chart and print total profit.")
# Run 2
agent.analyze("sales.csv", "Create a pie chart of revenue share by month and print the best month.")
After running the second query, stdout shows something like:
Best month: Jun with revenue 19000
And output_chart.png is written to disk.
Wrap-up
Two ways to extend this. First, add multi-turn memory so the agent refines previous charts based on follow-up questions. Second, swap in kimi-k2.6 or qwen-3-32b for reasoning-heavy statistical workloads, since Oxlo.ai hosts both under the same flat request pricing.
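For the first extension, one simple form of memory is to resend earlier question and code pairs as prior chat turns. The sketch below shows that idea only. ChatHistory, its methods, and the max_turns cap are hypothetical names and choices, not something the tutorial's agent already has.

```python
class ChatHistory:
    """Hypothetical helper that accumulates chat messages across questions."""

    def __init__(self, system_prompt, max_turns=5):
        self.system_prompt = system_prompt
        self.max_turns = max_turns
        self.turns = []  # list of (user_message, assistant_code) pairs

    def add_turn(self, user_message, assistant_code):
        self.turns.append((user_message, assistant_code))
        # Keep only the most recent turns so the prompt stays bounded.
        self.turns = self.turns[-self.max_turns:]

    def messages(self, new_user_message):
        """Return the full message list for the next request."""
        msgs = [{"role": "system", "content": self.system_prompt}]
        for user_message, assistant_code in self.turns:
            msgs.append({"role": "user", "content": user_message})
            msgs.append({"role": "assistant", "content": assistant_code})
        msgs.append({"role": "user", "content": new_user_message})
        return msgs
```

The list returned by messages() has the same shape as the messages argument already passed to chat.completions.create, so it could replace the two-element list built in analyze.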







