Building Q-EOS: When Control Theory Meets Multi-Agent AI Governance
How I built a six-agent token economy governance system grounded in academic research, and what I learned about why architecture matters more than algorithms.
The Problem I Wanted to Solve
Token economies are fragile. When a stablecoin loses its peg, the typical response is a static rule: "if price drops below X, buy Y tokens." But static rules are pro-cyclical: they buy aggressively when the treasury is already stressed, and they ignore the difference between a temporary dip and a structural collapse.
I wanted to build something smarter. Not just "LLM makes decisions," but a system where multiple specialized agents collaborate, check each other, and maintain safety guarantees even when individual components fail.
Starting with Theory, Not Code
Most hackathon projects start with a cool idea and work backwards. I started with a paper.
The Dynamic Control Buyback Mechanism (DCBM), published in arXiv:2601.09961, identifies static rule-based buybacks as a root cause of pro-cyclical volatility in token economies. The paper proposes a PID controller as the core stabilizer:
$$u(t) = K_p e(t) + K_i \int e(t)dt + K_d \frac{de(t)}{dt}$$
Where $e(t) = P_{target} - P_{current}$ is the price deviation from peg.
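In discrete time, the integral becomes a running sum and the derivative a finite difference. The following is a minimal sketch of that discretization, not the Q-EOS code: the class name `PIDController`, the `update` method, the step size `dt`, and the peg of 1.0 are illustrative assumptions. The gains are the ones listed for the PID agent in the architecture section.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PIDController:
    """Discrete PID: u = Kp*e + Ki*sum(e*dt) + Kd*(e - e_prev)/dt."""
    kp: float
    ki: float
    kd: float
    target: float = 1.0   # peg price (assumed)
    dt: float = 1.0       # one update per step (assumed)
    _integral: float = field(default=0.0, init=False)
    _prev_error: Optional[float] = field(default=None, init=False)

    def update(self, price: float) -> float:
        error = self.target - price        # e(t) = P_target - P_current
        self._integral += error * self.dt  # rectangle-rule integral
        # On the first step there is no previous error, so skip the derivative.
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


pid = PIDController(kp=3000, ki=50, kd=500)
first = pid.update(0.97)   # price below peg gives a positive control signal
second = pid.update(0.97)  # same error again: only the integral term grows
```

With a constant error the derivative term vanishes and the output rises step by step through the integral term alone, which is the behavior that lets the controller respond to a persistent deviation differently from a one-step dip.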
This gave me a concrete theoretical anchor. Q-EOS isn't a demo of API calling; it's an implementation of a formal framework, extended with multi-agent governance and LLM-powered decision transparency.
The Architecture: Six Agents, One Pipeline
The core insight was separation of concerns. Each agent does exactly one thing:
Observer → Risk → PID → Policy → Governor (Qwen-Plus) → Treasury
- Observer: fetches real-time market price
- Risk: scores threat level (price deviation triggers risk_score=80 when price < 0.97)
- PID: computes optimal intervention using Kp=3000, Ki=50, Kd=500
- Policy: dynamically adjusts intervention strength (multiplier 0.5 to 1.5 based on deviation, risk, and treasury health)
- Governor: Qwen-Plus makes the final APPROVE/REJECT decision with written rationale
- Treasury: executes approved actions, enforces four hard constraints
All agents communicate through a Message Bus β no agent calls another directly. This made testing and debugging dramatically easier.
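To make the "no direct calls" idea concrete, here is a minimal in-process sketch of such a bus. It is an illustration under assumptions, not the Q-EOS implementation: `MessageBus`, `make_agent`, the `trace` field, and the `"Market"` sender are all hypothetical names, and the agents here only forward the payload instead of doing real work.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Handler = Callable[[dict], None]


class MessageBus:
    """Agents register under a name and only ever talk to the bus."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._queue: Deque[Tuple[str, str, dict]] = deque()
        self.log: List[Tuple[str, str]] = []  # (sender, recipient) per delivery

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler

    def send(self, sender: str, recipient: str, payload: dict) -> None:
        self._queue.append((sender, recipient, payload))

    def run(self) -> None:
        while self._queue:
            sender, recipient, payload = self._queue.popleft()
            if recipient not in self._handlers:
                raise KeyError(f"no agent registered as {recipient!r}")
            self.log.append((sender, recipient))
            self._handlers[recipient](payload)


PIPELINE = ["Observer", "Risk", "PID", "Policy", "Governor", "Treasury"]
bus = MessageBus()
completed: List[dict] = []


def make_agent(name: str, next_name: Optional[str]) -> Handler:
    """Placeholder agent: records its name, then forwards via the bus."""
    def handle(payload: dict) -> None:
        out = {**payload, "trace": payload["trace"] + [name]}
        if next_name is None:
            completed.append(out)            # end of the pipeline
        else:
            bus.send(name, next_name, out)   # never calls the next agent directly
    return handle


for i, name in enumerate(PIPELINE):
    nxt = PIPELINE[i + 1] if i + 1 < len(PIPELINE) else None
    bus.register(name, make_agent(name, nxt))

bus.send("Market", "Observer", {"price": 0.97, "trace": []})
bus.run()
```

Because every hop goes through `send`, the bus log is a complete record of who talked to whom, which is what makes a single agent easy to test in isolation.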
The Three-Layer Safety Architecture
One design principle I kept coming back to: in financial governance, "doing nothing" is far better than "doing the wrong thing."
This shaped the safety architecture:
Layer 1: PID Control β computes ideal action
Layer 2: Qwen Governance β approves or rejects with reasoning
Layer 3: Treasury β enforces hard limits regardless of Qwen
The Treasury layer runs independently of Qwen. Even if Qwen approves an action, Treasury will block it if:
- Single transaction exceeds 10% of treasury balance
- Price is in extreme range (< 0.7 or > 1.3)
- Treasury balance falls below 5,000 USDC
- Recent net consumption exceeds 5% of treasury
This is the fail-closed principle: when uncertain, reject and hold. Never default to approving.
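The four constraints above can be sketched as a single check function. This is a hedged illustration, not the Q-EOS Treasury code: the function name `treasury_check` and its parameters are hypothetical, and two details the list leaves open are filled with labeled assumptions (the 5,000 USDC floor is applied to the balance after the transaction, and "recent net consumption" is passed in as a number that is then counted together with the proposed amount).

```python
import math
from typing import List, Tuple

MAX_TX_FRACTION = 0.10       # single transaction <= 10% of balance
PRICE_MIN, PRICE_MAX = 0.7, 1.3
MIN_BALANCE = 5_000.0        # USDC floor
MAX_RECENT_FRACTION = 0.05   # recent net consumption <= 5% of balance


def treasury_check(amount, balance, price, recent_net_spend) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons). Any doubt about the inputs means 'blocked'."""
    inputs = (amount, balance, price, recent_net_spend)
    # Fail closed: missing, non-numeric, or non-finite inputs block the action.
    if any(not isinstance(x, (int, float)) or not math.isfinite(x) for x in inputs):
        return False, ["invalid input"]
    if amount <= 0 or balance <= 0:
        return False, ["non-positive amount or balance"]

    reasons: List[str] = []
    if amount > MAX_TX_FRACTION * balance:
        reasons.append("transaction exceeds 10% of treasury")
    if price < PRICE_MIN or price > PRICE_MAX:
        reasons.append("price in extreme range")
    if balance - amount < MIN_BALANCE:          # assumption: post-transaction floor
        reasons.append("balance would fall below 5,000 USDC")
    if recent_net_spend + amount > MAX_RECENT_FRACTION * balance:  # assumption
        reasons.append("recent net consumption exceeds 5% of treasury")
    return (not reasons), reasons


ok, why = treasury_check(59.045, 49_272.05, 0.9693, 0.0)
blocked, blocked_why = treasury_check(6_000, 50_000, 0.97, 0.0)
```

The function never receives the Governor's decision at all, so an approval from the LLM cannot loosen any limit.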
What Broke (And How I Fixed It)
JSON Parsing Hell
Qwen-Plus doesn't always return clean JSON. Sometimes it wraps the response in Markdown code fences:
```json
{"decision": "APPROVE", "reason": "..."}
```
A direct json.loads(text) call on that reply raises json.JSONDecodeError. I built a three-layer parser:
- Try direct parse
- Strip Markdown fences, try again
- Regex-extract the first {...} block
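The three layers can be sketched as follows. This is a minimal illustration rather than the parser from the repository: the names `parse_decision`, `FENCE_RE`, and `BRACE_RE` are hypothetical, and the fallback to REJECT on any unparseable reply is an assumption that follows the fail-closed principle described earlier. The brace regex is greedy (first `{` to last `}`) so that nested objects survive.

```python
import json
import re
from typing import Optional

# A reply wrapped in a Markdown fence, with an optional "json" language tag.
FENCE_RE = re.compile(r"^\s*`{3}(?:json)?\s*\n?(.*?)\n?\s*`{3}\s*$",
                      re.DOTALL | re.IGNORECASE)
# Greedy: spans from the first "{" to the last "}" in the text.
BRACE_RE = re.compile(r"\{.*\}", re.DOTALL)

FALLBACK = {"decision": "REJECT", "reason": "unparseable governor reply"}


def _try_load(text: str) -> Optional[dict]:
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_decision(text) -> dict:
    if isinstance(text, str):
        obj = _try_load(text)                      # layer 1: direct parse
        if obj is None:
            m = FENCE_RE.match(text)               # layer 2: strip the fence
            if m:
                obj = _try_load(m.group(1))
        if obj is None:
            m = BRACE_RE.search(text)              # layer 3: regex extraction
            if m:
                obj = _try_load(m.group(0))
        if obj is not None and obj.get("decision") in ("APPROVE", "REJECT"):
            return obj
    return dict(FALLBACK)                          # fail closed


fence = "`" * 3
fenced_reply = fence + 'json\n{"decision": "APPROVE", "reason": "ok"}\n' + fence
chatty_reply = 'Sure! {"decision": "REJECT", "reason": "risk"} Hope that helps.'
```

Checking the `decision` value after parsing matters as much as the parsing itself: valid JSON with an unexpected decision string is treated the same as no JSON at all.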
The Silent Policy Layer Bug
Early in development, pid.py was sending messages directly to "Governor", bypassing Policy entirely. The Policy agent was running but receiving zero messages and doing nothing. The six-agent pipeline was secretly a five-agent pipeline.
The fix was one line: change "Governor" to "Policy" in the message destination. But finding it required carefully tracing every message through the bus.
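A bug like this can be caught automatically if the bus keeps a delivery log. The sketch below assumes a hypothetical log format of `(sender, recipient)` pairs and a hypothetical helper `silent_agents`; neither is taken from the Q-EOS code.

```python
from collections import Counter
from typing import List, Sequence, Tuple

PIPELINE = ["Observer", "Risk", "PID", "Policy", "Governor", "Treasury"]


def silent_agents(delivery_log: Sequence[Tuple[str, str]],
                  agents: Sequence[str]) -> List[str]:
    """Return the agents that never received a message."""
    received = Counter(recipient for _sender, recipient in delivery_log)
    return [name for name in agents if received[name] == 0]


# Reconstruction of the bug described above: PID skips Policy.
buggy_log = [
    ("Market", "Observer"),
    ("Observer", "Risk"),
    ("Risk", "PID"),
    ("PID", "Governor"),
    ("Governor", "Treasury"),
]
missing = silent_agents(buggy_log, PIPELINE)
```

Running a check like this after every simulated step turns a silent routing error into an immediate failure.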
The USE_QWEN=False Trap
I added a fast mode (USE_QWEN=False) for development: it skips real API calls and uses local if-else rules. I accidentally left it on for one batch of runs, producing data that showed 0% rejection rate and a treasury that inexplicably grew to $55k. The numbers looked great. They were completely fake.
Lesson: always verify which mode you're actually running in before trusting simulation data.
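One way to enforce this is to stamp every simulation record with the mode that produced it and refuse to analyze anything else. This is a sketch under assumptions: `mode_label`, `stamp`, `require_mode`, and the `governor_mode` field are hypothetical names, and `USE_QWEN` is modeled here as a plain module-level boolean.

```python
from typing import Iterable, List


def mode_label(use_qwen: bool) -> str:
    return "qwen" if use_qwen else "local-rules"


def stamp(record: dict, use_qwen: bool) -> dict:
    """Attach the governor mode to a result record at creation time."""
    return {**record, "governor_mode": mode_label(use_qwen)}


def require_mode(records: Iterable[dict], expected: str = "qwen") -> List[dict]:
    """Raise if any record is unstamped or came from a different mode."""
    records = list(records)
    bad = [r for r in records if r.get("governor_mode") != expected]
    if bad:
        raise ValueError(
            f"{len(bad)} of {len(records)} records were not produced in {expected!r} mode"
        )
    return records


USE_QWEN = False  # the fast development mode described above
run = [stamp({"day": d, "decision": "APPROVE"}, USE_QWEN) for d in range(3)]

try:
    require_mode(run)
    trusted = True
except ValueError:
    trusted = False
```

Unstamped records fail the check too, so results from before the stamping was added cannot slip into an analysis.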
The Baseline Experiment That Surprised Me
To test whether the multi-agent design actually helps, I ran three configurations over the same 30-day market series:
| Metric | Single Agent | Single + PID | Q-EOS Multi-Agent |
|---|---|---|---|
| Final Treasury (USDC) | 45,588 (-4,412) | 50,000 (+0) | 53,351 (+3,351) |
| Execution Rate | 100% | 0% | 100% |
| Max Drawdown | 12.2% | 0% | 1.8% |
The Single+PID result was the most revealing. I gave it the exact same PID algorithm as Q-EOS (same Kp, Ki, Kd) but with a single Qwen instance handling all roles. It rejected every single proposal for 30 consecutive days.
Why? A single Qwen instance reviewing its own proposals has no separation between perception (Observer), risk scoring (Risk), and execution enforcement (Treasury). With no independent checks, it consistently judged interventions as too risky to approve.
Multi-agent architecture wasn't just better; it was the only thing that worked.
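For readers who want to reproduce the Max Drawdown row, a common definition is the largest peak-to-trough decline of the treasury balance, expressed as a fraction of the peak. The sketch below assumes that definition (the exact formula used in the experiment is not stated here), and the balance series is toy data, not output from the runs in the table.

```python
from typing import Sequence


def max_drawdown(balances: Sequence[float]) -> float:
    """Largest drop from a running peak, as a fraction of that peak."""
    if not balances:
        raise ValueError("need at least one balance")
    peak = balances[0]
    worst = 0.0
    for balance in balances:
        peak = max(peak, balance)
        if peak > 0:
            worst = max(worst, (peak - balance) / peak)
    return worst


toy_series = [50_000, 48_000, 49_000, 45_000, 47_000]  # illustrative only
dd = max_drawdown(toy_series)  # peak 50,000, trough 45,000
```

A series that never declines yields a drawdown of zero, which is consistent with the Single + PID column, where no intervention was ever executed.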
What Qwen-Plus Actually Does
Every governance decision includes a written rationale. Here's a real example from Day 340 of the 365-day simulation:
"Treasury balance (49,272.05) is sufficient to absorb the intervention of 59.045 without compromising liquidity or solvency; risk score of 80 is elevated but within acceptable operational thresholds for this asset class and intervention context; price of 0.9693 shows mild deviation but no evidence of extreme volatility or flash crash conditions; no abnormal market conditions detected."
This is what transparency looks like in practice. Every rejection is traceable. Every approval has justification. Nothing is a black box.
Deployment on Alibaba Cloud
Q-EOS runs on Alibaba Cloud ECS, calling Qwen-Plus via the DashScope API. The complete stack:
- Compute: Alibaba Cloud ECS (Ubuntu 20.04)
- LLM: Qwen-Plus via DashScope API
- Framework: Python + custom Message Bus
- Control: PID controller (from arXiv:2601.08399)
- Safety: Three-layer safety architecture with hard treasury constraints
What I Learned
1. Theory first, code second. Starting from a published paper gave Q-EOS a coherence that most projects lack. I could always answer "why did you design it this way?" with a citation.
2. Fail-closed is a principle, not a feature. When the API is unreachable, when JSON is malformed, when the agent pipeline has a bug, the system should reject and hold. Not approve by default. This applies to any system handling real resources.
3. Multi-agent separation of concerns is a governance principle, not just an engineering pattern. A single agent cannot reliably serve as its own auditor. Specialization enables both decisiveness and safety simultaneously, something a single-agent system fundamentally cannot achieve.
4. Measure everything. The baseline comparison was the most convincing part of the submission. Without it, Q-EOS is just "a multi-agent system that seems to work." With it, it's a system with a roughly 7x reduction in max drawdown (12.2% down to 1.8%) and a measurable architectural advantage.
Links
- GitHub: https://github.com/vivayang911/Q-EOS
- Demo Video: https://youtu.be/V3dSjjKAn6o
- Devpost: https://devpost.com/software/q-eos-qwen-economic-agent-society
- Paper: arXiv:2601.08399
Built for the Qwen Cloud Global Hackathon 2026, Agent Society Track.













