I Built UberSim v2.0: A Production-Grade Urban Mobility Intelligence Platform 🚗🧠
Every time you open Uber and see a 2.1× surge multiplier, a complex system has already predicted demand, optimized prices, matched drivers, and logged events for future learning, all within milliseconds.
I wanted to understand how those systems work.
So I built UberSim v2.0.
A Python-based urban mobility intelligence platform that simulates the core engineering challenges behind modern ride-sharing marketplaces.
Instead of building another dashboard project, I wanted to recreate the intelligence layer behind a ride-sharing platform from scratch.
🚀 What's Inside?
🧠 Demand Forecasting
- Spatio-temporal demand prediction (R² = 0.89)
- Weather effects, seasonality, lag features, and neighboring zone influence
- Predicts ride demand across multiple city zones
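To make the lag and neighboring-zone features concrete, here is a minimal NumPy sketch. It is not the code from the repo: the function names, the `(T, Z)` layout (time steps by zones), and the chosen lags are assumptions for illustration.

```python
import numpy as np

def build_lag_features(demand, lags=(1, 2, 24)):
    """Turn a (T, Z) demand matrix into lagged features and targets.

    Returns X with shape (T - max_lag, Z, len(lags)) and
    y with shape (T - max_lag, Z).
    """
    demand = np.asarray(demand, dtype=float)
    max_lag = max(lags)
    T = demand.shape[0]
    # For each lag, take the slice that lines up with the targets demand[max_lag:].
    X = np.stack([demand[max_lag - lag : T - lag] for lag in lags], axis=-1)
    y = demand[max_lag:]
    return X, y

def neighbor_mean(demand, adjacency):
    """Average demand over each zone's neighbors at every time step."""
    demand = np.asarray(demand, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)
    degree[degree == 0] = 1.0  # isolated zones: avoid division by zero
    return demand @ adjacency.T / degree
```

The lagged tensor and the neighbor averages can then be flattened per zone and fed to any regressor, alongside weather and calendar features.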
🕸️ Graph Neural Networks
- Models the city as a graph
- Nodes = city zones
- Edges = historical trip flows
- Captures spatial mobility patterns that traditional models miss
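As a rough illustration of the zones-as-nodes, flows-as-edges idea, the sketch below builds a flow-weighted adjacency matrix from trip records and applies one graph-convolution-style layer in NumPy. The names `flow_adjacency` and `message_pass`, and the choice of row normalization with self-loops, are assumptions rather than the project's actual architecture.

```python
import numpy as np

def flow_adjacency(trips, n_zones):
    """Build a row-normalized adjacency matrix from (origin, destination) trips."""
    A = np.zeros((n_zones, n_zones))
    for origin, dest in trips:
        A[origin, dest] += 1.0
    A += np.eye(n_zones)  # self-loops so each zone keeps its own features
    return A / A.sum(axis=1, keepdims=True)

def message_pass(A_norm, H, W):
    """One layer: aggregate neighbor features, project with W, apply ReLU."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Three zones, three historical trips.
A_norm = flow_adjacency([(0, 1), (0, 1), (1, 2)], n_zones=3)
H = np.eye(3)                      # one-hot zone features
out = message_pass(A_norm, H, np.eye(3))
```

Stacking a few such layers lets each zone's representation absorb information from zones several hops away in the trip-flow graph.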
🤖 Reinforcement Learning Pricing
Built a PPO-based surge pricing engine that learns pricing policies instead of relying on hand-crafted rules.
Optimizes multiple objectives simultaneously:
- 📈 Platform revenue
- 🚕 Driver earnings
- 😊 Rider welfare
- ⏱️ Wait times
- ⚖️ Fairness constraints
One interesting finding:
The RL agent learned to gradually increase surge prices instead of aggressively reacting to demand spikes. This behavior wasn't explicitly programmed.
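One common way to hand several objectives to a single PPO agent is to fold them into a scalar reward. The sketch below assumes a weighted sum with a penalty on the spread of surge multipliers across zones as a stand-in fairness constraint. The weights, the `fairness_cap` threshold, and the function name are hypothetical, not values from the project.

```python
def pricing_reward(revenue, driver_earnings, rider_surplus, avg_wait_min,
                   zone_surges, weights=None, fairness_cap=0.5):
    """Scalarize the pricing objectives into one reward value.

    zone_surges: surge multipliers currently applied in each zone.
    fairness_cap: largest surge spread tolerated before a penalty applies.
    """
    w = {"revenue": 1.0, "driver": 0.5, "rider": 0.5, "wait": 0.2, "fairness": 2.0}
    if weights:
        w.update(weights)
    # Penalize only the part of the spread that exceeds the cap.
    spread = max(zone_surges) - min(zone_surges)
    fairness_penalty = max(0.0, spread - fairness_cap)
    return (w["revenue"] * revenue
            + w["driver"] * driver_earnings
            + w["rider"] * rider_surplus
            - w["wait"] * avg_wait_min
            - w["fairness"] * fairness_penalty)
```

Raising the fairness weight trades revenue for more uniform pricing, which is one way to see how a fairness term reshapes what the agent learns.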
⚡ Kafka-Style Real-Time Streaming
Implemented an event-driven architecture with:
- Ride request streams
- Driver status updates
- Pricing events
- Match results
Supports historical replay and live marketplace metrics.
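The core of a Kafka-style design is an append-only log per topic, where consumers read from an offset. A minimal in-memory sketch, assuming hypothetical topic names and an `InMemoryLog` class that is not taken from the repo:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Event:
    topic: str
    payload: dict
    timestamp: float = field(default_factory=time.time)

class InMemoryLog:
    """Append-only event log with per-topic offsets, held in memory."""

    def __init__(self):
        self._topics = {}

    def publish(self, topic, payload):
        """Append an event and return its offset within the topic."""
        events = self._topics.setdefault(topic, [])
        events.append(Event(topic, payload))
        return len(events) - 1

    def replay(self, topic, from_offset=0):
        """Yield events from a given offset, oldest first."""
        yield from self._topics.get(topic, [])[from_offset:]

log = InMemoryLog()
log.publish("ride_requests", {"rider": "r1", "zone": 3})
log.publish("ride_requests", {"rider": "r2", "zone": 5})
latest = list(log.replay("ride_requests", from_offset=1))
```

Because events are never mutated, replaying from offset 0 reproduces the full history, which is what makes historical replay and metric recomputation straightforward.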
🧠 Driver State LSTM
Predicts four operational driver states:
- online_idle
- online_busy
- relocating
- offline
Built entirely in NumPy with Backpropagation Through Time and Adam optimization.
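For readers who have not written an LSTM by hand, the forward pass of one cell is short. The sketch below covers only the forward step and a softmax over the four states. The stacked-gate weight layout, the dimensions, and the random inputs are assumptions, and the BPTT and Adam parts are omitted.

```python
import numpy as np

STATES = ["online_idle", "online_busy", "relocating", "offline"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has shape (4*hidden, input+hidden), b has shape (4*hidden,)."""
    hidden = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[:hidden])                 # input gate
    f = sigmoid(z[hidden:2 * hidden])       # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])   # output gate
    g = np.tanh(z[3 * hidden:])             # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def state_probs(h, W_out, b_out):
    """Softmax over the four driver states."""
    logits = W_out @ h + b_out
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 8
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_in + n_hidden))
b = np.zeros(4 * n_hidden)
W_out = rng.normal(scale=0.1, size=(len(STATES), n_hidden))
b_out = np.zeros(len(STATES))

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x in rng.normal(size=(5, n_in)):  # five time steps of made-up driver features
    h, c = lstm_step(x, h, c, W, b)
probs = state_probs(h, W_out, b_out)
```

Training would backpropagate a cross-entropy loss on `probs` through every time step, which is where Backpropagation Through Time comes in.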
🧪 Counterfactual A/B Testing
Implemented production-style experimentation techniques:
- IPS (Inverse Propensity Scoring)
- Doubly Robust Estimation
- CUPED variance reduction
- Bootstrap confidence intervals
This makes it possible to evaluate pricing policies offline from logged data, without deploying every experiment in production.
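Two of these estimators fit in a few lines each. The sketch below shows textbook IPS and CUPED in NumPy. The function names and the toy data are illustrative assumptions, not the project's implementation or results.

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse Propensity Scoring estimate of a target policy's mean reward.

    logging_probs / target_probs: probability that each policy assigns
    to the action that was actually logged.
    """
    rewards = np.asarray(rewards, dtype=float)
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.mean(weights * rewards))

def cuped_adjust(y, x):
    """CUPED: remove the part of metric y explained by pre-experiment covariate x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())

# Toy data: y is strongly correlated with the pre-period covariate x.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)
y_adj = cuped_adjust(y, x)
```

The adjusted metric keeps the same mean as `y` but has lower variance, so the same experiment can detect smaller effects. IPS, by contrast, reweights logged rewards so they look as if the target policy had chosen the actions.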
🗺️ Multi-Modal Transit Planning
Journey planning across six transportation modes:
- 🚗 Rideshare
- 🚌 Bus
- 🚇 Subway
- 🚲 Bike
- 🛴 Scooter
- 🚶 Walking
Uses A* and Dijkstra shortest-path search to balance:
- Travel time
- Cost
- CO₂ emissions
- Number of transfers
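A sketch of how these four criteria can be combined in a single Dijkstra search: each edge's time, cost, and CO₂ are scalarized with weights, and a fixed penalty is added whenever the mode changes. The search state is `(node, mode)` so that transfers are counted correctly. The weights, the penalty, the edge format, and the toy network are all assumptions for illustration.

```python
import heapq

def plan_journey(edges, start, goal, weights=(1.0, 0.5, 0.1), transfer_penalty=5.0):
    """Dijkstra over (node, mode) states.

    edges: {node: [(neighbor, mode, minutes, dollars, kg_co2), ...]}
    Returns (score, path), where path is a list of (from, to, mode) legs,
    or None if the goal is unreachable.
    """
    w_time, w_cost, w_co2 = weights
    best = {(start, None): 0.0}
    counter = 0  # tie-breaker so the heap never compares paths or modes
    heap = [(0.0, counter, start, None, [])]
    while heap:
        score, _, node, mode, path = heapq.heappop(heap)
        if node == goal:
            return score, path
        if score > best.get((node, mode), float("inf")):
            continue  # stale heap entry
        for nxt, nxt_mode, minutes, dollars, co2 in edges.get(node, []):
            step = w_time * minutes + w_cost * dollars + w_co2 * co2
            if mode is not None and nxt_mode != mode:
                step += transfer_penalty
            new_score = score + step
            if new_score < best.get((nxt, nxt_mode), float("inf")):
                best[(nxt, nxt_mode)] = new_score
                counter += 1
                leg = (node, nxt, nxt_mode)
                heapq.heappush(heap, (new_score, counter, nxt, nxt_mode, path + [leg]))
    return None

network = {
    "A": [("B", "walk", 10, 0.0, 0.0), ("B", "rideshare", 4, 8.0, 1.0)],
    "B": [("C", "subway", 6, 2.5, 0.2)],
}
score, path = plan_journey(network, "A", "C")
```

Changing the weights shifts the recommendation: a rider who weights cost more heavily would be routed through the walking leg instead of rideshare.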
💡 What I Learned
The hardest problem isn't maximizing revenue.
It's maximizing revenue while remaining fair.
Without constraints, optimization naturally prioritizes high-demand areas and disadvantages low-supply neighborhoods.
Adding fairness fundamentally changes the optimization landscape.
Some other takeaways:
- RL discovers strategies humans don't explicitly program.
- GNNs capture spatial relationships that tabular models miss.
- Causal inference is essential for policy evaluation.
- Pure NumPy goes further than people think: it was enough to build and train an LSTM, BPTT and Adam included.
🛠️ Tech Stack
Python · Streamlit · Plotly · Stable-Baselines3 · NetworkX · NumPy · scikit-learn · Gymnasium
🔮 What's Next?
- [ ] Graph Attention Networks (GAT)
- [ ] Multi-Agent Reinforcement Learning
- [ ] Real Kafka Broker Integration
- [ ] WebGL City Visualization
- [ ] Real-World Dataset Integration (NYC TLC, Chicago Divvy)
🔗 GitHub
https://github.com/kh-bikash/ubersim
Feedback, ideas, and contributions are welcome 🚀













