The AI Agent Framework That Made Me Rethink Everything I Knew About Hardware Controls! (Part 1)

Part 1: From DeepRacer to Natural Language Robotics

What You'll Learn

In this two-part series, you'll discover how AWS Strands Labs is making it possible to control physical robots with natural language, simulate robotic environments without hardware, and build AI-powered Python functions with built-in validation. Part 1 covers the robotics projects (Strands Robots and Robots Sim), while Part 2 explores AI Functions and practical implementation strategies.

By the end of this series, you'll understand how to get started with experimental agentic AI development, whether you have physical hardware or just want to experiment in simulation.

My Journey: From DeepRacer to Natural Language Robotics

A few years ago, at the AWS Heroes Summit, Paxton Hall gifted me something that genuinely changed how I think about AI/ML and physical hardware: an AWS DeepRacer car. If you've never held one, it's this compact, surprisingly heavy little autonomous vehicle — and the moment I unboxed it, I knew I was holding something that represented a shift in how developers could interact with machine learning. Not through a Jupyter notebook. Not through a REST API. Through a thing that moved in the real world based on a model I trained.

I took it home and immediately started experimenting. I set up a makeshift track in my living room using tape and cardboard boxes. I trained a reinforcement learning model in the AWS console, deployed it to the car, and watched it confidently drive straight into my bookshelf. Then I retrained it. Then it drove into the bookshelf again, but slightly slower — which I chose to interpret as progress. After a few iterations (and one near-miss with my laptop bag), the car was actually navigating the track. It wasn't perfect. But it was mine — a model I built, running on hardware I could hold, making decisions in the physical world.

That experience planted a seed. What if the gap between "AI model" and "physical action" could be made even smaller? What if you didn't need to understand reinforcement learning theory, reward functions, and track geometry just to get started? What if you could just... tell a robot what to do?

That's exactly what AWS shipped on February 23, 2026 — and it's called Strands Labs.

Understanding Strands: The Foundation

If you haven't been following the Strands Agents SDK, here's the quick version: AWS open-sourced it in May 2025, and it's been downloaded over 14 million times since. The SDK — available in both Python and TypeScript — takes a "model-driven" approach to building AI agents. Instead of you writing elaborate orchestration logic, the model itself drives the agent loop. It's simple, it scales, and it's been battle-tested from quick prototypes all the way to enterprise production workloads.

Strands Labs is the experimental arm of this ecosystem. Think of it as the R&D lab that doesn't have to worry about the production release cycle. It's a separate GitHub organization where AWS teams (and now all of Amazon's development teams) can ship frontier experiments with clear use cases, functional code, and tests — without coupling those experiments to the main SDK.

At launch, three projects dropped. In this post, I'll walk you through the two robotics-focused projects. Part 2 will cover AI Functions and practical implementation strategies.

Project 1: Strands Robots — "Place the Apple in the Basket"

This is the one that made me stop scrolling — and immediately think back to my DeepRacer days.

With DeepRacer, the feedback loop was: train model → deploy → watch car → retrain. It was powerful, but the "tell the car what to do" part was entirely encoded in a reward function. You couldn't just say "stay in the center of the lane." You had to mathematically define what that meant.
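To make "mathematically define" concrete, here is a minimal sketch of a center-line reward function in the shape DeepRacer expects. The three parameter keys are standard DeepRacer reward inputs; the band widths and reward values are illustrative choices of mine, not a tuned model.

```python
def reward_function(params):
    """Reward the car for staying close to the center line."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]
    all_wheels_on_track = params["all_wheels_on_track"]

    # Near-zero reward as soon as any wheel leaves the track.
    if not all_wheels_on_track:
        return 1e-3

    # Three bands around the center line (widths are arbitrary assumptions).
    if distance_from_center <= 0.1 * track_width:
        return 1.0
    if distance_from_center <= 0.25 * track_width:
        return 0.5
    if distance_from_center <= 0.5 * track_width:
        return 0.1
    return 1e-3
```

Every number in that function is a decision I had to make and then test on the track, which is exactly the effort a one-sentence instruction removes.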

Strands Robots flips that entirely. Here's the complete code that controls a physical robotic arm:

from strands import Agent
from strands_robots import Robot

# Create robot with cameras
robot = Robot(
    tool_name="my_arm",
    robot="so101_follower",
    cameras={
        "front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
        "wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30}
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam"
)

# Create agent with robot tool
agent = Agent(tools=[robot])
agent("place the apple in the basket")

That's it. You're not writing motor control code. You're not managing servo positions. You're telling an agent what you want in plain English, and the framework figures out the rest.

The DeepRacer parallel: With DeepRacer, I spent hours tuning hyperparameters and reward functions to get the car to do something I could describe in one sentence: "stay on the track." With Strands Robots, that one sentence is the instruction. The gap between human intent and machine action has collapsed dramatically.

The Architecture: System 1 Meets System 2

Here's what's actually clever about how this works — and this is the part that took me a minute to appreciate.

The system uses NVIDIA GR00T, a Vision-Language-Action (VLA) model, for the low-level physical control. GR00T takes camera images, robot joint positions, and language instructions as input, and directly outputs target joint positions. It runs on NVIDIA Jetson edge hardware — meaning the millisecond-level physical control happens at the edge, not in the cloud.

But when the robot encounters something that requires deeper reasoning — multi-step planning, historical pattern matching, anything that needs more than fast reflexes — it delegates to cloud-based LLMs like Amazon Bedrock models.

This is a dual-system architecture that maps beautifully to Kahneman's System 1 and System 2 thinking:

System 1 (GR00T VLA): Fast, automatic, sensorimotor. 40–160 ms inference latency. Handles the "just pick up the block" part.
System 2 (Strands Agent / Claude): Slow, deliberate, reasoning. Handles "wait, the block is behind the cup, I need to move the cup first."
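The split above can be sketched as a toy control loop. This is not the Strands Robots or GR00T API: ToyArm, fast_policy, slow_planner, and run are hypothetical stand-ins I made up to show the delegation pattern, with a fast path tried first and a slow planner called only when the fast path fails.

```python
from dataclasses import dataclass, field


@dataclass
class ToyArm:
    """Stand-in world: a block that may be hidden behind a cup."""
    block_blocked: bool = True
    holding_block: bool = False
    log: list = field(default_factory=list)


def fast_policy(arm, instruction):
    """System 1 stand-in: one quick control attempt; True on success."""
    arm.log.append(f"fast: {instruction}")
    if instruction == "move the cup":
        arm.block_blocked = False
        return True
    if instruction == "pick up the block" and not arm.block_blocked:
        arm.holding_block = True
        return True
    return False  # reflexes alone cannot solve this step


def slow_planner(arm, goal):
    """System 2 stand-in: a real system would call an LLM here."""
    arm.log.append(f"slow: replanning '{goal}'")
    return ["move the cup", goal] if arm.block_blocked else [goal]


def run(arm, goal, max_replans=3):
    """Try the fast path first; delegate to the planner only on failure."""
    plan, replans = [goal], 0
    while plan:
        step = plan.pop(0)
        if fast_policy(arm, step):
            continue
        if replans == max_replans:
            return False  # give up instead of looping forever
        replans += 1
        plan = slow_planner(arm, step) + plan
    return arm.holding_block


arm = ToyArm()
succeeded = run(arm, "pick up the block")
```

In this toy run the planner is called once, to insert "move the cup" ahead of the original goal. If the block is not blocked, the planner is never called at all.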

I've been running workshops across the APJC region on building AI agents with Bedrock and Strands, and I've never had a cleaner real-world analogy than this for explaining agent architectures to developers.

Supported Hardware

The supported hardware list is already impressive:

SO-100/SO-101 desktop arms
Fourier GR-1 humanoid arms
Bimanual Panda
Unitree G1

It integrates with Hugging Face's LeRobot for hardware abstraction, which means you're not locked into one vendor's ecosystem.

Getting Started

Prerequisites:

Python 3.12+
NVIDIA Jetson device (for edge inference)
SO-101 robotic arm (or other supported hardware)
AWS account with Bedrock access

Installation:

pip install strands-robots

The quick start guide walks you through camera setup, GR00T inference service initialization, and your first natural language robot control.

Project 2: Strands Robots Sim — Fail Fast, Fail Safely

Here's a problem I know intimately from my DeepRacer days: hardware is unforgiving. Every time I wanted to test a new reward function, I had to wait for a training job to complete, deploy to the car, and physically watch it run. If something went wrong — and it often did — I'd pick up the car, reset it on the track, and start again.

Strands Robots Sim solves this by giving you a full 3D physics-enabled simulation environment. You get:

Libero benchmark environments (90+ tasks covering spatial reasoning, object manipulation, goal-conditioned tasks)
GR00T policy integration via ZMQ
MP4 video recording of episodes
Two execution modes for different use cases

Two Execution Modes

SimEnv Mode is the "fire and forget" approach. You give the agent a task, it runs to completion, you get the final result. Great for benchmarking and well-defined tasks.

SteppedSimEnv Mode is where it gets interesting for research. The agent observes the simulation every N steps, sees camera feeds, and can adapt its instructions based on what it sees. It's slower, but it enables something powerful: visual grounding with error recovery.

from strands import Agent
from strands_robots_sim import SteppedSimEnv, gr00t_inference

stepped_sim = SteppedSimEnv(
    tool_name="my_stepped_sim",
    env_type="libero",
    task_suite="libero_10",
    data_config="libero_10",
    steps_per_call=10,
    max_steps_per_episode=500
)

agent = Agent(
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    tools=[stepped_sim, gr00t_inference]
)

The agent can now observe camera images after every 10 steps, decide whether the task is progressing correctly, and issue new instructions if something went wrong. This is hierarchical planning in action — and it's the kind of thing that used to require a PhD thesis to implement.
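Here is a rough sketch of that observe-every-N-steps loop, under my own assumptions about how it is wired. The steps_per_call and max_steps_per_episode names mirror the configuration above; run_stepped_episode, policy_step, and supervisor are hypothetical stand-ins, not functions from strands_robots_sim.

```python
def run_stepped_episode(policy_step, supervisor, instruction,
                        steps_per_call=10, max_steps_per_episode=500):
    """Run a policy in chunks and let a supervisor inspect each chunk.

    policy_step(instruction) -> observation        (one simulation step)
    supervisor(observation, instruction) -> (done, next_instruction)
    """
    steps_used, supervisor_calls = 0, 0
    while steps_used < max_steps_per_episode:
        chunk = min(steps_per_call, max_steps_per_episode - steps_used)
        observation = None
        for _ in range(chunk):
            observation = policy_step(instruction)
        steps_used += chunk
        supervisor_calls += 1
        done, instruction = supervisor(observation, instruction)
        if done:
            return {"success": True, "steps": steps_used, "calls": supervisor_calls}
    return {"success": False, "steps": steps_used, "calls": supervisor_calls}


# Toy world: a gripper on a line that must reach x >= 25.
state = {"x": 0}


def policy_step(instruction):
    state["x"] += 1 if instruction == "move right" else -1
    return state["x"]


def supervisor(x, instruction):
    if x >= 25:
        return True, instruction
    # Error recovery: whatever happened, steer the gripper toward the goal.
    return False, "move right"


result = run_stepped_episode(policy_step, supervisor, "move left")
print(result)  # {'success': True, 'steps': 50, 'calls': 5}
```

The episode starts with a wrong instruction, the supervisor spots the drift after the first 10 steps, and the corrected instruction finishes the task four chunks later.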

Performance Transparency

One thing I appreciate is the team's transparency: they've published the performance overhead numbers. SimEnv adds roughly 5–8 seconds of overhead (mostly LLM call latency). SteppedSimEnv overhead scales with the number of iterations, at roughly 3–5 seconds per LLM call. These aren't hidden costs; they're documented, and the team gives you optimization tips.
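As a back-of-the-envelope check, those per-call numbers can be turned into a per-episode estimate. The sketch below assumes one LLM call per steps_per_call chunk and an episode that runs to its step limit; both are my assumptions for illustration, and stepped_overhead_seconds is a hypothetical helper, not part of the library.

```python
import math


def stepped_overhead_seconds(episode_steps, steps_per_call,
                             per_call_range=(3.0, 5.0)):
    """Return (low, high) LLM overhead in seconds for one stepped episode."""
    calls = math.ceil(episode_steps / steps_per_call)
    low, high = per_call_range
    return calls * low, calls * high


low, high = stepped_overhead_seconds(500, 10)
print(f"{low:.0f} to {high:.0f} seconds")  # 150 to 250 seconds
```

With the configuration shown earlier (500 steps, 10 per call) that is 50 calls, so raising steps_per_call to 25 would cut the estimate to 60 to 100 seconds at the cost of less frequent supervision.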

Extensibility

The architecture is designed for extensibility. There's a Policy abstract base class, and while only GR00T and a mock policy are implemented today, the framework is explicitly designed to support ACT, SmolVLA, and custom VLA providers. The community is being invited to build here.
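To give a feel for what plugging in a custom policy might look like, here is a minimal sketch of the pattern. The method name get_action, the observation layout, and the NudgePolicy class are all hypothetical; check the repo for the actual Policy interface before building on this.

```python
from abc import ABC, abstractmethod


class Policy(ABC):
    """Hypothetical interface: observation and instruction in, joint targets out."""

    @abstractmethod
    def get_action(self, observation: dict, instruction: str) -> list[float]:
        ...


class MockPolicy(Policy):
    """Returns the current joint positions unchanged, so nothing moves."""

    def get_action(self, observation, instruction):
        return list(observation["joint_positions"])


class NudgePolicy(Policy):
    """Custom example: move every joint at most `step` toward zero."""

    def __init__(self, step=0.1):
        self.step = step

    def get_action(self, observation, instruction):
        return [
            j - max(-self.step, min(self.step, j))
            for j in observation["joint_positions"]
        ]


def rollout(policy: Policy, joints, instruction, n_steps):
    """Feed each action back in as the next observation."""
    for _ in range(n_steps):
        joints = policy.get_action({"joint_positions": joints}, instruction)
    return joints


print(rollout(MockPolicy(), [0.25, -0.05], "home", 3))   # [0.25, -0.05]
print(rollout(NudgePolicy(), [0.25, -0.05], "home", 3))  # [0.0, 0.0]
```

Because the rollout only depends on the base class, swapping a mock for a learned policy is a one-line change, which is the property that makes a mock-first workflow practical.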

Getting Started

Prerequisites:

Python 3.12+
Docker (for Isaac-GR00T container)
No physical hardware required

Installation:

pip install "strands-robots[sim]"

Quick Start:

python examples/libero_example.py

Start with the mock policy first (no dependencies, no Docker required). Watch the agent complete a task in simulation. Then swap in GR00T when you're ready to go deeper.

What's Next

In Part 2, I'll cover:

AI Functions: How to write Python functions by describing them in natural language with runtime validation
Practical implementation strategies for all three Strands Labs projects
Community contribution opportunities
My recommendations for getting started based on your experience level

Take Action Now

Ready to experiment? Here's what to do next:

No hardware? Start with strands-labs/robots-sim — clone the repo and run python examples/libero_example.py with the mock policy
Have a robotic arm? Jump straight to strands-labs/robots and follow the quick start guide
Want to contribute? All repos are Apache-2.0 licensed and accepting issues and PRs

About the Author

Vishal is an AWS Developer Advocate based in the APJC region, where he empowers developers through hands-on workshops, technical content creation, and speaking engagements. He helps developers build AI agents with Amazon Bedrock and Strands, while actively contributing to developer communities through conferences, meetups and technical sessions across the region. When he's not crashing DeepRacer cars into furniture, he's exploring innovative applications of AI in cloud security, DevOps and robotics.

Stay tuned for Part 2, where we'll dive into AI Functions and practical implementation strategies!

Disclaimer: All thoughts and opinions expressed in this blog are my own and do not represent the views of AWS or Amazon.