Series: The Learn Arc – 50 posts through the Active Inference workbench.
Previous: Part 41 – Session §8.2: Eq 4.19, the quadratic free energy
Hero line. In continuous time, action is the other way to drive F down. Perception changes the belief; action changes the world so the sensors finally agree with the prediction. Same equation, different variable.
The other gradient
Session 8.2 derived Eq 4.19 and then ran ∂F/∂μ to update the belief. Session 8.3 runs ∂F/∂a, the gradient with respect to action, and shows that motor control falls out without adding any new machinery.
Perception minimises F by updating μ. Action minimises F by moving the sensors themselves. The agent has two handles on the same quantity.
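The two handles can be made concrete with a toy scalar version of the quadratic free energy. This is a minimal sketch in Python (not the workbench's Elixir, and not its API): it assumes the identity sensory mapping g(μ) = μ and that action moves the sensed state directly, ∂o/∂a = 1. All function names here are invented for illustration.

```python
def free_energy(o, mu, prior, pi_s=1.0, pi_d=1.0):
    """Scalar stand-in for the quadratic free energy of Eq 4.19:
    precision-weighted sensory error plus precision-weighted prior error,
    with g(mu) = mu for simplicity."""
    return 0.5 * pi_s * (o - mu) ** 2 + 0.5 * pi_d * (mu - prior) ** 2

def dF_dmu(o, mu, prior, pi_s=1.0, pi_d=1.0):
    # Perception's gradient: both terms mention mu.
    return -pi_s * (o - mu) + pi_d * (mu - prior)

def dF_da(o, mu, pi_s=1.0):
    # Action's gradient: F touches action only through o,
    # so dF/da = dF/do * do/da, with do/da = 1 assumed here.
    return pi_s * (o - mu)

mu, o, prior, dt = 0.0, 2.0, 0.0, 0.1
for _ in range(200):
    mu -= dt * dF_dmu(o, mu, prior)  # perception: belief chases sensation
    o  -= dt * dF_da(o, mu)          # action: sensation chases belief
# Both gradients pull F down; mu and o converge together toward the prior.
```

Comment out either update line and F plateaus at a nonzero floor; run both and it keeps falling. Same quantity, two handles.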
Five beats
- Action enters through the sensors. The agent's predictions live in g(μ). Observations arrive as o, so the sensory error is o − g(μ). Action changes o indirectly, through the body and the world, so the agent can close the loop either by updating μ or by changing what the sensors report.
- The gradient ∂F/∂a only "sees" the sensors. Because F depends on action only through the sensory term, the action policy is driven entirely by sensory prediction error. Elegant, and controversial.
- Reflexes are action minimising F. A knee-jerk is "sensor disagrees with expected posture → muscle contracts to restore it." In Eq 4.19 language, that is pure ∂F/∂a descent with high sensory precision at the proprioceptive channels.
- Goal-directed action = set the prior, let action chase it. Instead of rewards, the agent sets C (preferences) on the expected sensory trajectory. Action then drives the sensors to match C. Desire is just a prior the sensors have not yet caught up to.
- Precision weighting selects what action does. Whichever sensor channel has the highest precision determines what action will chase. Biology calls this "attention." The workbench exposes it as a slider.
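The last two beats can be checked numerically. A hedged Python sketch, under toy assumptions (scalar world, g(μ) = μ, ∂o/∂a = 1; `simulate` and its parameters are invented for illustration, not the workbench's interface): the preference C enters as the prior, and the sensory precision pi_s decides how hard action chases it.

```python
def simulate(C, pi_s, pi_d, steps=400, dt=0.05):
    """Goal-directed toy: C is the preferred sensory value, set as the prior.
    Returns the final belief mu and final sensed state o."""
    mu, o = 0.0, 0.0
    for _ in range(steps):
        d_mu = pi_s * (o - mu) - pi_d * (mu - C)  # -dF/dmu
        d_o  = -pi_s * (o - mu)                   # -dF/da, with do/da = 1
        mu += dt * d_mu
        o  += dt * d_o
    return mu, o

# High sensory precision: action drags the sensors all the way to C.
mu_hi, o_hi = simulate(C=1.5, pi_s=1.0, pi_d=1.0)

# Low sensory precision: same preference, but action barely moves the world;
# the belief settles on C while the sensors stay behind.
mu_lo, o_lo = simulate(C=1.5, pi_s=0.01, pi_d=1.0)
```

With pi_s low, the agent "wants" C but never ships it to the world; raise pi_s and the sensors follow. That is the slider.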
Why it matters
This is the moment the framework stops needing a separate planner. In discrete time you enumerated policies and scored them with EFE. In continuous time you just follow −∂F/∂a. There is no separate "control" step: there is only free energy, flowing down two gradients at once.
Quiz
- Why does ∂F/∂a contain only the sensory term, not the prior or dynamical terms?
- What happens when sensory precision is very low compared to dynamical precision?
- How does setting C to a non-zero expected trajectory produce goal-directed behavior?
Run it yourself
mix phx.server
# open http://localhost:4000/learn/session/8/s3_action_on_sensors
Cookbook recipe: continuous/action-gradient – a continuous-time agent that tracks a moving target. Toggle between "perception only" (freeze action) and "action only" (freeze belief) to see that each gradient alone fails; together they close the loop.
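Outside the workbench, the same toggle can be sketched in a few lines of Python (scalar world, g(μ) = μ, ∂o/∂a = 1; the `run` helper and its flags are invented for illustration, not the recipe's actual interface). Belief and sensor start apart, with a preference C between them:

```python
def run(update_mu, update_o, C=1.5, pi_s=1.0, pi_d=1.0, steps=400, dt=0.05):
    """Belief starts at 0, sensed state at 2, preference C = 1.5.
    Either gradient can be frozen, mirroring the recipe's toggles."""
    mu, o = 0.0, 2.0
    for _ in range(steps):
        if update_mu:  # perception gradient: -dF/dmu
            mu += dt * (pi_s * (o - mu) - pi_d * (mu - C))
        if update_o:   # action gradient: -dF/da, with do/da = 1
            o += dt * pi_s * (mu - o)
    return o

o_perception_only = run(True, False)   # action frozen: o never moves at all
o_action_only     = run(False, True)   # belief frozen at 0: action chases 0, not C
o_closed_loop     = run(True, True)    # both gradients: o settles at C
```

Perception alone leaves the world untouched; action alone chases a belief that never learned the preference. Only the closed loop lands the sensors on C.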
Next
Part 43: Session §8.4 – Continuous play. The final Chapter 8 session. A free-form continuous-time playground: change the precisions on the fly, poke the agent, swap the world dynamics, watch the belief retune. The session that builds physical intuition for every knob you just met.
Powered by The ORCHESTRATE Active Inference Learning Workbench – Phoenix/LiveView on pure Jido.








