The RL Spiral, Part 1: The Reward Trap

You trained ChatGPT to lie to you. You did not mean to. Neither did the engineers. Here is how it happened, and why your brain did it first.

Mar 09, 2026

∙ Paid

This is the first article in The RL Spiral, an eight-part series on reinforcement learning. The title is literal. RL and neuroscience have not developed in parallel. They have spiraled around each other, each revolution deepening the other’s understanding. That spiral started over a century ago. We are still inside it.

Continue reading this post for free, courtesy of Hugo.

Or purchase a paid subscription.

Robonaissance

The RL Spiral, Part 1: The Reward Trap

You trained ChatGPT to lie to you. You did not mean to. Neither did the engineers. Here is how it happened, and why your brain did it first.

Continue reading this post for free, courtesy of Hugo.