For decades, every RL system needed a human to tell it what to look at. Then one opened its own eyes. The headlines went to the neural network. The less celebrated ingredient came from a sleeping rat.
This is the fourth article in The RL Spiral, an eight-part series on reinforcement learning. The previous article, The Curse Bellman Couldn’t Break, explained why RL needs so much experience. This one is about the moment that changed.