A Brief History of Embodied Intelligence
From Da Vinci’s Mechanical Knight to Optimus
Preface
We live in an age of artificial intelligence miracles. Machines can now write poetry, generate photorealistic images, hold conversations that feel genuinely human, and pass professional exams that stump most people. Every few months brings another headline about AI conquering some cognitive task we thought was uniquely human.
But ask that same AI to pick up a coffee cup, and it’s helpless.
The gap between what machines can think and what they can do has never been wider. A large language model can explain the physics of grasping in exquisite detail—the friction coefficients, the force vectors, the optimal grip positions. It just can’t actually grasp anything.
This book is about closing that gap. It’s about how intelligence gets a body.
The contrast isn’t new. In the spring of 1997, a computer defeated the world chess champion for the first time. Deep Blue’s victory over Garry Kasparov made headlines around the world. Commentators declared a new era: machines had conquered the pinnacle of human intellect.
That same year, in a laboratory outside Tokyo, a team of Honda engineers celebrated a different milestone. After eleven years of work, their robot could finally walk across a room without falling down.
One achievement made global news around the world. The other took over a decade and most people never heard about it. Yet ask any roboticist which was harder, and they’ll tell you without hesitation: walking.
This is the central paradox of embodied intelligence.
The roboticist Hans Moravec noticed something strange in the 1980s. The tasks we consider intellectually demanding—chess, mathematics, logic—turned out to be relatively easy for computers. Meanwhile, the tasks we consider so simple we don’t even think about them—walking, picking things up, folding a shirt—turned out to be fiendishly difficult.
He called this Moravec’s Paradox, and it reveals something profound about the nature of intelligence.
Evolution spent hundreds of millions of years optimizing our sensory and motor systems. The neural circuits that let you catch a ball or walk across uneven ground are the product of relentless selection pressure across countless generations. They work so well that we don’t notice them working at all.
Abstract reasoning, by contrast, is an evolutionary afterthought. We’ve only been doing mathematics for a few thousand years. Our brains aren’t particularly optimized for it, which is why we find it hard—and why it was relatively easy to replicate in silicon.
The things that feel effortless to us are actually the hardest problems in the universe. We just don’t realize it because evolution already solved them.
For fifty years, the robotics industry worked around this problem rather than solving it. Industrial robots—the kind that build cars and assemble electronics—operate in carefully controlled environments where they never have to adapt to the unexpected. They’re blind, deaf, and spectacularly good at repeating the same motion ten thousand times with micrometer precision.
This was enough to transform manufacturing. It wasn’t enough to make robots useful almost anywhere else.
The world outside the factory is messy, unpredictable, and constantly changing. A home robot can’t assume the coffee cup is always in the same place. A delivery robot can’t assume the sidewalk is clear. A surgical robot can’t assume the patient’s anatomy matches the textbook.
To operate in the real world, robots need to perceive their environment, understand what they’re seeing, make decisions about what to do, and execute those decisions with their bodies—all in real time, all while things change around them. This perception-decision-action loop is what embodied intelligence requires, and each step is extraordinarily difficult.
Until recently, we couldn’t do it.
Three things changed.
First, deep learning revolutionized robot perception. Before 2012, making a robot recognize objects required painstaking hand-crafted rules for every possible thing it might encounter. After 2012, we could train neural networks on millions of images and let them figure out what mattered. Robots could finally see.
Second, reinforcement learning and simulation transformed robot learning. Training a robot in the real world is slow, expensive, and dangerous—every failure risks breaking something. But training in simulation is fast, cheap, and safe. The same robot can fail a million times in a simulated world, and all that virtual failure teaches it how to succeed in the real one.
Third—and this is the most recent development—large language models gave robots something they’d always lacked: common sense.
A robot can now understand “bring me something to help me wake up” and reach for a can of Red Bull, even though no one ever explicitly taught it that energy drinks help people wake up. It learned that association from the vast corpus of human text, and it can apply that knowledge to guide its actions in the physical world.
For the first time, robots can understand what we mean, not just what we say.
This book tells the story of how we got here—and where we might be going.
Part One covers the long prelude: five hundred years of humans trying to make machines move, from Leonardo da Vinci’s mechanical knight to the industrial robots that still dominate factory floors. We’ll see why these early approaches succeeded in controlled environments but couldn’t handle the chaos of the real world.
Part Two examines the three pillars that made modern robots possible: the revolution in machine vision, the breakthroughs in robot learning, and the quiet improvements in hardware that made capable machines affordable. Each of these stories involves decades of patient work, sudden breakthroughs, and the gradual accumulation of capability.
Part Three—the heart of the book—focuses on the present moment: the explosion of progress since 2020 that has made general-purpose robots seem suddenly achievable. We’ll meet the researchers at Google who first connected language models to robot arms. We’ll follow the startups racing to build humanoid robots, each betting on a different path to the prize. We’ll examine Tesla’s audacious entry into the field and the Chinese companies that may upend everyone’s assumptions.
Part Four takes an honest look at where we actually are—what robots can do today, what they can’t, and what might change that. We’ll examine the bottlenecks that still constrain progress and the possible breakthroughs that might remove them. And we’ll consider what it would mean for the world if we succeed.
A note on what this book is not.
It’s not a technical manual. I won’t be explaining the mathematics of reinforcement learning or the architecture of transformer networks. There are excellent resources for readers who want that depth, but my goal is different: to help you understand why things happened the way they did, and what it means.
It’s not a comprehensive encyclopedia. The history of robotics is vast, and I’ve had to make choices about what to include and what to leave out. My principle has been to focus on the developments that explain the present moment—the threads that, woven together, show how we arrived at this particular point in time.
It’s not a prediction. I’ll share my sense of where things are heading, but I hold those views with appropriate humility. The history of technology forecasting is littered with confident predictions that look absurd in retrospect. I’ll try to give you the tools to make your own judgments rather than asking you to trust mine.
What I hope to offer is a framework for understanding—a way to make sense of the headlines about robot breakthroughs, to distinguish genuine progress from hype, and to think clearly about what’s coming.
I should also explain my perspective.
I’ve spent my career at the intersection of artificial intelligence and robotics. This book is a companion to my other book, A Brief History of Artificial Intelligence, which tells the story of AI itself—how we taught machines to think. A Brief History of Embodied Intelligence picks up where that story meets the physical world: how we’re teaching machines to act.
The two books can be read independently, but they illuminate each other. The breakthroughs in AI that enabled ChatGPT are the same breakthroughs now enabling robots to understand our intentions. The companies building large language models are the same companies racing to build capable robots. The questions about what AI means for humanity become even more pressing when that AI can reach out and touch the world.
I come to this subject as neither a pure optimist nor a pure pessimist. I believe we’re witnessing something genuinely important—a technological transformation that will reshape work, the economy, and daily life in ways we’re only beginning to understand. I also believe we should approach it with clear eyes, neither dismissing the challenges nor succumbing to unfounded hype.
The stakes are too high for either complacency or panic. What we need is understanding.
One final thought before we begin.
In 1495, Leonardo da Vinci sketched designs for a mechanical knight—a suit of armor animated by pulleys and gears, capable of sitting, standing, raising its visor, and moving its arms. We don’t know if he ever built it. The sketches survived scattered across his notebooks, not fully understood until modern engineers reconstructed them five centuries later.
Leonardo was trying to solve the same problem we’re still working on: how to make a machine that moves like a living thing.
Five hundred years of progress separate his sketches from today’s humanoid robots. Those centuries saw the industrial revolution, the computer age, the AI era, and now what might become the age of embodied intelligence. At each step, we thought we were close to the goal. At each step, we discovered the goal was further than we imagined.
We may be close now. We may discover again that the goal recedes as we approach. Either way, the journey is worth understanding.
Let’s begin.
The Journey Ahead
The book is structured into four parts and twelve chapters.
Part I: The Long Prelude
Chapter 1: The Mechanical Dream
From Leonardo’s sketches to Japan’s robot factories—why industrial robots succeeded by avoiding intelligence altogether.
Chapter 2: Why Walking Is So Hard
Moravec’s Paradox explained. Why Deep Blue could beat Kasparov but couldn’t cross the room. The rise and commercial struggles of Boston Dynamics.
Part II: The Three Pillars
Chapter 3: Eyes—The Robot’s ImageNet Moment
How deep learning solved machine vision, and why seeing isn’t the same as understanding.
Chapter 4: Bodies—The Quiet Progress of Hardware
Motors, sensors, batteries: the unglamorous breakthroughs that made capable robots affordable.
Chapter 5: Brains—From Reinforcement Learning to World Models
Teaching robots through trial and error, the simulator revolution, and the beginnings of robot “imagination.”
Part III: The Era of Large Models
Chapter 6 is coming soon.


