A Brief History of Embodied Intelligence
From Da Vinci’s Mechanical Knight to Optimus
“What is the boundary between the living and the mechanical? Where does the machine end and the human begin?”
We live in an age of artificial intelligence miracles. Machines can now write poetry, generate photorealistic images, hold conversations that feel genuinely human, and pass professional exams that stump most people. Every few months brings another headline about AI conquering some cognitive task we thought was uniquely human.
But ask that same AI to pick up a coffee cup, and it’s helpless.
The gap between what machines can think and what they can do has never been wider. A large language model can explain the physics of grasping in exquisite detail: the friction coefficients, the force vectors, the optimal grip positions. It just can’t actually grasp anything.
This book is about closing that gap. It’s about how intelligence gets a body.
The contrast isn’t new. In the spring of 1997, a computer defeated the reigning world chess champion in a full match for the first time. Deep Blue’s victory over Garry Kasparov made headlines around the world. Commentators declared a new era: machines had conquered the pinnacle of human intellect.
That same year, in a laboratory outside Tokyo, a team of Honda engineers celebrated a different milestone. After eleven years of work, their robot could finally walk across a room without falling down.
One achievement took a few years and made global news. The other took over a decade, and most people never heard about it. Yet ask any roboticist which was harder, and they’ll tell you without hesitation: walking.
This is the central paradox of embodied intelligence. The roboticist Hans Moravec noticed it in the 1980s: the tasks we consider intellectually demanding (chess, mathematics, logic) turned out to be relatively easy for computers. Meanwhile, the tasks we consider so simple we don’t even think about them (walking, picking things up, folding a shirt) turned out to be fiendishly difficult.
Why? Evolution spent hundreds of millions of years optimizing our sensory and motor systems. The neural circuits that let you catch a ball or walk across uneven ground are the product of relentless selection pressure across countless generations. They work so well that we don’t notice them working at all. Abstract reasoning, by contrast, is an evolutionary afterthought, a few thousand years old, barely optimized, which is why we find it hard and why it was relatively easy to replicate in silicon.
The things that feel effortless to us are actually the hardest problems in engineering. We just don’t realize it because evolution already solved them.
Moravec’s Paradox runs through every chapter of this book. It explains why industrial robots conquered factories but couldn’t cross the street. It explains why the brightest minds in computer science spent decades failing to make a robot fold a towel. And it explains why the recent breakthroughs feel so momentous: for the first time, we may be learning to do what evolution did, giving intelligence a body that works.
For fifty years, the robotics industry worked around the paradox rather than solving it. Industrial robots operate in carefully controlled environments where they never have to adapt to the unexpected. They’re blind, deaf, and spectacularly good at repeating the same motion ten thousand times with micrometer precision.
This was enough to transform manufacturing. It wasn’t enough to make robots useful almost anywhere else.
The world outside the factory is messy, unpredictable, and constantly changing. A home robot can’t assume the coffee cup is always in the same place. A delivery robot can’t assume the sidewalk is clear. A surgical robot can’t assume the patient’s anatomy matches the textbook.
To operate in the real world, a robot needs to do three things in a continuous loop: perceive its environment, decide what to do, and act on that decision. All in real time. All while the world changes around it. This perception-decision-action loop is the heartbeat of embodied intelligence, and each step is extraordinarily difficult. Making a robot see took decades. Making it decide wisely is still unsolved in general. Making it act with the dexterity of a human hand remains one of the great open challenges in engineering.
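For readers who think in code, the loop described above can be sketched in a few lines of Python. This is purely an illustration of the structure, not any real robot’s control software; every function and variable here is a hypothetical stand-in, and the “world” is reduced to a single number.

```python
# Illustrative sketch of the perception-decision-action loop.
# All names here are hypothetical stand-ins, not a real robot API.

def perceive(world):
    """Sense the environment: return an observation of the current state."""
    return dict(world)  # in a real robot: camera frames, joint angles, etc.

def decide(observation, goal):
    """Choose an action that moves the observed state toward the goal."""
    if observation["position"] < goal:
        return +1   # step forward
    elif observation["position"] > goal:
        return -1   # step back
    return 0        # hold still

def act(world, action):
    """Apply the chosen action to the world."""
    world["position"] += action

def run_loop(world, goal, steps=10):
    """Close the loop: perceive, decide, act, repeat."""
    for _ in range(steps):
        obs = perceive(world)
        action = decide(obs, goal)
        act(world, action)
    return world["position"]
```

The structure is trivial; the difficulty lives inside the functions. In a real robot, `perceive` is decades of computer vision, `decide` is still an open research problem, and `act` collides with the physics of motors, friction, and fragile objects, all while the loop must run many times per second.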
The story of this book is the story of that loop: how we learned to close it, piece by piece, and why closing it completely remains so hard.
Three convergences made the current moment possible.
First, deep learning revolutionized robot perception. Before 2012, making a robot recognize objects required painstaking hand-crafted rules for every possible thing it might encounter. After 2012, neural networks trained on millions of images could learn what mattered on their own. Robots could finally see.
Second, reinforcement learning and simulation transformed how robots learn to act. Training a robot in the real world is slow, expensive, and dangerous. Every failure risks breaking something. But training in simulation is fast, cheap, and safe. A robot can fail a million times in virtual reality, and all that virtual failure teaches it how to succeed in the physical world.
Third, and this is the most recent development, large language models gave robots something they’d always lacked: common sense. A robot can now understand “bring me something to help me wake up” and reach for a can of Red Bull, even though no one ever programmed that association. It learned from the vast corpus of human text, and it can apply that knowledge to guide its actions in the physical world.
For the first time, robots can understand what we mean, not just what we say. The perception-decision-action loop is beginning to close.
But “beginning to close” is not the same as “closed.”
In early 2026, more than fifty companies worldwide are racing to build humanoid robots. Billions of dollars have been invested. Demo videos show machines that walk, talk, cook, and dance. Tesla, Google, Amazon, and dozens of startups are competing for what may be the largest market in the history of technology.
And yet: fewer than ten of those companies have placed a robot into a real commercial environment where it performed useful work. The most celebrated deployment in the industry, a robot moving plastic totes in a Georgia warehouse, handled roughly 100,000 totes in eighteen months. A single experienced human worker could do that in a few weeks.
The gap between what robots can demonstrate and what they can deliver is the defining feature of this moment. Understanding why the gap exists, and whether it will close, requires understanding the full story, from the first mechanical automata to the foundation models being trained right now.
That’s what this book is for.
This book tells the story of how we got here, and what’s at stake.
Part One covers the long prelude: five hundred years of humans trying to make machines move, from Leonardo da Vinci’s mechanical knight to the industrial robots that still dominate factory floors. We’ll see why these early approaches succeeded in controlled environments but couldn’t handle the chaos of the real world, and why walking, the thing a toddler masters in months, defeated the best engineers for decades.
Part Two examines the three pillars that made modern robots possible: the revolution in machine vision that let robots see, the breakthroughs in learning that let robots decide and act, and the quiet improvements in hardware that made capable machines affordable. Each of these stories involves decades of patient work, sudden breakthroughs, and a co-evolution between hardware and software that neither side could accomplish alone.
Part Three focuses on the explosion since 2020. We’ll meet the researchers at Google who first connected language models to robot arms and the startups racing to build humanoid robots, each betting billions on a different path. We’ll examine Tesla’s manufacturing thesis, and we’ll follow the Chinese companies that may upend everyone’s assumptions with speed and cost advantages that echo the electric vehicle revolution.
Part Four takes an honest look at where we actually are. What can robots do today, and what remains beyond their reach? Who controls the emerging physical AI infrastructure? And the question the entire book has been building toward: when machines can do what we do, what is left that is distinctly, irreducibly human?
A note on what this book is not.
It’s not a technical manual. I won’t be explaining the mathematics of reinforcement learning or the architecture of transformer networks. My goal is different: to help you understand why things happened the way they did, and what it means.
It’s not a comprehensive encyclopedia. The history of robotics is vast, and I’ve chosen to focus on the developments that explain the present moment, the threads that, woven together, show how we arrived at this particular point.
And it’s not a prediction. The history of technology forecasting is littered with confident claims that look absurd in retrospect. I’ll try to give you the tools to make your own judgments rather than asking you to trust mine.
What I hope to offer is a framework for understanding, a way to make sense of the headlines, to distinguish progress from hype, and to think clearly about what’s coming.
I should also explain my perspective.
This book is the companion volume to A Brief History of Artificial Intelligence, which tells the story of how we taught machines to think. Embodied Intelligence picks up where that story meets the physical world: how we’re teaching machines to act.
The two books can be read independently, but they illuminate each other. The breakthroughs in AI that enabled ChatGPT are the same breakthroughs now enabling robots to understand our intentions. The companies building large language models are the same companies racing to build capable robots. The questions about what AI means for humanity become more pressing when that AI can reach out and touch the world.
I come to this subject as neither a pure optimist nor a pessimist. I believe we’re witnessing something genuinely important, a technological transformation that will reshape work, economy, and daily life in ways we’re only beginning to understand. I also believe we should approach it with clear eyes, neither dismissing the challenges nor succumbing to unfounded hype.
The stakes are too high for either complacency or panic. What we need is understanding.
One final thought before we begin.
In 1495, Leonardo da Vinci sketched designs for a mechanical knight, a suit of armor animated by pulleys and gears, capable of sitting, standing, raising its visor, and moving its arms. We don’t know if he ever built it. The sketches survived scattered across his notebooks, not fully understood until modern engineers reconstructed them five centuries later.
Leonardo was also an anatomist. He dissected human cadavers to understand how muscles attached to bone, how tendons transmitted force, how the hand could grip and release. Then he turned around and designed machines that mimicked the motion. He was asking the question that this book is still asking: what is the boundary between the living and the mechanical? Where does the machine end and the human begin?
Five hundred years of progress separate his sketches from today’s humanoid robots. Those centuries saw the industrial revolution, the computer age, the AI era, and now what might become the age of embodied intelligence. At each step, we thought we were close to the goal. At each step, we discovered the goal was further than we imagined.
We may be close now. We may discover again that the goal recedes as we approach. Either way, the journey is worth understanding, and the question of what remains, when the machines can do the doing, is worth asking.
Let’s begin.
Reading Guide
If you’re new to robotics and AI: Start at the beginning and read straight through. Part One gives you historical context. Part Two builds your intuition for the key technologies. Part Three brings you to the present moment, and Part Four helps you think about what comes next.
If you have a background in AI but not robotics: You might skim Chapter 1 (the historical overview) and Chapter 3 (machine vision, which will be familiar territory). Pay close attention to Chapter 2 (why walking is so hard), Chapter 4 (hardware), and Chapter 5 (robot learning and simulation). These cover the robotics-specific challenges that distinguish embodied intelligence from the AI you know.
If you work in robotics: You’ll find the historical chapters useful for context you may not have encountered, and the company chapters (6–9) offer a synthesis of the competitive landscape. Part Four’s analysis of deployment reality and the question of what embodied machines mean for human identity might challenge or confirm your existing views.
If you’re an investor or business reader: Start with Chapter 7 (the startup landscape), then read Chapters 8–9 (Tesla and China) for competitive analysis. Go back to Part Two for technical context as needed. Part Four, especially Chapter 10 on the gap between demos and deployment, and Chapter 11 on the race for physical intelligence, is where the framework for evaluating claims and timelines lives.
If you’re short on time: Read this preface, then Chapters 2 (Moravec’s Paradox), 6 (the ChatGPT moment for robotics), and 12 (what remains). These three chapters capture the core argument: why embodied intelligence is hard, what changed, and what it means.
The Journey Ahead
The book is organized into four parts and twelve chapters.
Part I: The Long Prelude
Chapter 1: The Mechanical Dream
From Leonardo’s sketches to Japan’s robot factories: why industrial robots succeeded by avoiding intelligence altogether.
Chapter 2: Why Walking Is So Hard
Moravec’s Paradox explained. Why Deep Blue could beat Kasparov but couldn’t cross the room. The rise and commercial struggles of Boston Dynamics.
Part II: The Three Pillars
Chapter 3: Eyes
The robot’s ImageNet moment, and why seeing isn’t the same as understanding.
Chapter 4: Bodies
Motors, sensors, batteries: the unglamorous breakthroughs that made capable robots affordable.
Chapter 5: Brains
Teaching robots through trial and error, the simulator revolution, and the beginnings of robot “imagination.”
Part III: The Era of Large Models
Chapter 6: The Robot’s “ChatGPT Moment”
How large language models gave robots common sense. The RT-2 and VLA breakthroughs at Google DeepMind.
Chapter 7: The Startup Surge
Figure, 1X, Sanctuary, and the new generation betting billions on humanoid robots.
Chapter 8: Tesla’s Optimus
Elon Musk’s gamble: is robotics a manufacturing problem or an AI problem?
Chapter 9: The Rise of Chinese Robotics
Unitree, Agibot, and whether the EV playbook will work for robots.
Part IV: The Unfinished Revolution
Chapter 10: The Gap
What “deployed” actually means. The chasm between demo videos and commercial reality.
Chapter 11: The Race
Foundation models for robots, scaling laws for physical AI, and the data problem no one has solved.
Chapter 12: What Remains
When machines can do what we do, what is left that is distinctly human? The question Leonardo was really asking, and why it matters now more than ever.