
Ep#17: EgoZero: Robot Learning from Smart Glasses

With Vincent Liu and Ademi Adeniji

Using human data is the key to unlocking the potential of general-purpose robotics: it's far more plentiful than robot data and can potentially be collected quickly, at scale, for relatively little money. But how can we actually do this?

Despite recent progress in general-purpose robotics, robot policies still lag far behind basic human capabilities in the real world. Humans constantly interact with the physical world, yet this rich data resource remains largely untapped in robot learning. We propose EgoZero, a minimal system that learns robust manipulation policies from human demonstrations captured with Project Aria smart glasses, and zero robot data. EgoZero enables: (1) extraction of complete, robot-executable actions from in-the-wild, egocentric, human demonstrations, (2) compression of human visual observations into morphology-agnostic state representations, and (3) closed-loop policy learning that generalizes morphologically, spatially, and semantically. We deploy EgoZero policies on a gripper Franka Panda robot and demonstrate zero-shot transfer with a 70% success rate over 7 manipulation tasks and only 20 minutes of data collection per task. Our results suggest that in-the-wild human data can serve as a scalable foundation for real-world robot learning, paving the way toward a future of abundant, diverse, and naturalistic training data for robots.
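To give a concrete picture of points (2) and (3), here is a minimal Python sketch of what such a pipeline can look like. It is only an illustration of the idea, not the paper's released code: every name in it (`get_observation`, `extract_keypoints_3d`, `PointPolicy`, `apply_action`) is a hypothetical placeholder, and the keypoint extraction is stubbed out.

```python
import numpy as np

# Hypothetical stubs standing in for the glasses/robot interfaces; none of
# these names come from the EgoZero codebase.

def get_observation():
    """Return a dummy egocentric RGB frame and a 4x4 world-from-camera pose."""
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    camera_pose = np.eye(4)
    return frame, camera_pose

def extract_keypoints_3d(frame, camera_pose, num_points=5):
    """Placeholder for lifting detected hand/object keypoints into 3D.

    A real pipeline would detect 2D keypoints in the frame and use the
    glasses' calibrated camera pose to localize them in the world;
    here we simply return zeros.
    """
    return np.zeros((num_points, 3))

def state_from_observation(frame, camera_pose):
    """Compress the visual observation into a flat vector of 3D points.

    Because the state is just points in space, the same representation can
    describe a human hand during data collection and a robot gripper at
    deployment (morphology-agnostic).
    """
    return extract_keypoints_3d(frame, camera_pose).reshape(-1)

class PointPolicy:
    """Stand-in for a learned policy mapping point states to actions."""

    def __call__(self, state):
        # Action: 6-DoF end-effector delta plus a gripper open/close command.
        return np.zeros(7)

def apply_action(action):
    """Placeholder for sending the action to the robot controller."""
    pass

# Closed-loop rollout: observe -> compress to points -> act, at every step.
policy = PointPolicy()
for _ in range(100):
    frame, camera_pose = get_observation()
    state = state_from_observation(frame, camera_pose)
    action = policy(state)
    apply_action(action)
```

The key design point this sketch tries to surface: once observations are reduced to 3D points, the policy never sees the pixels of a human hand or a robot gripper, which is what allows training on human-only data to transfer zero-shot to the robot.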

Project Site

Original Post on X
