Online world modeling enables real-world Inverse Reinforcement Learning from Observation
meaning... NO rewards, action supervision, pre-training, play data, failure examples, prior models, interventions, or simulation
Only 15 observation-only demonstrations and < 40 minutes of real-world training from scratch
MPAIL2
World modeling is critical
Planning enables robustness
Less supervision ≠ less performance
Positive online transfer
Video-only demonstration
Training