Benchmark

Leaderboard

Humanoid loco-manipulation policies ranked by overall success rate across six whole-body tasks, each evaluated under three domain-randomization levels (L0 / L1 / L2). Scores are the mean success rate (%) over 10 rollouts per task per level.

Tasks: 6
DR levels: 3
Policies evaluated: 9
Last updated: Jun 2026
Rank | Policy | Type | Overall | L0 | L1 | L2

Overall = mean success rate across all six tasks and three levels. L0 / L1 / L2 = mean success rate at each domain-randomization level. Higher is better.
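As a rough illustration of how the Overall and per-level columns relate, the aggregation can be sketched in a few lines of Python. This is a minimal sketch, not the benchmark's evaluation code: the nested `{task: {level: [outcomes]}}` layout, the function names `success_rate` and `leaderboard_scores`, and the placeholder task names are all assumptions made for the example, and the rollout outcomes are synthetic.

```python
from statistics import mean

LEVELS = ("L0", "L1", "L2")  # domain-randomization levels


def success_rate(outcomes):
    """Success rate (%) over one task/level cell, e.g. 10 boolean rollouts."""
    return 100.0 * sum(outcomes) / len(outcomes)


def leaderboard_scores(rollouts):
    """Aggregate per-rollout outcomes into leaderboard columns.

    rollouts: hypothetical layout {task_name: {level: [bool, ...]}}.
    Returns a dict with "Overall", "L0", "L1", "L2" (all in %).
    """
    # One success rate per (task, level) cell.
    cells = {
        task: {lvl: success_rate(by_level[lvl]) for lvl in LEVELS}
        for task, by_level in rollouts.items()
    }
    # Per-level score: mean over tasks at that level.
    scores = {lvl: mean(cells[task][lvl] for task in cells) for lvl in LEVELS}
    # Overall score: mean over every (task, level) cell.
    scores["Overall"] = mean(
        cells[task][lvl] for task in cells for lvl in LEVELS
    )
    return scores


def synthetic_rollouts(successes, total=10):
    """Build a synthetic outcome list with `successes` wins out of `total`."""
    return [True] * successes + [False] * (total - successes)


# Synthetic example: six placeholder tasks, 10 rollouts per task per level.
example = {
    f"task_{i}": {
        "L0": synthetic_rollouts(9),
        "L1": synthetic_rollouts(6),
        "L2": synthetic_rollouts(3),
    }
    for i in range(1, 7)
}
print(leaderboard_scores(example))
```

Because every level contains the same number of tasks, averaging over all task/level cells gives the same Overall value as averaging the three per-level scores.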

For the full per-task breakdown, see the Results table on the project page.