World Action Model: MotuBrain

Built on the proprietary UniDiffuser framework and a three-stream MoT architecture, MotuBrain unifies perception, planning, and execution within a single model. It couples world prediction with action generation and supports five kinds of task: vision-language-action (VLA) control, world modeling, video generation, inverse dynamics, and video-to-action prediction. With strong cross-embodiment adaptation, cross-task generalization, and long-horizon execution, MotuBrain enables robots to develop human-like physical intuition and complete complex end-to-end tasks in real-world environments such as homes, industrial settings, and commercial spaces.


Real-World Task Demonstrations

MotuBrain executes long-horizon task sequences of more than ten atomic actions end to end, keeping subtask transitions stable and behavior temporally consistent. The physical action dynamics it models transfer across robotic embodiments, so a single model generalizes to different robots.

World Action Model MotuBrain: A New Paradigm for Multi-Task Generalization and Scalable Evolution in Embodied Intelligence

MotuBrain is built on the UniDiffuser unified modeling framework. Through Cross-modal Priors Fusion, it integrates three sources of prior knowledge into one model: vision-language knowledge from a VLM, video dynamics knowledge from a video generation model, and action skill knowledge from an action expert. The result is unified representation and generation of language, video, and actions.
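One way to picture how a single model can cover several task types is the scheme from the published UniDiffuser formulation, where each modality gets its own diffusion timestep: a modality supplied as a clean condition is held at timestep 0, a modality to be marginalized out is held at the maximum timestep (pure noise), and a modality being generated follows the sampling timestep. The Python sketch below illustrates that idea only. It assumes MotuBrain follows this scheme, which the page does not state; the task keys, the `Role` and `Mode` names, the `T_MAX` value, and the mapping of task names to roles are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

T_MAX = 1000  # assumed maximum diffusion timestep (hypothetical value)


class Role(Enum):
    GIVEN = "given"          # clean conditioning input, held at timestep 0
    GENERATED = "generated"  # denoised during sampling, follows timestep t
    IGNORED = "ignored"      # marginalized out, held at T_MAX (pure noise)


@dataclass(frozen=True)
class Mode:
    video: Role
    action: Role


# Hypothetical mapping from task type to the role of each modality.
# Language and the current observation are treated as always given.
MODES = {
    "vla": Mode(video=Role.IGNORED, action=Role.GENERATED),
    "world_model": Mode(video=Role.GENERATED, action=Role.GIVEN),
    "video_generation": Mode(video=Role.GENERATED, action=Role.IGNORED),
    "inverse_dynamics": Mode(video=Role.GIVEN, action=Role.GENERATED),
    "joint_video_action": Mode(video=Role.GENERATED, action=Role.GENERATED),
}


def timestep_for(role: Role, t: int) -> int:
    """Return the timestep a modality should use at sampling step t."""
    if role is Role.GIVEN:
        return 0
    if role is Role.IGNORED:
        return T_MAX
    return t


def timesteps(task: str, t: int) -> tuple[int, int]:
    """Return (video_timestep, action_timestep) for a task at step t."""
    if not 0 <= t <= T_MAX:
        raise ValueError(f"t must be in [0, {T_MAX}], got {t}")
    mode = MODES[task]
    return timestep_for(mode.video, t), timestep_for(mode.action, t)


if __name__ == "__main__":
    for task in MODES:
        print(task, timesteps(task, 500))
```

Under these assumptions, switching task type changes only the pair of timesteps fed to the denoiser, not the network itself, which is what lets one set of weights serve every mode.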

Top-Ranked in Two International Benchmark Evaluations

MotuBrain achieved first place in two major international benchmark evaluations, demonstrating strong performance in unified modeling of perception, planning, and action.

RoboTwin 2.0 Leaderboard

On RoboTwin 2.0, MotuBrain scores 95.8 in the Clean setting and 96.1 in the Randomized setting, ranking #1 in both. It is the only model on the leaderboard with an average score above 95 in the Randomized setting, and it scores 100 or close to it on most individual tasks.

WorldArena Leaderboard

On WorldArena, MotuBrain ranks #1 with an overall EWM Score of 63.77 and posts top-tier results on the key motion metrics: Motion Quality, Flow Score, and Motion Smoothness.
