thu-ml/Motus
A world model system that predicts robotic actions and future states using a Mixture-of-Transformers architecture with diffusion models.

Star velocity (7 days): +6.3 ★/day · Trend: steady
Motus is a unified latent action world model for robotic manipulation that builds on pretrained models and motion information. It uses a Mixture-of-Transformers architecture to integrate three expert modules: understanding, action prediction, and video generation. The system adopts a UniDiffuser-style scheduler and supports both training and inference for robotic control tasks.
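To make the "UniDiffuser-style scheduler" idea concrete, here is a minimal sketch of the general technique rather than Motus's actual code. In UniDiffuser-style training, each modality gets its own independently sampled diffusion timestep, so a single network can later act as a joint, conditional, or marginal model depending on which timesteps are pinned. The two modalities (video and action), the cosine noise schedule, the step count `T`, and every function and mode name below are assumptions made for illustration.

```python
import math
import random

T = 1000  # hypothetical number of diffusion steps (assumption)


def alpha_bar(t):
    """Cosine-style cumulative signal level: 1.0 at t=0, about 0.0 at t=T."""
    return math.cos(0.5 * math.pi * t / T) ** 2


def add_noise(x, t, rng):
    """Forward-diffuse the vector x to timestep t; returns (noisy_x, noise)."""
    a = alpha_bar(t)
    eps = [rng.gauss(0.0, 1.0) for _ in x]
    noisy = [math.sqrt(a) * xi + math.sqrt(1.0 - a) * ei for xi, ei in zip(x, eps)]
    return noisy, eps


def sample_timesteps(mode, rng):
    """Pick (t_video, t_action) for a given use of the shared model.

    Mode names are hypothetical labels for this sketch.
    """
    if mode == "joint_training":
        # Independent timesteps per modality: the core UniDiffuser-style idea.
        return rng.randint(0, T), rng.randint(0, T)
    if mode == "action_given_video":
        # t=0 keeps the video clean, so it acts as the condition.
        return 0, rng.randint(1, T)
    if mode == "video_given_action":
        return rng.randint(1, T), 0
    if mode == "video_only":
        # t=T turns the action into pure noise, i.e. it is marginalized out.
        return rng.randint(1, T), T
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(0)
video_latent = [0.5, -0.2, 0.1]   # toy stand-in for a video latent
action = [1.0, 0.0]               # toy stand-in for an action vector

t_video, t_action = sample_timesteps("action_given_video", rng)
noisy_video, _ = add_noise(video_latent, t_video, rng)
noisy_action, action_noise = add_noise(action, t_action, rng)
```

In this sketch the network (not shown) would receive both noisy inputs together with both timesteps and predict the noise for each modality. Pinning `t_video` to 0 leaves the video latent untouched, which is what lets the same weights serve as an action predictor conditioned on observed video.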