
kyegomez/RT-2

RT-2 is a Vision-Language-Action model that enables robots to perform tasks by translating visual observations and language instructions into physical actions.

Velocity (7d): +0.6 ★/day · Trend: steady

This repository implements Robotic Transformer 2 (RT-2), a multimodal model that connects vision, language, and robot control. The model takes visual inputs and natural-language commands and outputs executable actions for robotic manipulation tasks. It builds on transformer architectures to generalize learned behaviors to novel situations, a key advancement for embodied AI systems.
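
The description above does not say how "executable actions" are represented. One common way for a transformer to emit robot actions is to discretize each continuous action dimension into a fixed number of bins, so that an action becomes a short sequence of integer tokens. The sketch below illustrates that idea only. The function names, the 256-bin count, and the [-1, 1] action range are assumptions for illustration and are not taken from this repository's code.

```python
from typing import List, Sequence

NUM_BINS = 256  # assumption: number of discrete bins per action dimension


def encode_action(action: Sequence[float], low: float = -1.0,
                  high: float = 1.0, num_bins: int = NUM_BINS) -> List[int]:
    """Map each continuous action value to an integer bin index (hypothetical helper)."""
    tokens = []
    for value in action:
        clipped = min(max(value, low), high)       # keep the value inside [low, high]
        fraction = (clipped - low) / (high - low)  # rescale to [0, 1]
        # The top edge (fraction == 1.0) would land in bin num_bins, so clamp it.
        tokens.append(min(int(fraction * num_bins), num_bins - 1))
    return tokens


def decode_action(tokens: Sequence[int], low: float = -1.0,
                  high: float = 1.0, num_bins: int = NUM_BINS) -> List[float]:
    """Map bin indices back to the centre of each bin (hypothetical helper)."""
    width = (high - low) / num_bins
    return [low + (token + 0.5) * width for token in tokens]


if __name__ == "__main__":
    # Hypothetical 3-dimensional action, e.g. an end-effector displacement.
    action = [-1.0, 0.0, 0.3]
    tokens = encode_action(action)
    print(tokens)
    print(decode_action(tokens))
```

Decoding returns the bin centre, so a round trip recovers any in-range value to within half a bin width. A finer bin count reduces that error at the cost of a larger output vocabulary.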
