lucidrains/pi-zero-pytorch

A PyTorch implementation of π₀, the robotic foundation model from Physical Intelligence that combines flow-matching with vision-language model components for robot action prediction.

576 stars · Python · Domain: Apps / Agents
Velocity (7d): +1.0 ★ / day · Trend: steady

This repository implements the π₀ architecture proposed by Physical Intelligence. The design is essentially a simplified Transfusion model with influences from Stable Diffusion 3: it uses flow matching rather than diffusion for policy generation and adopts the joint attention of MM-DiT. The model takes vision inputs, language commands, and joint state as input and outputs robot actions, building on a pretrained PaliGemma 2B vision-language model backbone. It uses PyTorch's Flex Attention to mix autoregressive and bidirectional attention patterns across different token types.
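The mixed attention pattern described above can be illustrated with a small mask-building sketch. This is not the repository's code: the function name `build_mixed_mask`, the token counts, and the exact split (causal attention over a vision/language/state prefix, bidirectional attention among action tokens) are assumptions chosen for illustration.

```python
# Hypothetical sketch: a boolean attention mask that is causal over a prefix
# of vision/language/state tokens and bidirectional among action tokens.
# The token layout is an assumption, not taken from the repository.

def build_mixed_mask(num_prefix: int, num_action: int) -> list[list[bool]]:
    """Return mask[q][k], True when query position q may attend to key k."""
    total = num_prefix + num_action
    mask = [[False] * total for _ in range(total)]
    for q in range(total):
        for k in range(total):
            causal = k <= q  # standard autoregressive rule
            # action tokens (positions >= num_prefix) see each other freely
            both_action = q >= num_prefix and k >= num_prefix
            mask[q][k] = causal or both_action
    return mask


mask = build_mixed_mask(num_prefix=3, num_action=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# xxxxx
# xxxxx
```

The rule inside the inner loop is a pure function of the query and key positions. That is the shape of predicate a Flex Attention `mask_mod` expects, so a rule like this could plausibly be handed to PyTorch instead of materializing the full mask.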