
sihyun-yu/REPA

A method that aligns noisy diffusion transformer states with pretrained visual encoder representations to improve training efficiency and generation quality.

1.6k stars · Python · Image · Video · Audio
Velocity (7d): +2.7 ★/day · Trend: steady

REPA (Representation Alignment for Generation) is a training-time regularizer: it aligns projections of the noisy hidden states inside a diffusion model with clean-image representations from a pretrained visual encoder. The method speeds up SiT diffusion transformer training by 17.5x while achieving state-of-the-art image generation quality on the ImageNet 256x256 benchmark. The work targets how generative diffusion models are trained, not a specific downstream application.
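The alignment idea can be illustrated with a small, dependency-free sketch. This is not the repository's code: it assumes the alignment term is a negative cosine similarity averaged over patches, uses a plain linear map where a learned projection head would go, and the names `project`, `alignment_loss`, `total_loss`, and the weight `lam` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero


def project(hidden, weight):
    """Hypothetical linear projection; `weight` is a list of rows (out_dim x in_dim)."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


def alignment_loss(hidden_states, encoder_features, weight):
    """Negative mean cosine similarity between projected noisy hidden states
    and pretrained-encoder features, one vector per patch (assumed form)."""
    sims = [
        cosine_similarity(project(h, weight), f)
        for h, f in zip(hidden_states, encoder_features)
    ]
    return -sum(sims) / len(sims)


def total_loss(denoising_loss, align_loss, lam):
    """Combine the usual denoising objective with the alignment term.
    `lam` is an illustrative weighting coefficient."""
    return denoising_loss + lam * align_loss


# Toy example: two patches with 2-dimensional features and an identity projection.
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = [[1.0, 0.0], [0.0, 2.0]]      # stand-in for noisy hidden states
features = [[1.0, 0.0], [0.0, 2.0]]    # stand-in for encoder features of the clean image
loss = alignment_loss(hidden, features, identity)
```

When the projected hidden states point in the same direction as the encoder features, the alignment term approaches its minimum of -1; orthogonal vectors give 0. In a real training loop the hidden states would come from an intermediate transformer layer and the projection would be trained jointly with the model.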
