dbolya/tomesd
A token-merging technique that accelerates Stable Diffusion inference by merging redundant tokens in transformer blocks.

ToMe for SD applies Token Merging (ToMe), a technique originally developed to speed up vision transformers, to Stable Diffusion. Inside the transformer blocks of the diffusion model it identifies similar tokens and merges them, so each block processes fewer tokens and performs less computation. The method speeds up inference while largely preserving image quality, and it requires no additional training, which makes it useful for optimizing generative image pipelines. This repository is the official implementation of a CVPR workshop paper.
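
To make the merging idea concrete, here is a minimal pure-Python sketch of similarity-based token merging. It is an illustration under stated assumptions, not the repository's implementation: the helper names `cosine` and `merge_tokens` are hypothetical, the alternating split into source and destination tokens is a simplification, and the handling of merged tokens after each block is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens, r):
    """Merge the r most redundant tokens; returns len(tokens) - r tokens.

    Hypothetical helper for illustration, not part of the tomesd API.
    """
    src = tokens[0::2]  # candidates that may be merged away
    dst = tokens[1::2]  # tokens that absorb their matches
    if not dst:
        return [list(t) for t in tokens]
    r = max(0, min(r, len(src)))

    # For every src token, find its most similar dst token.
    matches = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        matches.append((sims[j], i, j))

    # Keep the r highest-similarity pairs; these src tokens get merged.
    matches.sort(key=lambda m: m[0], reverse=True)
    merged = {i: j for _, i, j in matches[:r]}

    # Average each dst token with the src tokens assigned to it.
    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for i, j in merged.items():
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1
    new_dst = [[x / c for x in s] for s, c in zip(sums, counts)]

    # Unmerged src tokens come first, followed by the (possibly averaged) dst tokens.
    kept_src = [list(s) for i, s in enumerate(src) if i not in merged]
    return kept_src + new_dst
```

Calling `merge_tokens(tokens, r)` on a list of `n` token vectors returns `n - r` vectors (with `r` capped at the number of source tokens), so every later layer operates on a shorter sequence. That reduction in sequence length is where the savings come from, since attention cost grows with the number of tokens.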