River-Zhang/ICEdit
A diffusion transformer approach to instructional image editing using LoRA, reporting state-of-the-art results with 0.5% of the training data used by prior methods and only 4GB of VRAM.

In-Context Edit (ICEdit) is a diffusion-based approach to instructional image editing. It fine-tunes a large-scale diffusion transformer using in-context generation, supporting both single-turn and multi-turn edits guided by text instructions. The method uses LoRA (Low-Rank Adaptation) for efficiency, requiring only 0.5% of the training data and 1% of the trainable parameters of prior state-of-the-art methods. A MoE (Mixture of Experts) checkpoint is also released for improved performance.
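
To illustrate why LoRA keeps the trainable parameter count small, here is a minimal NumPy sketch of a single LoRA-adapted linear layer. It is not ICEdit's code: the class name `LoRALinear`, the layer sizes, the rank, and the random stand-in for pretrained weights are all assumptions chosen for illustration.

```python
import numpy as np


class LoRALinear:
    """Hypothetical linear layer: frozen weight W plus a low-rank update B @ A."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen "pretrained" weight (a random stand-in here, never updated).
        self.W = rng.standard_normal((d_out, d_in))
        # Trainable low-rank factors. B starts at zero, so the adapted
        # layer initially behaves exactly like the frozen layer.
        self.A = rng.standard_normal((rank, d_in)) * 0.01
        self.B = np.zeros((d_out, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # x has shape (batch, d_in); output has shape (batch, d_out).
        frozen = x @ self.W.T
        update = (x @ self.A.T) @ self.B.T
        return frozen + self.scale * update

    def trainable_fraction(self):
        # Share of all parameters in this layer that LoRA actually trains.
        trainable = self.A.size + self.B.size
        return trainable / (self.W.size + trainable)


if __name__ == "__main__":
    layer = LoRALinear(d_in=1024, d_out=1024, rank=4)
    x = np.ones((2, 1024))
    print(layer.forward(x).shape)
    print(f"trainable fraction: {layer.trainable_fraction():.4%}")
```

Only `A` and `B` would receive gradient updates during fine-tuning, so the trainable share shrinks as the layer grows relative to the rank. The fraction printed for this toy layer is a property of the assumed sizes and is not the 1% figure reported for ICEdit.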