
ai-forever/Kandinsky-2

A multilingual text-to-image latent diffusion model for generating, editing, and manipulating images from text descriptions.

2.8k stars · Jupyter Notebook · Image · Video · Audio
Star velocity (7d): +2.1 ★ / day · Trend: steady

Kandinsky 2 is a multilingual latent diffusion model that generates images from text prompts. It combines an XLM-RoBERTa text encoder, a CLIP image encoder (ViT-bigG), and a latent diffusion U-Net. The model supports text-to-image generation, image-to-image translation, inpainting, outpainting, and ControlNet-guided generation.
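As a rough illustration of the text-to-image path, here is a minimal sketch that assumes the Hugging Face `diffusers` library and the `kandinsky-community/kandinsky-2-2-decoder` checkpoint rather than this repository's own package. Neither is named in the description above. `Text2ImgRequest`, `to_kwargs`, and `generate` are hypothetical helper names, and the default sizes and step counts are illustrative, not recommended settings.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a Kandinsky 2.2 checkpoint published on the Hugging Face Hub.
MODEL_ID = "kandinsky-community/kandinsky-2-2-decoder"


@dataclass
class Text2ImgRequest:
    """Hypothetical container for one text-to-image call."""

    prompt: str                            # may be written in any supported language
    negative_prompt: Optional[str] = None  # content to steer away from
    height: int = 768                      # illustrative default
    width: int = 768                       # illustrative default
    num_inference_steps: int = 50          # illustrative default
    guidance_scale: float = 4.0            # illustrative default

    def to_kwargs(self) -> dict:
        """Validate the request and return keyword arguments for the pipeline."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.height <= 0 or self.width <= 0:
            raise ValueError("height and width must be positive")
        kwargs = {
            "prompt": self.prompt,
            "height": self.height,
            "width": self.width,
            "num_inference_steps": self.num_inference_steps,
            "guidance_scale": self.guidance_scale,
        }
        # Only forward the negative prompt when one was given.
        if self.negative_prompt is not None:
            kwargs["negative_prompt"] = self.negative_prompt
        return kwargs


def generate(request: Text2ImgRequest, device: str = "cuda"):
    """Run the pipeline once and return a PIL image.

    Heavy imports are deferred so the request helper above can be used
    without torch or diffusers installed. Calling this downloads several
    gigabytes of weights on first use.
    """
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    )
    pipe = pipe.to(device)
    return pipe(**request.to_kwargs()).images[0]
```

A call such as `generate(Text2ImgRequest(prompt="красная лиса в снегу")).save("fox.png")` would exercise the multilingual prompt path, assuming a CUDA GPU is available. The other tasks listed above (image-to-image, inpainting, ControlNet-guided generation) use different pipelines and are not covered by this sketch.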
