Is CatVTON open source?

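Yes, with a restriction: Zheng-Chong/CatVTON publishes its source on GitHub under CC BY-NC-SA 4.0, which does not permit commercial use. The project is tracked on heatdrop.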

What language is CatVTON written in?

Zheng-Chong/CatVTON is primarily written in Python.

How popular is CatVTON?

Zheng-Chong/CatVTON has 1.8k stars on GitHub.

Where can I find CatVTON?

Zheng-Chong/CatVTON is on GitHub at https://github.com/Zheng-Chong/CatVTON.

Zheng-Chong/CatVTON

Virtual try-on that fits in 8 GB VRAM and 50M trainable params

CatVTON swaps garments onto people in photos while keeping trainable parameters near 50 million (49.57M) and inference VRAM under 8 GB.

★ 1.8k stars | Python | Image · Video · Audio | Domain Apps

What it does

CatVTON is a diffusion model for virtual try-on: give it a person image and a garment image, and it inpaints the new clothing onto the body. It is built on Stable Diffusion v1.5 inpainting, and its code is adapted from the Diffusers library. The project includes a Gradio app, a ComfyUI workflow, and a HuggingFace Space demo.

The interesting bit

The authors argue that concatenating inputs is enough to guide the inpainting process, avoiding heavy warping modules or auxiliary encoders. They freeze most of the 899M-parameter network and train only 49.57M parameters, yet claim inference at 1024×768 fits inside 8 GB of VRAM with bf16 mixed precision. That is unusually frugal for generative fashion tasks.
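The concatenation idea can be illustrated with a toy sketch. This is not the repository's code: it assumes images are plain nested lists, joins them along the width axis, and uses hypothetical function names. The real model operates on tensors inside a diffusion pipeline, and its concatenation axis and mask handling may differ.

```python
# Toy illustration only: images are nested lists (rows of pixel values).
# All function names here are hypothetical, not taken from CatVTON.

def concat_side_by_side(left, right):
    """Join two equally tall images along the width axis."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def build_inpainting_input(person, garment, person_mask):
    """Return (image, mask) for a generic inpainting model.

    The mask is 1 where the model should repaint (the clothing area on the
    person) and 0 elsewhere, so the garment half stays untouched and acts
    as a visual reference.
    """
    garment_mask = [[0] * len(row) for row in garment]
    image = concat_side_by_side(person, garment)
    mask = concat_side_by_side(person_mask, garment_mask)
    return image, mask


person = [[10, 11], [12, 13]]
garment = [[90, 91], [92, 93]]
person_mask = [[0, 1], [1, 1]]  # assumed clothing region on the person

image, mask = build_inpainting_input(person, garment, person_mask)
```

The point of the sketch is that no separate garment encoder or warping step appears: the garment reaches the model only as extra pixels next to the masked person.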

Key highlights

Lightweight architecture: 899.06M total parameters, only 49.57M trainable.
Low VRAM inference: generates 1024×768 images using less than 8 GB VRAM.
Built on Stable Diffusion v1.5 inpainting via the Diffusers ecosystem.
Automatic masking in the app and ComfyUI workflow using DensePose and SCHP.
Accepted at ICLR 2025. Also offers a mask-free variant and an experimental FLUX.1 LoRA (37.4M weights) that the authors note is not yet stable.
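The parameter figures above can be put in proportion with a little arithmetic. The sketch below assumes every weight is stored in bf16 at 2 bytes each, which is a simplification: mixed precision may keep some tensors in higher precision, and activations and other buffers add to the weights during inference.

```python
# Figures taken from the highlights above; the storage assumption is ours.
TOTAL_PARAMS_M = 899.06      # total parameters, in millions
TRAINABLE_PARAMS_M = 49.57   # trainable parameters, in millions
BYTES_PER_BF16 = 2           # bfloat16 is a 16-bit format

# Share of the network that is actually trained.
trainable_fraction = TRAINABLE_PARAMS_M / TOTAL_PARAMS_M

# Rough memory for the weights alone, assuming all are held in bf16.
weights_gb = TOTAL_PARAMS_M * 1e6 * BYTES_PER_BF16 / 1e9

print(f"trainable share: {trainable_fraction:.2%}")
print(f"bf16 weights: about {weights_gb:.2f} GB")
```

Under these assumptions the trainable share is roughly 5.5% of the network, and the weights alone take about 1.8 GB, which leaves most of the stated 8 GB budget for activations and other overhead.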

Caveats

Licensed under CC BY-NC-SA 4.0, so commercial use is off the table without separate permission.
The ComfyUI section warns of Windows-specific issues (see issue #8).
The FLUX.1-based LoRA release is explicitly described as “not a stable version.”

Verdict

Ideal for researchers, hobbyists, or fashion-tech prototypes that need modest GPU budgets and non-commercial output. If you need a production-ready, commercially licensable try-on pipeline, look elsewhere.
