Is VQGAN-CLIP open source?

Yes — nerdyrodent/VQGAN-CLIP is an open-source project tracked on heatdrop.

What language is VQGAN-CLIP written in?

nerdyrodent/VQGAN-CLIP is primarily written in Python.

How popular is VQGAN-CLIP?

nerdyrodent/VQGAN-CLIP has 2.6k stars on GitHub.

Where can I find VQGAN-CLIP?

nerdyrodent/VQGAN-CLIP is on GitHub at https://github.com/nerdyrodent/VQGAN-CLIP.

nerdyrodent/VQGAN-CLIP

VQGAN+CLIP breaks out of Colab and onto your GPU

It exists to run Katherine Crowson’s VQGAN+CLIP text-to-image pipeline locally, freeing it from the browser tab.

★2.6k stars Python Image · Video · Audio

What it does

generate.py turns text prompts into images by pairing OpenAI’s CLIP model with a VQGAN decoder, running locally instead of in Google Colab. You can weight multiple prompts, seed the process with an existing image, or chain prompts in a “story mode” that morphs the output over time. It also supports feedback loops, in which rendered frames are zoomed and rotated before being fed back in, to create warped video sequences.
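As a rough illustration of how weighted prompts and story-mode chaining could be represented, here is a minimal Python sketch. The separators are assumptions made for this example (`|` between prompts, `:` before a numeric weight, `^` between story phases), and `parse_prompts` is a hypothetical helper, not a function taken from the repository. Check the project's README for the syntax generate.py actually accepts.

```python
def parse_weighted(part: str) -> tuple[str, float]:
    """Split 'text:weight' into (text, weight); the weight defaults to 1.0."""
    part = part.strip()
    text, sep, tail = part.rpartition(":")
    if sep:
        try:
            return text.strip(), float(tail)
        except ValueError:
            pass  # the colon was part of the prompt text, not a weight
    return part, 1.0


def parse_prompts(spec: str) -> list[list[tuple[str, float]]]:
    """Return one list of (text, weight) pairs per story phase.

    Assumed syntax (illustrative only): '^' separates story phases,
    '|' separates prompts within a phase, ':' precedes a numeric weight.
    """
    return [
        [parse_weighted(part) for part in phase.split("|")]
        for phase in spec.split("^")
    ]


if __name__ == "__main__":
    for phase in parse_prompts("a red apple:2 | a green pear:0.5 ^ a bowl of fruit"):
        print(phase)
```

Each inner list corresponds to one phase of the story, so a generator could optimise toward the first phase for some number of iterations and then switch targets to the next.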

The interesting bit

The standout feature is a shell-script feedback loop that zooms and rotates each rendered frame with ImageMagick before feeding it back into the generator, producing a hypnotic, self-referential video effect; the frames are only encoded into a video at the end. The project is essentially a personal escape hatch from Colab rather than a polished framework, and it was tested specifically on an RTX 3090 under Ubuntu 20.04.
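The feedback loop can be sketched in Python as follows. This is a restatement of the idea, not the repository's shell script: `render` is a hypothetical callable standing in for the generator (it takes a seed image path and writes an output frame), the file names are invented for the example, and the scale and angle values are arbitrary. The only external tool assumed is ImageMagick's `convert` with its `-distort SRT` (scale-rotate-translate) operator.

```python
import subprocess
from pathlib import Path
from typing import Callable


def zoom_rotate_cmd(src, dst, scale: float = 1.02, angle: float = 1.0) -> list[str]:
    """Build an ImageMagick command that scales and rotates src into dst."""
    return ["convert", str(src), "-distort", "SRT", f"{scale},{angle}", str(dst)]


def feedback_loop(
    seed_image,
    work_dir,
    n_frames: int,
    render: Callable[[Path, Path], None],  # hypothetical generator hook
    run=subprocess.run,
) -> list[Path]:
    """Warp the latest frame, render from the warped image, and repeat."""
    work_dir = Path(work_dir)
    current = Path(seed_image)
    frames = []
    for i in range(n_frames):
        warped = work_dir / f"warped_{i:04d}.png"
        frame = work_dir / f"frame_{i:04d}.png"
        run(zoom_rotate_cmd(current, warped), check=True)  # zoom + rotate
        render(warped, frame)  # generator is seeded with the warped image
        frames.append(frame)
        current = frame  # the new frame becomes the next input
    return frames
```

The returned frame list would then be handed to a video encoder in a single final step, which matches the description above of deferring encoding until all frames exist.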

Key highlights

Runs fully offline once models and dependencies are present, with a CPU fallback if no GPU is detected.
Supports weighted text prompts, image prompts, and sequential “story mode” prompt chaining.
Can apply text-driven style transfer to existing images or entire directories of video frames.
VRAM appetite scales aggressively: 24 GB for 900×900 output, 10 GB for 512×512.
Includes a Replicate demo and Docker image for those who want to skip local dependency wrangling.
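To make the VRAM figures above concrete, here is a small Python lookup that picks the largest listed output size fitting a given amount of GPU memory. Only the two data points (24 GB for 900×900, 10 GB for 512×512) come from the summary; the function name is hypothetical, and the sketch deliberately does not interpolate between sizes because no scaling formula is given.

```python
# VRAM guidance as listed above: (width, height) -> GB required.
VRAM_GUIDE_GB = {
    (900, 900): 24,
    (512, 512): 10,
}


def largest_listed_size(vram_gb: float):
    """Return the largest listed (width, height) that fits, or None."""
    fitting = [size for size, need in VRAM_GUIDE_GB.items() if need <= vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda size: size[0] * size[1])


if __name__ == "__main__":
    for gb in (8, 12, 24):
        print(gb, "GB ->", largest_listed_size(gb))
```

For scale, 900×900 has roughly 3.1 times as many pixels as 512×512, while the listed VRAM requirement is 2.4 times higher.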

Caveats

AMD ROCm support is documented but explicitly untested.
The setup requires manually cloning CLIP and taming-transformers, plus downloading separate VQGAN checkpoints. The project is essentially a local wrapper around existing research code.
CPU rendering is supported, but the README only lists GPU VRAM requirements, so performance on CPU is unclear.
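Given the manual setup and the CPU fallback noted above, a preflight check along these lines may help before a first run. It is a sketch under stated assumptions: the directory names in `EXPECTED_DIRS` are guesses at a typical layout, not paths confirmed by the summary, and both helper functions are hypothetical. The device check uses PyTorch's `torch.cuda.is_available()` and falls back to CPU when no GPU is reported or PyTorch is not installed.

```python
from pathlib import Path

# Assumed layout for illustration only; adjust to wherever the
# cloned dependencies and downloaded checkpoints actually live.
EXPECTED_DIRS = ("CLIP", "taming-transformers", "checkpoints")


def missing_dirs(root, expected=EXPECTED_DIRS) -> list[str]:
    """List the expected directories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


def pick_device() -> str:
    """Return 'cuda:0' when PyTorch reports a usable GPU, else 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("device:", pick_device())
    print("missing:", missing_dirs("."))
```

Because CPU performance is not documented, a result of `cpu` here is a prompt to start with a small output size and few iterations.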

Verdict

Grab this if you want to experiment with VQGAN+CLIP on your own Linux box and don’t mind wrangling PyTorch dependencies and multi-gigabyte model checkpoints. Skip it if you are on untested hardware, lack VRAM, or want a batteries-included installer: this is a research-grade sketchpad, not a consumer app.
