ritheshkumar95/pytorch-vqvae

A course project that outperformed the textbook implementation

Student reimplementation of VQ-VAE with a custom vector quantization layer that ran nearly 3× faster than the simpler version it replaced.

Velocity (7d): +0.3 ★/day · Trend: steady

What it does

Implements Vector Quantized Variational AutoEncoders (VQ-VAE) in PyTorch, plus a PixelCNN prior for generating class-conditional samples. Trains on standard vision benchmarks—MNIST, FashionMNIST, CIFAR-10, Mini-ImageNet—and even Atari Boxing video frames on a separate branch.

The interesting bit

The authors wrote their own vector quantization function and got a nearly 3× speedup over the simpler implementation. They left the slower version in the git history, which is the kind of honest benchmarking you don’t always see in research code. They also added pytest tests for the quantization layer—unusual diligence for a course project.
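
To make the quantization step concrete, here is a minimal NumPy sketch of nearest-codebook lookup written two ways. It is an illustration under stated assumptions, not the repository's code: the function names `quantize_naive` and `quantize_expanded`, the array shapes, and the choice of the expanded-distance identity are all hypothetical, and this summary does not say which optimization the authors actually used.

```python
import numpy as np


def quantize_naive(z, codebook):
    """Nearest-code lookup by materialising every pairwise difference.

    z: (N, D) encoder outputs; codebook: (K, D) embedding vectors.
    Builds an (N, K, D) intermediate: simple, but memory-hungry.
    """
    diff = z[:, None, :] - codebook[None, :, :]   # (N, K, D)
    dist = (diff ** 2).sum(axis=-1)               # (N, K) squared distances
    return dist.argmin(axis=1)


def quantize_expanded(z, codebook):
    """Same lookup via ||z - e||^2 = ||z||^2 - 2 z.e + ||e||^2.

    Only (N, K) arrays are created, and the cross term is one matmul.
    """
    z_sq = (z ** 2).sum(axis=1, keepdims=True)    # (N, 1)
    e_sq = (codebook ** 2).sum(axis=1)            # (K,)
    dist = z_sq - 2.0 * z @ codebook.T + e_sq     # broadcasts to (N, K)
    return dist.argmin(axis=1)


# Hypothetical toy sizes: 256 latent vectors, 64 codes, 16 dimensions.
rng = np.random.default_rng(0)
z = rng.normal(size=(256, 16))
codebook = rng.normal(size=(64, 16))

idx = quantize_expanded(z, codebook)   # discrete latent indices, shape (256,)
z_q = codebook[idx]                    # quantized vectors, shape (256, 16)
```

Both functions pick the same codes on this toy input; the second avoids the large intermediate array, which is one common reason a rewritten quantization layer can run faster. In a trainable model, gradients would also need to pass from `z_q` back to the encoder (typically with a straight-through estimator), which this sketch leaves out.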

Key highlights

  • Custom fast vector quantization layer covered by pytest tests; the slower version remains in the git history
  • PixelCNN prior trained on learned discrete latents for sampling
  • Supports both image and video (Atari) domains
  • Includes reconstruction and sample visualizations for MNIST and FashionMNIST
  • Academic report included as final_project.pdf
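
The PixelCNN prior mentioned in the list above generates the grid of discrete latents one position at a time, so each position may only depend on positions earlier in raster order. A common way to enforce that ordering is to multiply convolution kernels by a causal mask. The sketch below builds such a mask in NumPy; the helper name `causal_mask` and the "A"/"B" mask-type convention are assumptions taken from the general PixelCNN literature, not from this repository's code.

```python
import numpy as np


def causal_mask(kernel_size, mask_type):
    """Return a (k, k) mask of 0/1 values for a raster-order convolution.

    mask_type "A" also hides the centre position (used in the first layer);
    mask_type "B" keeps the centre visible (used in later layers).
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")

    centre = kernel_size // 2
    mask = np.ones((kernel_size, kernel_size), dtype=np.float32)

    # In the centre row, zero out everything to the right of the centre,
    # and the centre itself for type "A".
    first_hidden = centre if mask_type == "A" else centre + 1
    mask[centre, first_hidden:] = 0.0

    # Zero out every row below the centre row.
    mask[centre + 1:, :] = 0.0
    return mask


mask_a = causal_mask(3, "A")
mask_b = causal_mask(3, "B")
```

Multiplying a convolution's weights by `mask_a` or `mask_b` before each forward pass keeps the layer from seeing positions that have not been generated yet.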

Caveats

  • README is sparse on architecture details; you’ll need to read the report or code for hyperparameters
  • Video experiments live on a separate branch (evan/video), not main
  • 955 stars, but the last meaningful activity appears to date from the original course project

Verdict

Worth a look if you’re implementing VQ-VAE from scratch and want a clean PyTorch reference with a fast quantization trick. Skip if you need a maintained library with bells and whistles—this is coursework, not a framework.
