Is Sentence-VAE open source?

Yes — timbmg/Sentence-VAE is an open-source project tracked on heatdrop.

What language is Sentence-VAE written in?

timbmg/Sentence-VAE is primarily written in Python.

How popular is Sentence-VAE?

timbmg/Sentence-VAE has 593 stars on GitHub.

Where can I find Sentence-VAE?

timbmg/Sentence-VAE is on GitHub at https://github.com/timbmg/Sentence-VAE.

timbmg/Sentence-VAE

Teaching neural nets to write like Wall Street Journal circa 1995

A clean PyTorch redo of the 2015 paper that first squeezed sentences through a continuous latent space.

★ 593 stars · Python · Language Models · ML Frameworks

What it does

This repo re-implements the Sentence-VAE of Bowman et al. (2015), which trains a variational autoencoder to compress English sentences into a smooth Gaussian latent space and decode them back. You can sample new sentences, or interpolate between the latent codes of two sentences and watch the wording morph gradually. It trains on the Penn Treebank (PTB) dataset and tracks the ELBO, the reconstruction negative log-likelihood (NLL), and the KL divergence throughout training.
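The three tracked quantities are related: for a diagonal-Gaussian posterior and a standard normal prior, the ELBO is the negative of NLL plus KL. The sketch below shows that relationship in plain Python. It assumes the encoder produces a mean and a log-variance per latent dimension; the function names are illustrative and are not taken from the repository.

```python
import math
import random


def gaussian_kl(mean, logvar):
    """KL( N(mean, exp(logvar)) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mean, logvar))


def reparameterize(mean, logvar, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, logvar)]


def elbo(nll, kl):
    """Evidence lower bound given a reconstruction NLL and a KL term."""
    return -(nll + kl)


# Hypothetical encoder output for a 2-dimensional latent space.
mean, logvar = [1.0, 0.0], [0.0, 0.0]
z = reparameterize(mean, logvar, random.Random(0))
kl = gaussian_kl(mean, logvar)
print(len(z), kl, elbo(nll=10.0, kl=kl))
```

A posterior that matches the prior exactly (zero mean, zero log-variance) gives a KL of zero, which is the collapsed state that KL annealing is meant to avoid early in training.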

The interesting bit

The runs of “n n n n n” in the samples are not a bug: they are placeholder tokens the model falls back on where it has not learned a specific word. The interpolation results are more telling. The latent space preserves some grammatical structure, morphing from “the company said…” to “they were n’t paid” through plausible (if stilted) intermediate steps.
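Interpolation of this kind is usually a straight line between two latent vectors, with each intermediate point passed to the decoder. A minimal sketch of the latent-space part follows; the function name and the toy vectors are assumptions for illustration, and the decoding step is omitted because it needs a trained model.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` evenly spaced latent vectors from z_start to z_end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    points = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 to 1.0
        points.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return points


# Toy 2-dimensional latent codes standing in for two encoded sentences.
for z in interpolate([0.0, 0.0], [1.0, 2.0], steps=3):
    print(z)
```

Each vector in the returned list would then be decoded into a sentence, which is where the gradual change in wording becomes visible.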

Key highlights

Clean PyTorch re-implementation with RNN and GRU support; LSTM notably absent
KL annealing (logistic or linear) to prevent the latent space from collapsing early in training
Word dropout and embedding dropout on the decoder input for regularization
TensorBoard logging and checkpointing built in
Includes dowloaddata.sh script (sic—typo preserved from upstream) to fetch PTB data
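
The two regularization ideas in the list above, KL annealing and word dropout, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the repository's code: the parameter names k and x0, their default values, and the helper names are hypothetical.

```python
import math
import random


def kl_weight(step, schedule="logistic", k=0.0025, x0=2500):
    """Weight applied to the KL term at a given training step.

    logistic: sigmoid centred on step x0 with slope k.
    linear:   ramps from 0 to 1 over the first x0 steps, then stays at 1.
    """
    if schedule == "logistic":
        return 1.0 / (1.0 + math.exp(-k * (step - x0)))
    if schedule == "linear":
        return min(1.0, step / x0)
    raise ValueError(f"unknown schedule: {schedule}")


def word_dropout(token_ids, rate, unk_id, keep_ids=(), rng=random):
    """Replace each decoder input token with unk_id with probability `rate`.

    Tokens listed in keep_ids (for example start or padding ids) are never replaced.
    """
    return [
        unk_id if t not in keep_ids and rng.random() < rate else t
        for t in token_ids
    ]


# The training loss would then be: nll + kl_weight(step) * kl
for step in (0, 2500, 5000):
    print(step, kl_weight(step, "logistic"), kl_weight(step, "linear"))
print(word_dropout([1, 5, 6, 7], rate=0.5, unk_id=0, keep_ids={1}))
```

With a small weight early on, the decoder is pushed to use the latent code before the KL penalty takes full effect; word dropout has a similar aim, since hiding input tokens forces the decoder to rely more on the latent vector.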

Caveats

Training was stopped after just 4 epochs, and the true ELBO was optimized for only about 1 of them, presumably because KL annealing keeps the KL weight below 1 until late in training
Samples are cluttered with <unk> tokens (n in the output), which suggests the vocabulary cutoff is too aggressive or training was too short
No LSTM support despite the original paper using it; this may limit reproducibility

Verdict

Useful if you need a minimal, hackable VAE-for-text baseline in modern PyTorch. Skip it if you want production-quality generation or faithful reproduction of the 2015 results.
