Word2Vec and friends, written in Go from scratch
A Go-native toolkit for training word embeddings when you don't want to leave the gopher ecosystem.

What it does
wego implements three classic word embedding algorithms—Word2Vec, GloVe, and LexVec—directly in Go, with no Python bindings or C extensions required. You feed it a space-separated text corpus, it trains vectors, and you get a text file of word-to-vector mappings plus CLI tools to query them.
The interesting bit
The project leans into Go’s concurrency rather than fighting it: training uses HogWild!-style asynchronous updates, in which parallel workers modify shared parameters without locks. That skips the locking overhead, at the cost of results that differ between runs. There’s also a REPL console for doing vector arithmetic (King − Man + Woman ≈ Queen) without leaving the terminal.
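The analogy query comes down to vector arithmetic followed by a nearest-neighbor search under cosine similarity. The sketch below shows that idea over a hand-made toy vocabulary; it is not wego's console implementation. `cosine` and `analogy` are hypothetical helpers, and the 2-D vectors are invented for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 if either vector has zero norm.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// analogy returns the word whose vector is closest to
// vec(a) - vec(b) + vec(c), excluding the three query words.
// It reports false if a query word is missing or no candidate remains.
func analogy(vecs map[string][]float64, a, b, c string) (string, bool) {
	va, okA := vecs[a]
	vb, okB := vecs[b]
	vc, okC := vecs[c]
	if !okA || !okB || !okC {
		return "", false
	}
	target := make([]float64, len(va))
	for i := range va {
		target[i] = va[i] - vb[i] + vc[i]
	}
	best, bestSim := "", math.Inf(-1)
	for w, v := range vecs {
		if w == a || w == b || w == c {
			continue
		}
		if s := cosine(target, v); s > bestSim {
			best, bestSim = w, s
		}
	}
	return best, best != ""
}

func main() {
	// Toy vectors: axis 0 is roughly "royalty", axis 1 roughly "female".
	vecs := map[string][]float64{
		"king":   {0.9, 0.1},
		"man":    {0.1, 0.1},
		"woman":  {0.1, 0.9},
		"queen":  {0.9, 0.9},
		"prince": {0.8, 0.1},
	}
	w, ok := analogy(vecs, "king", "man", "woman")
	fmt.Println(w, ok)
}
```

Excluding the query words from the candidates is the usual convention for analogy queries, since the nearest vector to the target is otherwise often one of the inputs.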
Key highlights
- Three models: Word2Vec (CBOW and Skip-gram), GloVe, and LexVec
- CLI for training, querying nearest neighbors, and interactive vector math
- Go SDK with functional options for hyperparameters
- Outputs a standard text format compatible with other embedding tools
- Inspired by chewxy’s “Data Science in Go” talk
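For readers unfamiliar with the functional-options pattern mentioned in the highlights, here is a minimal sketch. Every name in it (`Config`, `Option`, `WithDim`, `WithWindow`, `NewConfig`) and every default value is hypothetical and chosen for illustration; none of it is wego's actual SDK.

```go
package main

import "fmt"

// Config holds hypothetical training hyperparameters.
type Config struct {
	Dim    int // vector dimensionality
	Window int // context window size
}

// Option mutates a Config; callers pass only the ones they need.
type Option func(*Config)

// WithDim overrides the vector dimensionality.
func WithDim(d int) Option {
	return func(c *Config) { c.Dim = d }
}

// WithWindow overrides the context window size.
func WithWindow(w int) Option {
	return func(c *Config) { c.Window = w }
}

// NewConfig starts from illustrative defaults and applies options in order.
func NewConfig(opts ...Option) Config {
	c := Config{Dim: 100, Window: 5}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := NewConfig(WithDim(50))
	fmt.Printf("%+v\n", c)
}
```

The appeal of this pattern is that new hyperparameters can be added later without changing the constructor's signature or breaking existing callers.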
Caveats
- Training is nondeterministic by design (HogWild! algorithm): lock-free parallel updates race with each other, so a fixed seed alone may not reproduce a run exactly; plan on comparing results across multiple runs
- Input format is strict: space-separated tokens only, no sentence boundaries or preprocessing built in
- A star count of 506 suggests a niche audience; the ecosystem is less mature than Python’s gensim or spaCy
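Since tokenization is left to the caller, raw text usually needs normalizing into space-separated tokens first. The sketch below is one simple approach under stated assumptions: lowercase everything and treat any character that is not a letter or digit as a separator. `normalize` is a hypothetical helper, and real pipelines may need language-specific tokenization.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// normalize lowercases text and splits it on every rune that is not a
// letter or digit, then joins the tokens with single spaces.
// Note that this also splits contractions: "don't" becomes "don t".
func normalize(text string) string {
	tokens := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	return strings.Join(tokens, " ")
}

func main() {
	fmt.Println(normalize("The King, the Queen; and 2 gophers!\nNew line."))
}
```

Newlines are treated as separators too, so the output is a single line, which matches the point above that sentence boundaries are not part of the input format.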
Verdict
Worth a look if you’re building Go-native NLP pipelines and want to avoid Python interop. Skip it if you need production-grade preprocessing, deterministic training, or the broader model zoo that Python ecosystems provide.