JustGlowing/minisom

Self-organizing maps without the framework bloat

A NumPy-only SOM implementation that trades kitchen-sink features for code you can actually read and extend.

1.6k stars · Python · ML Frameworks
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

MiniSom trains Self-Organizing Maps (Kohonen networks), which squish high-dimensional data onto a 2-D grid so you can eyeball clusters and outliers. It takes NumPy matrices, runs standard or batch training, and hands back the grid coordinates of the best-matching neuron for each sample. Optional Numba JIT acceleration is available via train_batch_offline_fast if you install the [fast] extras.
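
To make the idea concrete, here is a minimal NumPy-only sketch of the textbook Kohonen update: pick a sample, find its best-matching neuron, and pull that neuron and its grid neighbors toward the sample. This is not MiniSom's code or API; the function names `winner` and `train_som`, the linear decay schedule, and all default values are assumptions chosen for illustration.

```python
import numpy as np


def winner(weights, x):
    """Grid coordinates (row, col) of the neuron whose weights are closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))


def train_som(data, rows, cols, n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a tiny rectangular SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every neuron, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        # Linear decay is an arbitrary choice for this sketch.
        decay = 1.0 - t / n_iter
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.3)
        # Pick one random sample and locate its best-matching unit (BMU).
        x = data[rng.integers(len(data))]
        bmu = winner(weights, x)
        # Gaussian neighborhood around the BMU, measured on the grid.
        d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-d2 / (2 * sigma**2))
        # Move every neuron toward x, weighted by its neighborhood value.
        weights += lr * h[..., None] * (x - weights)
    return weights


# Two well-separated blobs in 3-D should land on different neurons.
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.05, size=(50, 3))
blob_b = rng.normal(1.0, 0.05, size=(50, 3))
weights = train_som(np.vstack([blob_a, blob_b]), rows=4, cols=4)
print(winner(weights, blob_a[0]), winner(weights, blob_b[0]))
```

The printed pairs are the 2-D grid positions that stand in for each 3-D sample, which is the "hands back neuron coordinates" step in miniature.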

The interesting bit

The whole thing leans on NumPy alone: no TensorFlow, no PyTorch, no scikit-learn dependency tree. That deliberate minimalism is the selling point. The author explicitly targets researchers who need to hack on the algorithm and students who need to see how SOMs actually work under the hood. The code is vectorized throughout, and the project has accumulated 400+ academic citations, so it has escaped the “toy example” gravity well.

Key highlights

  • Single dependency: NumPy (Numba optional for JIT)
  • Supports hexagonal and rectangular topologies
  • Pickle-based model serialization (with a documented lambda caveat)
  • 97.76% test coverage per the repo badge
  • Extensive example gallery: color quantization, handwritten digits, outlier detection, TSP solving
  • Google Colab notebook for immediate experimentation
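
The difference between the two topologies in the list above comes down to where neurons sit on the plane. The sketch below is an illustration, not MiniSom's implementation: `grid_positions` and `count_unit_neighbors` are hypothetical helpers, and the offset-row layout is one common way to build a hexagonal grid. It shows that an interior neuron has four nearest neighbors on a rectangular grid and six on a hexagonal one.

```python
import numpy as np


def grid_positions(rows, cols, topology="rectangular"):
    """Return an array of shape (rows, cols, 2) with (x, y) neuron positions."""
    ys, xs = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    xs = xs.astype(float)
    ys = ys.astype(float)
    if topology == "hexagonal":
        xs[1::2] += 0.5        # shift every other row by half a cell
        ys *= np.sqrt(3) / 2   # compress rows so diagonal neighbors sit at distance 1
    elif topology != "rectangular":
        raise ValueError(f"unknown topology: {topology!r}")
    return np.stack([xs, ys], axis=-1)


def count_unit_neighbors(positions, row, col):
    """Count neurons at Euclidean distance 1 from the neuron at (row, col)."""
    d = np.linalg.norm(positions - positions[row, col], axis=-1)
    return int(np.sum(np.isclose(d, 1.0)))


rect = grid_positions(5, 5, "rectangular")
hexa = grid_positions(5, 5, "hexagonal")
print(count_unit_neighbors(rect, 2, 2))  # 4
print(count_unit_neighbors(hexa, 2, 2))  # 6
```

More equidistant neighbors means the neighborhood function spreads updates more evenly around the winning neuron, which is the usual argument for hexagonal maps.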

Caveats

  • Pickle serialization breaks if you use lambda functions for decay factors
  • The “fast” path requires a separate install and only covers batch offline training
  • No native GPU support; Numba JIT is CPU-only
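
The lambda caveat is a general property of Python's pickle module rather than anything specific to this library: functions are pickled by name, and a lambda has no importable name. The sketch below uses plain dictionaries as hypothetical stand-ins for a model's state; `can_pickle`, `inverse_decay`, and the decay formula are assumptions for illustration only.

```python
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def inverse_decay(value, t, max_iter):
    """Named, module-level decay function (hypothetical example schedule)."""
    return value / (1 + t / (max_iter / 2))


# Hypothetical stand-ins for a trained model's state.
state_with_lambda = {
    "weights": [[0.1, 0.2]],
    "decay": lambda value, t, max_iter: value / (1 + t / (max_iter / 2)),
}
state_with_named = {"weights": [[0.1, 0.2]], "decay": inverse_decay}

print(can_pickle(state_with_lambda))  # False: the lambda cannot be looked up by name
print(can_pickle(state_with_named))
```

When this runs as a script or an importable module, the second state should pickle fine, because `inverse_decay` can be found again by name at load time. The practical workaround is the same: define decay functions with `def` at module level instead of inline lambdas.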

Verdict

Grab this if you need to teach, tweak, or publish with SOMs and want to avoid dragging in a full ML framework. Skip it if you need distributed training, automatic hyperparameter search, or production pipelines that demand sklearn-compatible APIs.
