Machine learning via abstract algebra, written in Haskell
A research library that re-derives training algorithms from algebraic laws to get parallel, online, and reversible learning "for free."

What it does
HLearn is a Haskell machine learning library built on an unusual premise: many learning algorithms have hidden algebraic structure. The project exposes that structure (monoids, groups, vector spaces) to derive parallel batch training, online updates, “untraining” of data points, and fast cross-validation without rewriting the core algorithm each time. It also claims the fastest nearest-neighbor search for arbitrary metric spaces.
The interesting bit
The real bet is on the interface, not the algorithm catalog. By encoding training as homomorphisms, HLearn gets compositional properties that normally require manual engineering. The History monad is a neat touch: it threads optimization-debugging information through code with zero runtime overhead and no modifications to the original algorithm.
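To make "training as a homomorphism" concrete, here is a minimal sketch in plain Haskell. It does not use HLearn's or SubHask's actual API: `MeanModel`, `train1`, `train`, and `addPoint` are hypothetical names chosen for illustration. The property it demonstrates is that training on a concatenation of datasets equals combining the separately trained models, which is what allows batches to be trained in parallel and points to be added online.

```haskell
-- Hypothetical model (not HLearn's API): sufficient statistics for a mean.
data MeanModel = MeanModel { count :: !Int, total :: !Double }
  deriving (Eq, Show)

-- Combining two models adds their statistics.
instance Semigroup MeanModel where
  MeanModel n1 s1 <> MeanModel n2 s2 = MeanModel (n1 + n2) (s1 + s2)

-- The empty model is the result of training on no data.
instance Monoid MeanModel where
  mempty = MeanModel 0 0

-- Train on a single data point.
train1 :: Double -> MeanModel
train1 x = MeanModel 1 x

-- Batch training is a monoid homomorphism from lists to models:
--   train (xs ++ ys) == train xs <> train ys
train :: [Double] -> MeanModel
train = foldMap train1

-- Online update: fold one more point into an existing model.
addPoint :: MeanModel -> Double -> MeanModel
addPoint m x = m <> train1 x

-- Read the prediction out of the model; undefined for an empty model.
mean :: MeanModel -> Maybe Double
mean (MeanModel 0 _) = Nothing
mean (MeanModel n s) = Just (s / fromIntegral n)
```

Because `<>` is associative, the dataset can be split into chunks, each chunk trained on a separate core, and the results merged in any grouping.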
Key highlights
- Algebraic structures (monoid, group, module, etc.) automatically unlock parallel, online, and reversible training
- History monad for zero-cost debugging instrumentation
- Built atop SubHask, a custom numeric layer for fast array computation in Haskell
- Backed by ICML papers on faster cover trees and algebraic classifiers
- Explicitly research-stage: sparse docs, out-of-date blog posts, and limited algorithm coverage
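The "reversible training" highlight can be sketched the same way. The example below is an illustration under stated assumptions, not HLearn's implementation: the `Group` class, `MeanModel`, `untrain`, and `leaveOneFoldOut` are all hypothetical names (base does not ship a `Group` class). When a model has inverses, a data point can be removed by combining with the inverse of its single-point model, and cross-validation can train each fold once and subtract it from the full model instead of retraining from scratch.

```haskell
-- Hypothetical Group class for illustration; base does not provide one.
class Monoid m => Group m where
  inverse :: m -> m

-- Hypothetical model: count and sum, enough to recover a mean.
data MeanModel = MeanModel !Int !Double
  deriving (Eq, Show)

instance Semigroup MeanModel where
  MeanModel n1 s1 <> MeanModel n2 s2 = MeanModel (n1 + n2) (s1 + s2)

instance Monoid MeanModel where
  mempty = MeanModel 0 0

-- Negating the statistics gives a model that cancels the original.
instance Group MeanModel where
  inverse (MeanModel n s) = MeanModel (negate n) (negate s)

train1 :: Double -> MeanModel
train1 x = MeanModel 1 x

train :: [Double] -> MeanModel
train = foldMap train1

-- "Untrain" a point: combine with the inverse of its single-point model.
untrain :: MeanModel -> Double -> MeanModel
untrain m x = m <> inverse (train1 x)

-- Cross-validation sketch: train each fold once, merge them all, then
-- subtract one fold at a time to get the model trained on the rest.
leaveOneFoldOut :: [[Double]] -> [MeanModel]
leaveOneFoldOut folds = [full <> inverse m | m <- models]
  where
    models = map train folds
    full   = mconcat models
```

With k folds this trains on each data point once rather than k - 1 times, which is the intuition behind the fast cross-validation claim.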
Caveats
- Documentation is sparse and goes out of date quickly; Hackage can’t even build the Haddocks because its build environment uses an older GHC version
- “Does not implement many of the popular machine learning techniques,” by the README’s own admission
- Most blog post code examples are broken against current HEAD
Verdict
Worth studying if you design ML frameworks or care about the algebra-of-learning angle. Skip it if you need a practical, batteries-included toolkit today.