igrigorik/decisiontree

Ruby decision trees: old-school ML without the PyTorch bloat

A no-dependency Ruby library for ID3 decision tree learning that handles both discrete and continuous data.

1.5k stars · Ruby · ML Frameworks · Velocity (7d): +0.2 ★ / day · Trend: steady

What it does

Implements the classic ID3 algorithm for building decision trees, with support for both discrete categories and continuous numeric thresholds. You train it on labeled data, then ask it to classify new inputs. It also includes a Ruleset class that does train/test splitting and rule pruning in the style of C4.5, plus a Bagging wrapper that trains 10 rulesets and votes on the outcome.
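
To make the voting step concrete, here is a minimal sketch of majority voting over several trained classifiers. It is not the library's Bagging code: `majority_vote` is a hypothetical helper, and the lambdas are stand-ins for trained rulesets.

```ruby
# Majority vote over the predictions of several classifiers.
# `classifiers` is any list of objects responding to #call(sample);
# here they are hypothetical stand-ins for trained rulesets.
def majority_vote(classifiers, sample)
  votes = classifiers.map { |classifier| classifier.call(sample) }
  # tally counts each predicted label; max_by picks the most frequent one.
  # On a tie, the label that was predicted first wins.
  votes.tally.max_by { |_label, count| count }.first
end

# Three toy classifiers that disagree near the decision boundary.
classifiers = [19, 20, 21].map do |cutoff|
  ->(temperature) { temperature > cutoff ? 'yes' : 'no' }
end

puts majority_vote(classifiers, 20.5) # two of three vote 'yes'
```

With 10 voters, as in the library's wrapper, a single badly trained ruleset is outvoted by the rest, which is the point of bagging.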

The interesting bit

The continuous variable handling is the clever part: instead of pre-binning your data, it evaluates every possible threshold between observed values and picks the split with maximum information gain. This gives you proper binary trees partitioned at precise cut points (e.g., temperature > 20°C) rather than arbitrary buckets. There’s also built-in Graphviz export for visualizing the learned tree structure.
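
The threshold search described above can be sketched in a few lines of plain Ruby. This is an illustration of the technique, not the library's source: `entropy` and `best_threshold` are hypothetical names, and using midpoints between adjacent distinct values as candidates, with `<=` as the split test, is an assumption.

```ruby
# Shannon entropy (in bits) of a list of class labels.
def entropy(labels)
  total = labels.size.to_f
  labels.tally.values.sum do |count|
    p = count / total
    -p * Math.log2(p)
  end
end

# Find the cut point on one numeric attribute with maximum information gain.
# rows: array of [numeric_value, label] pairs.
# Returns [threshold, gain], or nil when all values are identical.
def best_threshold(rows)
  base = entropy(rows.map(&:last))
  values = rows.map(&:first).uniq.sort
  # Candidate thresholds: midpoints between adjacent distinct values (assumed).
  candidates = values.each_cons(2).map { |a, b| (a + b) / 2.0 }

  scored = candidates.map do |threshold|
    left, right = rows.partition { |value, _label| value <= threshold }
    # Entropy after the split, weighted by the size of each side.
    weighted = [left, right].sum do |part|
      part.size.to_f / rows.size * entropy(part.map(&:last))
    end
    [threshold, base - weighted]
  end

  scored.max_by { |_threshold, gain| gain }
end

rows = [[15, 'no'], [18, 'no'], [22, 'yes'], [25, 'yes']]
threshold, gain = best_threshold(rows)
puts "split at #{threshold}, gain #{gain}" # split at 20.0, gain 1.0
```

Here the cut at 20.0 separates the two classes perfectly, so the gain equals the full starting entropy of 1 bit. This naive version re-partitions the rows for every candidate, which is fine for small datasets.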

Key highlights

  • Pure Ruby, no external ML dependencies
  • Handles mixed discrete/continuous attributes in the same tree
  • Graphviz visualization for discrete models
  • Ruleset pruning and bagging ensemble methods included
  • Falls back to a default value when input doesn’t match any branch

Caveats

  • README is sparse on performance characteristics; no benchmarks or complexity notes
  • Graphviz integration relies on an external graphr subproject that appears unmaintained (SourceForge link)
  • The bagging implementation trains exactly 10 rulesets with no apparent configurability

Verdict

Useful if you’re doing lightweight classification in a Ruby codebase and don’t want to pull in Python or a heavy framework. Skip it if you need modern algorithms (no random forests, no gradient boosting) or if you’re working at scale: this is textbook 1986 Quinlan, not production ML infrastructure.
