sepandhaghighi/pycm

A confusion matrix that actually explains itself

PyCM computes dozens of classification metrics automatically, so you don't have to remember which F-score is which.

1.5k stars Python LLMOps · Eval

What it does

PyCM takes your actual and predicted class labels (or a raw confusion matrix) and produces a comprehensive statistical report: per-class statistics such as accuracy, F1, and AUC, overall statistics such as Kappa, and a pile of other metrics most people look up on Wikipedia. It also prints normalized matrices and can relabel classes, apply thresholds, or load saved matrices from disk.
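To make the vector-in, statistics-out idea concrete, here is a minimal pure-Python sketch of two of those metrics, per-class F1 and Cohen's Kappa. It is not PyCM's code or API: the function names `confusion_counts`, `per_class_f1`, and `cohen_kappa` are hypothetical, and the label vectors are made-up sample data.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count (actual, predicted) label pairs. Hypothetical helper, not PyCM's API."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    classes = sorted(set(actual) | set(predicted))
    return classes, Counter(zip(actual, predicted))


def per_class_f1(classes, counts):
    """One-vs-rest F1 per class: 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in classes:
        tp = counts[(c, c)]
        fp = sum(counts[(a, c)] for a in classes if a != c)  # predicted c, was not c
        fn = sum(counts[(c, p)] for p in classes if p != c)  # was c, predicted not c
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def cohen_kappa(classes, counts):
    """Cohen's Kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = sum(counts.values())
    observed = sum(counts[(c, c)] for c in classes) / n
    chance = sum(
        sum(counts[(c, p)] for p in classes) * sum(counts[(a, c)] for a in classes)
        for c in classes
    ) / (n * n)
    if chance == 1:
        return 0.0  # degenerate case: only one class present on both sides
    return (observed - chance) / (1 - chance)


# Made-up sample data: three classes, twelve observations.
actual = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
predicted = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]

classes, counts = confusion_counts(actual, predicted)
f1 = per_class_f1(classes, counts)
kappa = cohen_kappa(classes, counts)
print(f1)     # F1 of 0.75 for class 0, 0.4 for class 1, about 0.545 for class 2
print(kappa)  # about 0.355
```

Two metrics already take a screenful of code, which is roughly the argument for letting a library compute the rest.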

The interesting bit

The library doesn’t just dump numbers. It also includes interpretation helpers such as “AUCI”, which maps an AUC value to a verbal label (Very Good, Fair, Poor, and so on), and the Landis & Koch strength-of-agreement scale for Kappa. It’s the kind of pedantic thoroughness your peer reviewers will appreciate.
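An interpretation helper of this kind is essentially a lookup over value bands. The sketch below maps a Kappa value to the Landis & Koch (1977) labels. The function name `kappa_agreement` is hypothetical, and how exact boundary values such as 0.20 are assigned is an assumption here, so PyCM's own helper may differ at the edges.

```python
def kappa_agreement(kappa):
    """Map Cohen's Kappa to a Landis & Koch (1977) strength-of-agreement label.

    Hypothetical helper for illustration; boundary handling is an assumption.
    """
    if kappa < 0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost Perfect"


for value in (-0.1, 0.35, 0.9):
    print(value, kappa_agreement(value))
```

A label like "Fair" is easier to put in a report than a bare 0.35, though the bands themselves are a convention rather than a statistical test.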

Key highlights

  • Accepts both vectors and direct matrix input, plus sample weights and activation thresholds
  • Supports saving/loading via .obj files and can transpose direct matrices
  • Can be called from MATLAB through its Python interpreter bridge (yes, really)
  • Available on PyPI, Conda, and as source; current version is 4.6
  • Plotting requires Matplotlib ≥3.0.0 or Seaborn ≥0.9.1
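For a sense of what direct matrix input, transposing, and normalizing involve, here is a small pure-Python sketch. It assumes the matrix is a nested dict keyed by actual class, then predicted class, with the same class keys in every row. The helper names `transpose_matrix` and `normalize_rows` and the cat/dog counts are made up for illustration and are not PyCM's API.

```python
def transpose_matrix(matrix):
    """Swap the actual and predicted axes of a nested-dict confusion matrix."""
    classes = list(matrix)
    return {p: {a: matrix[a][p] for a in classes} for p in classes}


def normalize_rows(matrix):
    """Divide each row by its total, so each actual class sums to 1."""
    normalized = {}
    for actual, row in matrix.items():
        total = sum(row.values())
        normalized[actual] = {
            pred: (count / total if total else 0.0) for pred, count in row.items()
        }
    return normalized


# Assumed layout: matrix[actual][predicted] = count (made-up numbers).
matrix = {
    "cat": {"cat": 3, "dog": 1},
    "dog": {"cat": 2, "dog": 4},
}

flipped = transpose_matrix(matrix)
normalized = normalize_rows(matrix)
print(flipped)     # {'cat': {'cat': 3, 'dog': 2}, 'dog': {'cat': 1, 'dog': 4}}
print(normalized)  # the 'cat' row becomes 0.75 and 0.25
```

Transposing matters when a matrix was recorded with predictions as rows: the counts are the same, but every per-class metric changes if the axes are read the wrong way round.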

Caveats

  • Python 3.6 support ended at version 4.3; you’ll need 3.7+ for current releases
  • Plotting is an optional dependency, not built-in

Verdict

Grab this if you evaluate classifiers regularly and are tired of reimplementing Cohen’s Kappa. Skip it if you’re already happy with scikit-learn’s classification report and don’t need the extra statistical depth.
