tensorflow/model-analysis

TensorFlow model evaluation at scale, with a side of Jupyter

TFMA lets you slice, dice, and visualize model metrics across massive datasets without rewriting your training evaluation code.

1.3k stars Python LLMOps · EvalML Frameworks

What it does

TensorFlow Model Analysis (TFMA) evaluates TensorFlow models on large datasets using the same metrics you defined during training. It runs distributed via Apache Beam, computes metrics over data slices, and renders results in Jupyter notebooks. Think of it as your training-time evaluation logic, but pointed at held-out data and broken down by feature slices.
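As a rough illustration of what "metrics over data slices" looks like in configuration, here is a minimal sketch using TFMA's `EvalConfig`. The label key `label` and the slicing feature `country` are hypothetical placeholders, and the import is deferred into the function so that defining it does not require TFMA to be installed.

```python
def build_eval_config():
    """Sketch of a TFMA EvalConfig: one metric, computed overall and per slice."""
    # Imported lazily; calling this function requires tensorflow_model_analysis.
    import tensorflow_model_analysis as tfma

    return tfma.EvalConfig(
        # "label" is a hypothetical label column name.
        model_specs=[tfma.ModelSpec(label_key="label")],
        metrics_specs=[
            tfma.MetricsSpec(metrics=[tfma.MetricConfig(class_name="AUC")])
        ],
        slicing_specs=[
            tfma.SlicingSpec(),  # an empty spec covers the whole dataset
            # "country" is a hypothetical feature to slice on.
            tfma.SlicingSpec(feature_keys=["country"]),
        ],
    )
```

The resulting config would typically be passed to an evaluation run; which metrics and slices make sense depends on your model and data.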

The interesting bit

The slicing is where the value hides. A model that looks fine in aggregate can fail badly on specific subgroups, and TFMA surfaces that. It also uses Apache Arrow internally to feed vectorized NumPy operations, a pragmatic choice for performance without leaving the Python ecosystem.
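To make the aggregate-versus-subgroup point concrete, here is a small standard-library sketch. It is not TFMA's API, only an illustration of the idea; the rows and the slice names are made up.

```python
from collections import defaultdict

# Hypothetical evaluation rows: (slice value, true label, predicted label).
ROWS = [
    ("US", 1, 1), ("US", 0, 0), ("US", 1, 1), ("US", 0, 0),
    ("US", 1, 1), ("US", 0, 0), ("US", 1, 1), ("US", 0, 0),
    ("BR", 1, 0), ("BR", 0, 1),
]


def accuracy_by_slice(rows):
    """Return (overall accuracy, {slice value: accuracy})."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for slice_key, label, prediction in rows:
        totals[slice_key] += 1
        hits[slice_key] += int(label == prediction)
    overall = sum(hits.values()) / sum(totals.values())
    per_slice = {key: hits[key] / totals[key] for key in totals}
    return overall, per_slice


overall, per_slice = accuracy_by_slice(ROWS)
print(f"overall: {overall:.2f}")
for key, value in sorted(per_slice.items()):
    print(f"{key}: {value:.2f}")
```

In this toy data the overall accuracy is 0.80, while the small "BR" slice gets every example wrong. That kind of gap is what a sliced view is meant to expose.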

Key highlights

  • Reuses metrics from training; no second implementation to drift out of sync
  • Distributed evaluation via Apache Beam (local by default, Dataflow or other runners optional)
  • Built-in Jupyter/JupyterLab visualization with interactive slicing
  • Can export standalone HTML reports via embed_minimal_html
  • Kubeflow Pipelines integration for embedding visualizations in pipeline UIs
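The HTML export mentioned in the list above can be sketched roughly as follows. This assumes an evaluation result has already been written to disk; the function name, its parameters, and the default file name are hypothetical, and the imports are deferred so the definition itself does not need TFMA or ipywidgets.

```python
def export_slicing_report(eval_output_path, html_path="tfma_export.html"):
    """Sketch: render a saved TFMA result and write it as standalone HTML."""
    # Imported lazily; calling this requires TFMA and ipywidgets.
    import tensorflow_model_analysis as tfma
    from ipywidgets.embed import embed_minimal_html

    # Load a previously computed evaluation result from disk.
    result = tfma.load_eval_result(eval_output_path)
    # Build the interactive slicing-metrics widget.
    view = tfma.view.render_slicing_metrics(result)
    # Serialize the widget into a self-contained HTML file.
    embed_minimal_html(html_path, views=[view], title="Slicing Metrics")
    return html_path
```

Whether the exported widget renders correctly depends on the same package and extension versions that the notebook setup requires.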

Caveats

  • Pre-1.0: the project explicitly warns that backwards-incompatible changes may land
  • Dependency matrix is strict; the README includes a long compatibility table you’ll need to consult
  • JupyterLab setup is finicky: versions must match across pip packages, npm labextensions, and jupyter-widgets
  • TensorFlow must be installed separately; it is not an explicit pip dependency
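Before comparing your environment against the compatibility table, it can help to list what is actually installed. The following standard-library sketch does that; the package list is an assumption chosen for illustration, and the README's table remains the authority on which packages and versions matter.

```python
from importlib import metadata

# Assumed package list for illustration; check the README table for the real set.
PACKAGES = ["tensorflow-model-analysis", "tensorflow", "apache-beam", "pyarrow"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

A `None` for `tensorflow` alongside an installed `tensorflow-model-analysis` would point at the separate-install caveat above.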

Verdict

Worth a look if you’re already in the TFX/TensorFlow ecosystem and need production-scale evaluation with slice-aware debugging. Skip it if you’re using PyTorch or want lightweight, dependency-light model analysis.
