kubeflow/katib

AutoML that speaks fluent Kubernetes

Katib runs hyperparameter sweeps and neural architecture search as native Kubernetes jobs, so your cluster schedules the science instead of your laptop.

1.7k stars · Python · ML Frameworks · LLMOps · Eval
Velocity (7d): +0.6 ★/day · Trend: steady

What it does

Katib is a Kubernetes-native AutoML system that automates hyperparameter tuning, early stopping, and neural architecture search. It launches training trials as Kubernetes resources (anything from plain pods to Argo Workflows or Tekton Pipelines) and collects their results to drive the next round of experiments. A Python SDK (pip install kubeflow-katib) lets data scientists define search spaces without writing YAML by hand.
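To make "search space" concrete, here is a minimal sketch of the kind of Experiment manifest the SDK spares you from writing by hand, built as a plain Python dict. The field names follow the v1beta1 Experiment resource as best understood here; the experiment name, parameter names, and metric name are hypothetical, and the trial template is left out, so this is an illustration rather than something ready to submit.

```python
import json


def double_param(name, low, high):
    """Describe one continuous hyperparameter in the v1beta1 parameter format."""
    return {
        "name": name,
        "parameterType": "double",
        # Numeric bounds are carried as strings in the manifest.
        "feasibleSpace": {"min": str(low), "max": str(high)},
    }


def build_experiment(name, parameters, metric, algorithm="random", max_trials=12):
    """Assemble a partial Experiment manifest; trialTemplate is intentionally omitted."""
    return {
        "apiVersion": "kubeflow.org/v1beta1",
        "kind": "Experiment",
        "metadata": {"name": name},
        "spec": {
            "objective": {"type": "maximize", "objectiveMetricName": metric},
            "algorithm": {"algorithmName": algorithm},
            "maxTrialCount": max_trials,
            "parallelTrialCount": 3,
            "parameters": parameters,
        },
    }


experiment = build_experiment(
    name="lr-sweep",  # hypothetical experiment name
    parameters=[double_param("lr", 0.001, 0.1), double_param("momentum", 0.5, 0.99)],
    metric="accuracy",  # hypothetical metric name
)
print(json.dumps(experiment, indent=2))
```

A real experiment would also need a trial template describing the training job to run for each parameter set; check the official documentation for the exact schema before relying on any field shown here.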

The interesting bit

The project is deliberately framework-agnostic: it can tune code written in any language, and it delegates the actual optimization brains to established libraries like Optuna, Hyperopt, and Goptuna rather than reinventing them. That makes it less a new algorithm shop and more an orchestration layer for AutoML at cluster scale.
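The split between "optimizer" and "orchestrator" can be sketched in a few lines. In this toy version, a suggester proposes parameter sets and a loop runs them and reports scores back. Everything here is hypothetical: `RandomSuggester` is a stand-in for a real library such as Optuna, `toy_objective` replaces an actual training job, and none of this is Katib's own code.

```python
import random


class RandomSuggester:
    """Stand-in for a real optimizer; samples uniformly from each range."""

    def __init__(self, space, seed=0):
        self.space = space  # {name: (low, high)}
        self.rng = random.Random(seed)
        self.history = []  # (params, score) pairs reported back

    def suggest(self, count):
        """Propose `count` parameter sets for the next round of trials."""
        return [
            {name: self.rng.uniform(low, high) for name, (low, high) in self.space.items()}
            for _ in range(count)
        ]

    def report(self, params, score):
        """Record a finished trial so later suggestions could use it."""
        self.history.append((params, score))


def toy_objective(params):
    """Hypothetical objective with its peak at lr = 0.01 (higher is better)."""
    return -abs(params["lr"] - 0.01)


def run_experiment(suggester, objective, rounds=4, parallel=3):
    """Orchestration loop: ask for trials, run them, feed results back."""
    for _ in range(rounds):
        for params in suggester.suggest(parallel):
            suggester.report(params, objective(params))
    return max(suggester.history, key=lambda item: item[1])


suggester = RandomSuggester({"lr": (0.001, 0.1)})
best_params, best_score = run_experiment(suggester, toy_objective)
print(best_params, best_score)
```

Swapping in a smarter optimizer only means replacing the suggester; the loop that schedules trials and collects results stays the same, which is the part Katib moves onto the cluster.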

Key highlights

  • Supports 10+ search algorithms including Bayesian optimization, TPE, CMA-ES, HyperBand, and Population Based Training
  • Two neural architecture search methods built in: ENAS and DARTS
  • Early stopping via the median stopping rule, which cuts off unpromising trials before they finish
  • Pluggable trial backends: works with Kubeflow Training Operator, Argo, Tekton, or any Kubernetes Custom Resource
  • Standalone install possible with a single kubectl apply -k command; no full Kubeflow stack required
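For readers unfamiliar with the median stopping rule mentioned in the highlights, here is a minimal sketch of the rule as it is commonly described: stop a trial if its best result so far is worse than the median of the running averages that earlier trials had reached by the same step. The function names, the `min_trials` threshold, and the sample numbers are assumptions for illustration; this is not Katib's implementation.

```python
from statistics import median


def running_average(values, step):
    """Mean of the objective values reported during the first `step` steps."""
    window = values[:step]
    return sum(window) / len(window)


def should_stop(trial_values, completed_trials, step, min_trials=3):
    """Return True if a trial (higher is better) should be stopped at `step`.

    `step` is 1-based and `trial_values` must hold at least one value.
    """
    # Only compare against trials that actually reached this step.
    comparable = [t for t in completed_trials if len(t) >= step]
    if len(comparable) < min_trials:
        return False  # too little evidence to judge
    threshold = median(running_average(t, step) for t in comparable)
    return max(trial_values[:step]) < threshold


# Hypothetical per-step accuracy curves from three finished trials.
completed = [[0.5, 0.6, 0.7], [0.4, 0.5, 0.6], [0.6, 0.7, 0.8]]

print(should_stop([0.2, 0.3], completed, step=2))  # lagging trial
print(should_stop([0.5, 0.7], completed, step=2))  # competitive trial
```

The `min_trials` guard keeps the rule from firing before there is enough history to form a meaningful median.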

Caveats

  • The README is heavy on “follow the official documentation” links; concrete setup details live outside the repo
  • Early stopping coverage looks thin: only the median stopping rule is listed in the README, and the rest of that column is left blank

Verdict

Worth a look if you’re already running Kubernetes and want to move hyperparameter search off your workstation into scheduled, fault-tolerant cluster jobs. Skip it if you’re looking for a lightweight local AutoML tool or need extensive early stopping strategies out of the box.
