replicate/keepsake

ML experiment tracking that lets you commit to Git after the fact

Keepsake versions your training runs on S3/GCS so you can stop spreadsheet-juggling and actually reproduce results later.

1.7k stars · Python · LLMOps · Eval · Other AI
Velocity (7d): +0.8 ★/day · Trend: steady

What it does

Keepsake is a Python library that snapshots your training code, hyperparameters, model weights, metrics, and Python dependencies to Amazon S3 or Google Cloud Storage. It takes two calls in your training script: keepsake.init() once before the loop and experiment.checkpoint() wherever you want a snapshot. You query everything back via the CLI or a notebook.
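To make the two-call pattern concrete, here is a minimal sketch. The toy "training" loop, the train() helper, and the model.json file name are illustrative assumptions; the keyword arguments mirror what the project's README shows for keepsake.init() and experiment.checkpoint(), but check them against the version you install. The tracker is passed in as an argument so the sketch runs without Keepsake present.

```python
import json
from pathlib import Path


def train(tracker, weights_path="model.json", num_epochs=5, learning_rate=0.1):
    """Fit w to minimize (w - 3)^2, recording every epoch through `tracker`.

    `tracker` is assumed to expose init() and a checkpoint() method on the
    returned experiment, as the keepsake module is described as doing.
    """
    # Call 1: register the experiment and its hyperparameters.
    experiment = tracker.init(
        path=".",
        params={"learning_rate": learning_rate, "num_epochs": num_epochs},
    )

    w = 0.0
    for epoch in range(num_epochs):
        grad = 2.0 * (w - 3.0)      # derivative of (w - 3)^2
        w -= learning_rate * grad   # plain gradient descent step
        loss = (w - 3.0) ** 2

        # Stand-in for saving real model weights to disk.
        Path(weights_path).write_text(json.dumps({"w": w}))

        # Call 2: snapshot the weights file together with this epoch's metrics.
        experiment.checkpoint(
            path=weights_path,
            step=epoch,
            metrics={"loss": loss},
            primary_metric=("loss", "minimize"),
        )
    return w


# With Keepsake installed and a repository configured, usage would be roughly:
#   import keepsake
#   train(keepsake)
```

Passing the tracker in is a design choice for this sketch only; in a real script you would import keepsake at the top and call it directly.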

The interesting bit

The “commit to Git after the fact” workflow is genuinely clever. You don’t need clean Git history during messy experimentation; Keepsake lets you check out any checkpoint’s exact code and weights once you’ve found something worth keeping. It’s version control with the safety net finally on the right side of the tightrope.

Key highlights

  • Stores everything as plain files on your own S3/GCS bucket — no server to run, no vendor lock-in
  • CLI supports filtering experiments (--filter "val_loss<0.2") and diffing checkpoints down to dependency versions
  • Notebook integration for retrieval, analysis, and plotting — described as a “programmable Tensorboard”
  • Framework-agnostic: works with PyTorch, TensorFlow, scikit-learn, XGBoost, or anything that saves files
  • Can load production models directly from stored experiments with full provenance
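Because the stored data amounts to files plus dictionaries, the filtering and diffing in the highlights above can be pictured with ordinary Python. The sketch below is not Keepsake's implementation: the record layout, parse_filter(), filter_checkpoints(), and diff_dicts() are all hypothetical names, and only simple "metric, operator, number" expressions such as "val_loss<0.2" are handled.

```python
import operator
import re

# Comparison operators accepted by this toy filter syntax.
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}


def parse_filter(expr):
    """Split an expression like 'val_loss<0.2' into (name, compare_fn, threshold)."""
    match = re.fullmatch(r"\s*(\w+)\s*(<=|>=|<|>|=)\s*(-?\d+(?:\.\d+)?)\s*", expr)
    if match is None:
        raise ValueError(f"unsupported filter expression: {expr!r}")
    name, op, value = match.groups()
    return name, OPS[op], float(value)


def filter_checkpoints(checkpoints, expr):
    """Keep the checkpoints whose metric satisfies the expression."""
    name, compare, threshold = parse_filter(expr)
    return [
        c for c in checkpoints
        if name in c["metrics"] and compare(c["metrics"][name], threshold)
    ]


def diff_dicts(left, right):
    """Return {key: (left_value, right_value)} for every key that differs."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Hypothetical records standing in for two stored checkpoints.
checkpoints = [
    {"id": "a1", "metrics": {"val_loss": 0.31}, "deps": {"torch": "1.7.0"}},
    {"id": "b2", "metrics": {"val_loss": 0.18}, "deps": {"torch": "1.7.1"}},
]
good = filter_checkpoints(checkpoints, "val_loss<0.2")
changed = diff_dicts(checkpoints[0]["deps"], checkpoints[1]["deps"])
```

Here good holds only the second record, and changed reports the torch version as the one dependency that differs between the two.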

Caveats

  • Not actively maintained — the README opens with a call for maintainers (issue #873)
  • Cloud storage only (S3 or GCS); no local-first or other backend options are mentioned
  • The “works with everything” claim is technically true but also means it’s just file-and-dict storage — you’re not getting framework-native integrations

Verdict

Worth a look if you’re currently duct-taping shell scripts to track ML experiments and want something open-source with your own storage. Skip it if you need active maintenance guarantees or deeply integrated experiment orchestration — this is a project that needs community help to survive.
