ebhy/budgetml

GCP preemptible instances, but make them ML APIs

A Python library that wraps the tedious parts of deploying a model to a cheap, auto-restarting GCP instance with HTTPS and OAuth2.

What it does

BudgetML deploys a single ML model as a FastAPI inference service on Google Cloud Platform preemptible instances, which are roughly 80% cheaper than standard VMs. It handles SSL certificates, OAuth2 password protection, and auto-restart on preemption, all behind a few lines of Python. You write a Predictor class with load() and predict() methods, pass it to budgetml.launch(), and get an HTTPS endpoint with Swagger UI docs.
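As a rough illustration of the Predictor contract described above, the sketch below defines a class with load() and predict() and exercises it locally, without any cloud deployment. The keyword-counting "model", the plain-dict payload, and the returned fields are assumptions made for illustration; the real request object and the arguments to launch() are defined by BudgetML and are not shown here.

```python
class Predictor:
    """Toy predictor following the load()/predict() shape described above."""

    def load(self):
        # Stand-in for loading real model weights (assumption: a keyword set).
        self.positive_words = {"good", "great", "cheap", "fast"}

    def predict(self, payload):
        # Assumption: payload is a plain dict with a "text" field.
        words = payload["text"].lower().split()
        hits = sum(1 for word in words if word in self.positive_words)
        label = "POSITIVE" if hits > 0 else "NEGATIVE"
        return {"label": label, "hits": hits}


# Local smoke test: load once, then predict on a sample payload.
predictor = Predictor()
predictor.load()
result = predictor.predict({"text": "Cheap and fast deployment"})
print(result)
```

Keeping load() separate from predict() means the expensive setup runs once per instance start, which matters on a VM that is expected to restart regularly.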

The interesting bit

The project leans into the trade-off most deployment tools pretend doesn’t exist: production-grade orchestration is overkill for a solo model, but serverless functions get expensive and memory-starved at scale. BudgetML’s angle is to embrace the preemptible instance (GCP’s bargain-bin VM, which can be reclaimed at any time and is always stopped within 24 hours) and automate the resurrection loop so downtime stays in the “few minutes” range. It’s a hack dressed up as a library, and it is honest about its limits.
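The resurrection loop can be pictured as a simple watchdog: check the instance state, and start it again whenever it is found stopped. The sketch below is a generic illustration, not BudgetML's actual mechanism. The get_status and start_instance callables and the FakeInstance class are hypothetical stand-ins so the loop can run without cloud access; the only GCP-specific detail is that a preempted VM ends up in the TERMINATED state.

```python
import time


def keep_alive(get_status, start_instance, checks, poll_seconds=0.0):
    """Poll an instance and restart it whenever it is found stopped.

    get_status and start_instance are hypothetical callables supplied by the
    caller; they are not part of BudgetML or any GCP client library.
    """
    restarts = 0
    for _ in range(checks):
        if get_status() == "TERMINATED":  # state a preempted GCP VM ends up in
            start_instance()
            restarts += 1
        time.sleep(poll_seconds)
    return restarts


class FakeInstance:
    """In-memory stand-in for a VM, used only to exercise the loop."""

    def __init__(self, statuses):
        self.statuses = list(statuses)
        self.started = 0

    def status(self):
        # Return the next scripted status.
        return self.statuses.pop(0)

    def start(self):
        self.started += 1


vm = FakeInstance(["RUNNING", "TERMINATED", "RUNNING", "TERMINATED"])
restart_count = keep_alive(vm.status, vm.start, checks=4)
print(restart_count)
```

In a real setup the polling interval bounds how long an outage can go unnoticed, so it trades API calls against downtime.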

Key highlights

  • Automatic FastAPI server and Swagger docs generation from a Python class
  • Built-in Let’s Encrypt SSL via docker-swag; no manual certificate wrangling
  • OAuth2 password/Bearer auth out of the box
  • Auto-restart on preemption, with a claimed “99% uptime” despite forced shutdowns roughly every 24 hours
  • Cost comparison in README shows $46–$370/month savings vs. standard e2-highmem instances (pricing as of January 2021)
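The 99% uptime figure in the list above can be sanity-checked with simple arithmetic. The sketch assumes that 99% is a fraction of wall-clock time and that there is exactly one forced shutdown per 24-hour window; neither assumption comes from the README.

```python
def daily_downtime_minutes(uptime_fraction, hours_per_day=24):
    """Minutes of downtime per day allowed by a given uptime fraction."""
    return (1 - uptime_fraction) * hours_per_day * 60


# Downtime budget implied by 99% uptime over one day.
budget = daily_downtime_minutes(0.99)
print(round(budget, 1))  # 14.4
```

Under those assumptions, a single daily restart has to complete in about 14 minutes to stay within the claim, which is consistent with downtime in the “few minutes” range.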

Caveats

  • Not actively maintained: README prominently states the authors are looking for a new maintainer
  • GCP-only, single-cloud: No AWS or Azure equivalents; you’re buying into Google’s preemptible pricing model
  • Not production-grade: Authors explicitly warn against treating it as a “full-fledged production-ready setup”; serious workloads are directed to ZenML, their sibling project
  • Pricing data is stale: Cost comparison cites January 2021 GCP pricing

Verdict

Good for data scientists who need a cheap, quick HTTPS endpoint for a side project or prototype and already live in GCP. Skip it if you need multi-region reliability or active maintenance, or if you aren’t comfortable with the preemptible-instance lottery.
