GCP preemptible instances, but make them ML APIs
A Python library that wraps the tedious parts of deploying a model to a cheap, auto-restarting GCP instance with HTTPS and OAuth2.

What it does
BudgetML deploys a single ML model as a FastAPI inference service on Google Cloud Platform preemptible instances—roughly 80% cheaper than standard VMs. It handles SSL certificates, OAuth2 password protection, and auto-restart on preemption, all behind a few lines of Python. You write a Predictor class with load() and predict() methods, pass it to budgetml.launch(), and get a Swagger UI endpoint.
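As a rough illustration of the Predictor shape described above, here is a minimal sketch. The keyword-based "model", the dict-shaped payload, and the returned fields are all hypothetical stand-ins; the exact request object BudgetML passes to predict() is not specified here, so treat this as an assumption rather than the library's contract.

```python
from typing import Any, Dict


class Predictor:
    """Hypothetical predictor: a keyword-counting sentiment stub."""

    def load(self) -> None:
        # Stand-in for loading real model weights; called once at startup.
        self.positive_words = {"good", "great", "love"}
        self.negative_words = {"bad", "awful", "hate"}

    def predict(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Assumed payload shape: {"text": "..."} (not confirmed by the source).
        words = payload["text"].lower().split()
        score = sum(w in self.positive_words for w in words) - sum(
            w in self.negative_words for w in words
        )
        if score > 0:
            label = "positive"
        elif score < 0:
            label = "negative"
        else:
            label = "neutral"
        return {"label": label, "score": score}


if __name__ == "__main__":
    # Local smoke test, no cloud resources involved.
    predictor = Predictor()
    predictor.load()
    print(predictor.predict({"text": "I love this great library"}))
```

Exercising load() and predict() locally like this is a cheap way to catch bugs before handing the class to budgetml.launch(), whose arguments are deliberately left out of the sketch.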
The interesting bit
The project leans into a trade-off most deployment tools pretend doesn’t exist: production-grade orchestration is overkill for a solo model, but serverless functions get expensive and memory-starved at scale. BudgetML’s angle is to embrace the preemptible instance, GCP’s bargain-bin VM that is shut down at least once every 24 hours, and to automate the resurrection loop so downtime stays in the “few minutes” range. It’s a hack dressed up as a library, and it is honest about its limits.
Key highlights
- Automatic FastAPI server and Swagger docs generation from a Python class
- Built-in Let’s Encrypt SSL via docker-swag; no manual certificate wrangling
- OAuth2 password/Bearer auth out of the box
- Auto-restart on preemption, with a claimed “99% uptime” despite forced shutdowns at least once every 24 hours
- Cost comparison in README shows $46–$370/month savings vs. standard e2-highmem instances (pricing as of January 2021)
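To put the “99% uptime” claim in perspective, the arithmetic can be sketched as below. The single 5-minute restart per day is a hypothetical figure chosen to match the “few minutes” range; a preemptible instance can be reclaimed more than once a day, so real numbers may be worse.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def uptime_fraction(downtime_minutes_per_day: float) -> float:
    """Fraction of a day the service is reachable."""
    return 1 - downtime_minutes_per_day / MINUTES_PER_DAY


def downtime_budget_minutes(target_uptime: float) -> float:
    """Minutes of downtime per day permitted by a target uptime."""
    return (1 - target_uptime) * MINUTES_PER_DAY


if __name__ == "__main__":
    # 99% uptime leaves roughly 14.4 minutes of downtime per day.
    print(f"{downtime_budget_minutes(0.99):.1f} min/day allowed at 99% uptime")
    # Assumed scenario: one preemption per day, 5 minutes to come back up.
    print(f"{uptime_fraction(5):.2%} uptime with one 5-minute restart per day")
```

Under these assumptions the budget covers about two or three restarts of that length per day before the 99% figure slips.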
Caveats
- Not actively maintained: README prominently states the authors are looking for a new maintainer
- GCP-only, single-cloud: No AWS or Azure equivalents; you’re buying into Google’s preemptible pricing model
- Not production-grade: Authors explicitly warn that it is not meant for a “full-fledged production-ready setup”; serious workloads are directed to ZenML, their sibling project
- Pricing data is stale: Cost comparison cites January 2021 GCP pricing
Verdict
Good for data scientists who need a cheap, quick HTTPS endpoint for a side project or prototype and already live in GCP. Skip it if you need multi-region reliability, active maintenance, or aren’t comfortable with the preemptible-instance lottery.