allenai/tango

A workflow engine that admits research is messy

AI2 Tango caches experiment steps so you don't re-run what hasn't changed, without pretending your code is stable enough for production DAG tools.

571 stars · Python · LLMOps · Eval · Other AI
Velocity (7d): +0.3 stars/day · Trend: steady

What it does

Tango is a Python experiment runner from AI2 that breaks research code into steps, written either as functions decorated with @step() or as Step subclasses. It caches each step's output under a key derived from a hash of the step's inputs plus a manually bumped VERSION string. You define steps in Python, wire them together in Jsonnet config files, and run them from the CLI. When a step and its inputs are unchanged, the next run pulls the result from cache instead of re-executing.
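To make the caching idea concrete, here is a minimal standard-library sketch of a cache keyed by step name, version string, and inputs. It is not Tango's implementation or API: cached_step, cache_key, and the in-memory _CACHE are hypothetical names invented for illustration, and Tango persists results in a workspace rather than in a dict.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory cache, used only for this illustration.
_CACHE: Dict[str, Any] = {}


def cache_key(step_name: str, version: str, inputs: Dict[str, Any]) -> str:
    """Build a key from the step name, its version string, and its inputs."""
    payload = json.dumps(
        {"step": step_name, "version": version, "inputs": inputs},
        sort_keys=True,  # stable ordering, so equal inputs give equal keys
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_step(version: str) -> Callable:
    """Hypothetical decorator: re-run only when the inputs or version change."""

    def decorator(fn: Callable) -> Callable:
        def wrapper(**inputs: Any) -> Any:
            key = cache_key(fn.__name__, version, inputs)
            if key not in _CACHE:
                _CACHE[key] = fn(**inputs)  # cache miss: execute the step
            return _CACHE[key]

        return wrapper

    return decorator


# Counts real executions so cache hits are observable.
CALLS = {"tokenize": 0}


@cached_step(version="001")
def tokenize(text: str) -> list:
    CALLS["tokenize"] += 1
    return text.split()


tokenize(text="research is messy")  # executes the function body
tokenize(text="research is messy")  # same inputs and version: cache hit
tokenize(text="research is tidy")   # new input: executes again
```

After these three calls the function body has run twice. Changing the version string passed to the decorator would change every key for that step and force a re-run, which is the behaviour the VERSION bump is meant to provide.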

The interesting bit

The design explicitly rejects source-code hashing as too brittle for research code that changes constantly. Instead, you manually increment a VERSION class variable when a step’s logic actually changes. It’s a deliberate trade-off: less magic, more transparency, and a tacit admission that your preprocessing code will be rewritten seventeen times before publication.
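The trade-off can be sketched in a few lines. The example below assumes a naive source-hashing scheme (source_key) and a naive version-based scheme (version_key); both helpers and the clean step are hypothetical and do not reflect how Tango computes its keys.

```python
import hashlib


def source_key(source: str) -> str:
    """Naive source hashing: any byte change produces a new key."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def version_key(step_name: str, version: str) -> str:
    """Version-based key: changes only when the author bumps the version."""
    return hashlib.sha256(f"{step_name}:{version}".encode("utf-8")).hexdigest()


# Two versions of the same hypothetical step; only a comment was added.
ORIGINAL = "def clean(text):\n    return text.strip().lower()\n"
COMMENTED = (
    "def clean(text):\n"
    "    # normalise whitespace and case\n"
    "    return text.strip().lower()\n"
)

# A cosmetic edit invalidates a source-hash cache...
source_keys_match = source_key(ORIGINAL) == source_key(COMMENTED)  # False

# ...but leaves a version-based key untouched until the author bumps it.
version_keys_match = version_key("clean", "001") == version_key("clean", "001")  # True
bumped_key_differs = version_key("clean", "002") != version_key("clean", "001")  # True
```

The cost of the manual scheme is the mirror image: if the logic changes and nobody bumps the version, the cache keeps serving the old result.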

Key highlights

  • Cache keys are derived from step inputs and a user-managed VERSION string, not source-code bytes
  • Steps are plain Python functions with a decorator; configs are Jsonnet, not YAML soup
  • Integrations ship separately (torch, wandb, datasets) so you install only what you need
  • Prebuilt Docker images with CUDA variants for GPU workflows
  • CLI includes a tango info diagnostic and plays nicely with pdb

Caveats

  • The README’s quick-start example is trivial; real-world step composition and dependency handling are only covered in external docs
  • Jsonnet as the config format adds a learning curve if your team is all-in on YAML or Python-native configs
  • 571 stars suggest modest adoption; ecosystem maturity relative to Metaflow or Airflow is unclear

Verdict

Worth a look if you’re doing collaborative ML research where code churns daily and production workflow engines feel like overkill. Skip it if you need production monitoring, dynamic task scheduling, or already have a caching layer you trust.
