Anomaly detection by asking RNNs to predict the future
A two-stage PyTorch pipeline that flags time-series outliers by measuring how badly an RNN's multi-step predictions go off the rails.

What it does
Trains a stacked RNN to recursively predict the next several steps of a time series, then fits a multivariate Gaussian on the prediction errors to produce anomaly scores. If the model’s forecast drifts far from reality, the timestamp gets flagged. The repo bundles six classic benchmark datasets—NYC taxi rides, ECGs, power demand, respiration, space-shuttle valve readings, and 2D hand gestures—plus scripts to train and evaluate end-to-end.
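To make the two stages concrete, here is a minimal NumPy sketch of the idea, not the repo's code. It assumes a scalar series and swaps the trained RNN for a hypothetical one-step predictor (`persistence`, which just repeats the last value). All function names are illustrative. The score is the squared Mahalanobis distance of each error vector, which ranks points the same way as Gaussian negative log-likelihood. The repo may well assemble its error vectors differently; here, row t simply holds the errors of the forecast launched from time t.

```python
import numpy as np

def recursive_forecast(step_fn, start, horizon):
    """Roll a one-step predictor forward, feeding each output back as input."""
    preds, x = [], start
    for _ in range(horizon):
        x = step_fn(x)
        preds.append(x)
    return np.array(preds)

def forecast_errors(series, step_fn, horizon):
    """Row t holds the errors of the horizon-step forecast launched from series[t]."""
    n = len(series) - horizon
    errs = np.empty((n, horizon))
    for t in range(n):
        preds = recursive_forecast(step_fn, series[t], horizon)
        errs[t] = preds - series[t + 1 : t + 1 + horizon]
    return errs

def fit_gaussian(errs, ridge=1e-6):
    """Estimate mean and covariance of the error vectors (ridge keeps cov invertible)."""
    mean = errs.mean(axis=0)
    cov = np.cov(errs, rowvar=False) + ridge * np.eye(errs.shape[1])
    return mean, cov

def anomaly_scores(errs, mean, cov):
    """Squared Mahalanobis distance of each error vector from the fitted Gaussian."""
    diff = errs - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

# Toy usage: fit on a clean noisy sine, then score a copy with one injected spike.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.05 * rng.standard_normal(1000)
persistence = lambda x: x  # hypothetical stand-in for the trained RNN
horizon = 3

mean, cov = fit_gaussian(forecast_errors(clean, persistence, horizon))

corrupted = clean.copy()
corrupted[500] += 5.0  # synthetic anomaly
scores = anomaly_scores(forecast_errors(corrupted, persistence, horizon), mean, cov)
flagged = int(np.argmax(scores))  # index of the highest-scoring forecast window
```

The highest score lands on one of the forecast windows that touch the spike. In practice the Gaussian would be fitted on held-out normal data and a threshold on the score would decide what gets flagged.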
The interesting bit
The trick is in the training objective. In recursive prediction each output is fed back as the next input, so small errors compound; the author notes that a naively trained model accumulates error and collapses, and therefore trains the model explicitly to be “robust to input noise.” Exactly how this is achieved is left as a series of TODOs in the README, which is either a teaser or an admission that the write-up stalled.
Key highlights
- Two-stage design: prediction first, anomaly scoring second via Gaussian likelihood
- Six ready-to-run datasets from the UCR time-series archive and NYC TLC
- Includes precision/recall/F1 evaluation curves (only ECG results shown in README)
- Bash scripts for batch training and detection across all datasets
- PyTorch 0.4.0+ implementation, Python 3.5+
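For readers unfamiliar with how precision/recall/F1 curves come out of anomaly scores, the following is a small sketch of the usual recipe: threshold the scores, compare against point-wise labels, and sweep the threshold. It is an assumption about the general technique, not the repo's evaluation script, and the function name and toy data are made up for illustration.

```python
import numpy as np

def prf_at_threshold(scores, labels, threshold):
    """Precision, recall and F1 when every score above `threshold` is flagged."""
    pred = scores > threshold
    tp = int(np.sum(pred & labels))    # flagged and truly anomalous
    fp = int(np.sum(pred & ~labels))   # flagged but normal
    fn = int(np.sum(~pred & labels))   # anomalous but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy data (illustrative only): six timestamps, two labelled anomalies.
toy_scores = np.array([0.1, 0.2, 5.0, 0.3, 4.0, 0.2])
toy_labels = np.array([False, False, True, False, False, True])

single = prf_at_threshold(toy_scores, toy_labels, threshold=1.0)

# Sweeping the threshold yields the points of a precision/recall/F1 curve.
thresholds = np.linspace(toy_scores.min(), toy_scores.max(), 5)
curve = [prf_at_threshold(toy_scores, toy_labels, th) for th in thresholds]
```

Because the labels in this repo are unverified, any curve produced this way inherits their errors.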
Caveats
- README has multiple “TODO” sections, including the full explanation of the robust training losses
- Windows 10 support is explicitly broken per an open issue; Ubuntu 16.04+ only
- Dataset labels are unofficial and unverified: the author warns they may contain false positives/negatives and lack domain-expert validation
Verdict
Worth a look if you need a quick, reproducible baseline for time-series anomaly detection on classic benchmarks. Skip it if you need production-grade code, modern PyTorch, or guaranteed-accurate labels.