YatingMusic/remi

Teaching Transformers to Feel the Beat

A MIDI token format that gives language models a sense of musical time, so they can generate structured pop piano instead of aimless note soup.

Star velocity (7 days): +0.3 ★ / day · Trend: steady

What it does

REMI (REvamped MIDI-derived events) is a token representation that turns MIDI scores into discrete sequences with explicit metrical structure: bars, beat positions, tempo changes, and chord labels. The authors train a Transformer-XL on this format to generate minute-long pop piano pieces with coherent rhythm and harmony, no post-processing required. You can prompt it with a MIDI file for continuation, or generate from scratch with control over local tempo and chord progression.

The interesting bit

The cleverness is in the encoding, not the architecture. Standard MIDI-like tokenizations emit note-on, note-off, and time-shift events as a flat stream, leaving the model to infer the meter on its own. REMI instead places every note on a bar-and-position grid using explicit Bar and Position tokens, so the model learns when things happen, not just what. It’s a data-format hack that solves a musical structure problem without touching the Transformer itself.
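To make the encoding idea concrete, here is a minimal sketch of a REMI-style tokenizer. It is not the repository's code: the `Note` type, the token spellings, the 480 ticks-per-beat resolution, the fixed 4/4 meter, and the 16-step grid are all assumptions chosen for illustration, and tempo and chord tokens are left out.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record; times are in MIDI ticks."""
    start: int
    duration: int
    pitch: int
    velocity: int


# Illustrative assumptions, not values taken from the repository.
TICKS_PER_BEAT = 480
BEATS_PER_BAR = 4            # fixed 4/4 meter
POSITIONS_PER_BAR = 16       # sixteenth-note grid
TICKS_PER_POSITION = TICKS_PER_BEAT * BEATS_PER_BAR // POSITIONS_PER_BAR


def notes_to_tokens(notes: List[Note]) -> List[str]:
    """Turn notes into a flat token list with explicit bar and position markers."""
    tokens: List[str] = []
    current_bar = -1
    half = TICKS_PER_POSITION // 2
    for note in sorted(notes, key=lambda n: (n.start, n.pitch)):
        # Snap the onset to the nearest grid step, then split into bar and position.
        step = (note.start + half) // TICKS_PER_POSITION
        bar, position = divmod(step, POSITIONS_PER_BAR)
        # Emit one Bar token per bar, including empty bars, so bar lines stay explicit.
        while current_bar < bar:
            tokens.append("Bar")
            current_bar += 1
        # Express the duration in grid steps, never shorter than one step.
        length = max(1, (note.duration + half) // TICKS_PER_POSITION)
        tokens.extend([
            f"Position_{position + 1}/{POSITIONS_PER_BAR}",
            f"Velocity_{note.velocity}",
            f"NoteOn_{note.pitch}",
            f"Duration_{length}",
        ])
    return tokens
```

Under these assumptions, a quarter-note middle C at the very start of the piece, `Note(0, 480, 60, 100)`, becomes `Bar, Position_1/16, Velocity_100, NoteOn_60, Duration_4`. The point of the design is that the model sees where each note sits in the bar directly, rather than having to sum time shifts to work it out.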

Key highlights

  • Two pre-trained checkpoints available (~430 MB each): one with tempo control, one with tempo + chord conditioning
  • 775 training MIDI files and 100 evaluation prompts provided for continuation experiments
  • Interactive web demo exists (built by a contributor, not the authors)
  • Sampling parameters (temperature, topk) are exposed and acknowledged as critical to output quality
  • midi2remi.ipynb shows the conversion pipeline
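The two sampling parameters mentioned above are easiest to understand in code. The following is a generic sketch of temperature plus top-k sampling over a list of logits; the function name `sample_top_k` and its signature are hypothetical, and the repository's own implementation may differ in detail.

```python
import math
import random
from typing import List, Optional


def sample_top_k(
    logits: List[float],
    temperature: float = 1.0,
    top_k: int = 5,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample a token index from the top_k most likely entries of `logits`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    rng = rng or random.Random()

    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = [x / temperature for x in logits]

    # Keep only the indices of the top_k highest scaled logits.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the survivors, subtracting the peak for numerical stability.
    peak = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - peak) for i in top]

    return rng.choices(top, weights=weights, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding, which tends to loop; a larger `top_k` or a higher `temperature` trades coherence for variety, which is why these settings matter so much for output quality.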

Caveats

  • Locked to TensorFlow 1.14.0, which is well past end-of-life; getting this running on modern CUDA is archaeology
  • Audio synthesis is explicitly punted to external DAWs or FluidSynth, with a known bug around tempo changes
  • Fine-tuning on personal data is possible but undocumented beyond a GitHub issue thread

Verdict

Worth a look if you’re researching symbolic music generation or need a baseline for pop piano generation with structural control. Skip it if you want a maintained, modern framework: this is a 2020 research artifact with 2020 dependencies.
