DIY genre tagger for your hoarded MP3s
A personal pipeline that turns your own music library into training data, because who trusts crowdsourced metadata anyway?

What it does
Takes your personal MP3 collection, slices the audio into chunks, and trains a neural network to classify genre. The end goal: automatically fill in missing genre tags for untagged songs, using the ones you’ve already labeled by hand as training data. It’s a three-step CLI workflow: slice, train, test.
The interesting bit
The project treats your existing library as ground truth, a clever hack around the usual problem of finding clean, labeled datasets. The author also openly admits the final “label new songs” step is left as an exercise, which is either refreshing honesty or a slightly unfinished weekend project, depending on your generosity.
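To make the "library as ground truth" idea concrete, here is a minimal sketch of how already-tagged files could be separated from the ones still needing a genre. It is not taken from the project: `read_genre`, `split_by_label`, and the `Record` alias are hypothetical names, and the only library call assumed is eyed3's `load` plus the `tag.genre.name` attribute.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical record shape: (file path, genre name or None).
Record = Tuple[str, Optional[str]]


def read_genre(path: str) -> Optional[str]:
    """Return the genre stored in an MP3's ID3 tag, or None if absent."""
    import eyed3  # imported lazily so split_by_label works without eyed3

    audio = eyed3.load(path)
    if audio is None or audio.tag is None or audio.tag.genre is None:
        return None
    return audio.tag.genre.name or None


def split_by_label(records: Iterable[Record]) -> Tuple[List[Record], List[str]]:
    """Split records into (labeled training pairs, paths still untagged)."""
    labeled: List[Record] = []
    unlabeled: List[str] = []
    for path, genre in records:
        if genre and genre.strip():
            # A hand-applied tag is treated as the ground-truth label.
            labeled.append((path, genre.strip()))
        else:
            unlabeled.append(path)
    return labeled, unlabeled
```

The labeled pairs would feed training, while the unlabeled paths are the songs a trained model would later be asked to tag.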
Key highlights
- Built on TensorFlow + tflearn (this was 2016-era tooling, so expect some archaeology)
- Uses sox for audio slicing and eyed3 for MP3 metadata manipulation
- Config-driven: tweak config.py for parameters, model.py for architecture swaps
- Includes a Medium article walking through the full approach
- ~1.1k stars suggests it hit a nerve with developers who’ve stared at their “Unknown Genre” folders in despair
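As an illustration of the sox-based slicing mentioned in the highlights, the sketch below cuts one track into fixed-length pieces with sox's `trim` effect. It is an assumption about how such a step could look, not the project's actual invocation: the function names, the 10-second default, and the output naming scheme are all hypothetical, and sox must be installed with MP3 support for the final function to run.

```python
import subprocess
from typing import List


def slice_offsets(duration_s: float, slice_s: float) -> List[float]:
    """Start times of every full-length slice that fits in the track."""
    if slice_s <= 0:
        raise ValueError("slice_s must be positive")
    count = int(duration_s // slice_s)  # a trailing partial slice is dropped
    return [i * slice_s for i in range(count)]


def sox_trim_command(src: str, dst: str, start_s: float, slice_s: float) -> List[str]:
    """Build a `sox <in> <out> trim <start> <length>` command line."""
    return ["sox", src, dst, "trim", str(start_s), str(slice_s)]


def slice_track(src: str, out_prefix: str, duration_s: float,
                slice_s: float = 10.0) -> List[str]:
    """Write one WAV file per slice and return the output paths."""
    outputs = []
    for i, start in enumerate(slice_offsets(duration_s, slice_s)):
        dst = f"{out_prefix}_{i:03d}.wav"  # hypothetical naming scheme
        subprocess.run(sox_trim_command(src, dst, start, slice_s), check=True)
        outputs.append(dst)
    return outputs
```

Dropping the trailing partial slice keeps every training example the same length, at the cost of ignoring the last few seconds of each song.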
Caveats
- The “label new songs” pipeline is explicitly not implemented — you’ll wire that yourself with eyed3
- Dependencies include tflearn, which is effectively unmaintained; this is not a modern stack
- No mention of accuracy numbers, dataset size requirements, or training time in the README
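For the unimplemented "label new songs" step, one plausible approach is to classify each slice of a song, take a majority vote, and write the winner back with eyed3. The sketch below assumes that design; `vote_genre` and `write_genre` are hypothetical helpers, and the per-slice predictions are assumed to come from whatever model you trained.

```python
from collections import Counter
from typing import Sequence


def vote_genre(slice_predictions: Sequence[str]) -> str:
    """Return the genre predicted for the most slices of one track."""
    if not slice_predictions:
        raise ValueError("need at least one slice prediction")
    return Counter(slice_predictions).most_common(1)[0][0]


def write_genre(path: str, genre: str) -> None:
    """Store the genre in the MP3's ID3 tag using eyed3."""
    import eyed3  # imported lazily so vote_genre works without eyed3

    audio = eyed3.load(path)
    if audio is None:
        raise ValueError(f"not a readable MP3: {path}")
    if audio.tag is None:
        audio.initTag()  # create an empty tag when the file has none
    audio.tag.genre = genre
    audio.tag.save()
```

A call such as `write_genre(path, vote_genre(predictions))` would then tag a single file. Majority voting is only one option; averaging per-slice class probabilities is a common alternative when the model exposes them.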
Verdict
Good for someone who wants to understand a complete (if dated) audio ML pipeline and doesn’t mind doing the last mile of integration. Skip it if you need production-ready inference or current frameworks: this is a learning project with rough edges left deliberately exposed.