mlachmish/MusicGenreClassification

A grad student, 1% of a million songs, and a CNN walk into a bar

A 2016 Tel Aviv University project that swaps Tao Feng's RBM for a TensorFlow CNN and scrapes 30-second previews to classify ten music genres.

Velocity (7d): +0.2 ★ / day · Trend: steady

What it does

Trains a convolutional neural network to classify 10-second audio clips into one of ten genres (blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock). The pipeline downloads 30-second song previews via the 7Digital API, converts them to mel spectrograms with librosa, and feeds those to a CNN with three hidden layers, max pooling, and a softmax output.
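To make the mel step concrete, here is a minimal standard-library sketch of how frequencies map onto the mel scale and how many spectrogram frames a 30-second preview yields. It is not the repo's code: the HTK-style formula, the 22,050 Hz sample rate, the hop length of 512, and every function name are assumptions chosen for illustration (librosa offers more than one mel variant, and the project's actual settings are not stated here).

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """Convert Hz to mels using the HTK-style formula (an assumed variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Centre frequencies (Hz) of n_bands filters spaced evenly in mels."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]


def frame_count(n_samples: int, hop_length: int) -> int:
    """Frames from a centred short-time transform: one per hop, plus the first."""
    return 1 + n_samples // hop_length


if __name__ == "__main__":
    # Hypothetical settings: 8 bands up to the Nyquist frequency of 22,050 Hz audio.
    centers = mel_band_centers(n_bands=8, f_min=0.0, f_max=11025.0)
    print([round(c) for c in centers])
    # A 30-second clip at 22,050 Hz with a hop of 512 samples.
    print(frame_count(30 * 22050, 512))
```

The band centres crowd together at low frequencies and spread out at high ones, which is the point of the mel scale: resolution goes where human pitch perception is finest. The resulting bands-by-frames grid is what a CNN can treat as an image.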

The interesting bit

The author couldn’t find a clean labeled dataset, so he assembled one: the Million Song Dataset provides metadata and 7Digital IDs, and 7Digital’s “preview before you buy” feature becomes a free source of audio. He also found that raw mel-frequency features (step 2 of the MFCC pipeline) gave, in his words, “extremely better” results than full MFCCs, at the cost of longer training. t-SNE visualizations showing cleaner genre clustering back the finding up.
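One way to see why stopping at the mel stage can help: in the usual textbook pipeline, MFCCs come from taking the log of the mel-band energies, applying a discrete cosine transform, and keeping only the first few coefficients. The sketch below shows that last step in plain Python. It is an illustration under assumptions (an unnormalized DCT-II, a small epsilon before the log, hypothetical function names and input values), not the repo's preprocessing code.

```python
import math
from typing import List


def dct_ii(values: List[float]) -> List[float]:
    """Unnormalized DCT-II: X_k = sum_n x_n * cos(pi / N * (n + 0.5) * k)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_mel(mel_energies: List[float], n_coeffs: int) -> List[float]:
    """Log-compress one frame of mel-band energies, then keep the first DCT terms."""
    log_mel = [math.log(e + 1e-10) for e in mel_energies]  # epsilon avoids log(0)
    return dct_ii(log_mel)[:n_coeffs]


if __name__ == "__main__":
    # One hypothetical frame of 8 mel-band energies.
    frame = [1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0, 0.5]
    print(mfcc_from_mel(frame, n_coeffs=3))
```

Truncating to a handful of coefficients keeps the broad spectral envelope and throws away finer band-to-band detail. Feeding the mel representation directly leaves that detail available to the network, at the price of a larger input, which is consistent with the longer training time the author reports.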

Key highlights

  • Built on TensorFlow in 2016, explicitly framed as a learning exercise for the then-new framework
  • CNN architecture with 3 hidden layers and max pooling, inspired by Sander Dieleman’s Spotify blog post on deep content-based recommendation
  • Dataset construction via previewDownloader.py, which scrapes 7Digital previews for ~1% of the Million Song Dataset (roughly 2.8GB; the subset size was dictated by laptop constraints)
  • Preprocessing scripts for MFCC and mel-spectrogram extraction, t-SNE visualization, and input formatting all included
  • Results published for full 10-class classification, compared against Tao Feng’s 2/3/4-class RBM results and a non-deep-learning benchmark
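The two building blocks named in the highlights, max pooling and a softmax output over ten genres, can be sketched without any framework. This is a toy illustration, not the repo's TensorFlow model: the one-dimensional pooling, the function names, and the example logits are all assumptions made for the sake of a short, runnable example.

```python
import math
from typing import List

GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def max_pool_1d(values: List[float], size: int) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_genre(logits: List[float]) -> str:
    """Return the genre whose softmax probability is highest."""
    probs = softmax(logits)
    return GENRES[probs.index(max(probs))]


if __name__ == "__main__":
    print(max_pool_1d([1, 3, 2, 5, 4, 0], size=2))
    # Hypothetical final-layer scores, one per genre.
    scores = [0.1, 0.0, 0.3, 0.2, 0.0, 0.4, 5.0, 0.1, 0.0, 0.9]
    print(predict_genre(scores))
```

Pooling shrinks each feature map while keeping its strongest activations, and softmax turns the final layer's ten scores into a probability per genre, so the predicted label is simply the largest entry.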

Caveats

  • The README contains no actual accuracy numbers in text; results are only visible in the results_mine.png image, so precise performance is unclear without viewing it
  • Dataset download link is a Dropbox URL of uncertain longevity
  • Spelling inconsistencies (“nural_network.png”, “GenereClassification”) suggest limited maintenance since original publication

Verdict

Worth a look if you’re teaching or learning classic audio CNN pipelines, or if you need a reference for creatively assembling a dataset from a commercial API. Skip it if you want a maintained, production-ready classifier: this is academic coursework from the TensorFlow 0.x era, not a library.
