YuanGongND/ast
A PyTorch implementation of the Audio Spectrogram Transformer (AST) for audio and speech classification tasks.

Velocity (7d): +0.8 ★/day · Trend: steady
This repository provides the code for the Interspeech 2021 paper that introduces the Audio Spectrogram Transformer (AST), a transformer-based model that classifies audio from its spectrogram. It includes pretrained models and training recipes for the AudioSet, ESC-50, and SpeechCommands benchmarks. The pretrained models can be adapted to downstream audio classification tasks via transfer learning.
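
To make "processes audio spectrograms" a little more concrete, here is a minimal sketch of how a spectrogram might be cut into a grid of overlapping patches before being fed to a transformer. The function name `num_patches`, the 16x16 patch size, the stride of 10, and the 128 x 1024 input shape are assumptions chosen for illustration; they are not stated on this page, and the repository's own code should be treated as the reference.

```python
from typing import Tuple


def num_patches(n_mels: int, n_frames: int,
                patch: int = 16, stride: int = 10) -> Tuple[int, int, int]:
    """Count patches on a strided grid with no padding.

    Hypothetical helper: patch size and stride are illustrative assumptions.
    Returns (patches along frequency, patches along time, total patches).
    """
    if n_mels < patch or n_frames < patch:
        raise ValueError("spectrogram is smaller than a single patch")
    # Standard sliding-window count: floor((size - window) / stride) + 1
    freq_patches = (n_mels - patch) // stride + 1
    time_patches = (n_frames - patch) // stride + 1
    return freq_patches, time_patches, freq_patches * time_patches


if __name__ == "__main__":
    # Assumed input: 128 mel bins by 1024 time frames.
    print(num_patches(128, 1024))
```

Each patch would then be flattened and linearly projected to a token, so the total count is the sequence length the transformer sees (before any extra classification tokens). A smaller stride gives more overlap and a longer sequence, which raises compute cost.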