jtkim-kaist/VAD

A KAIST VAD toolkit that still thinks it's 2017

MATLAB meets TensorFlow 1.x in a voice activity detection research artifact that ships its own noisy Korean street recordings.

★ 869 stars | MATLAB | Image · Video · Audio | Domain Apps

What it does

This is a research toolkit for voice activity detection — figuring out when someone is actually speaking in an audio stream. It bundles four neural classifiers (DNN, bDNN, LSTM, and an attention-based ACAM model), a custom multi-resolution cochleagram feature extractor, and two hours of real-world recordings from bus stops, construction sites, parks, and rooms around KAIST. The whole pipeline is orchestrated through MATLAB, with the actual neural networks implemented in Python using TensorFlow 1.1–1.3.
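
To make the task concrete, here is a minimal sketch of what a detector's output is usually turned into: time segments where speech is present. It is not code from the toolkit. The function name, the 0.5 threshold, and the 10 ms frame shift are illustrative assumptions, and the per-frame probabilities stand in for whatever one of the four classifiers would emit.

```python
from typing import List, Tuple


def frames_to_segments(
    probs: List[float],
    threshold: float = 0.5,       # assumed decision threshold, not a toolkit default
    frame_shift_s: float = 0.01,  # assumed 10 ms hop between frames
) -> List[Tuple[float, float]]:
    """Turn per-frame speech probabilities into (start, end) times in seconds."""
    segments = []
    start = None  # index of the first frame of the current speech run, if any
    for i, p in enumerate(probs):
        is_speech = p >= threshold
        if is_speech and start is None:
            start = i  # a speech run begins here
        elif not is_speech and start is not None:
            segments.append((start * frame_shift_s, i * frame_shift_s))
            start = None  # the speech run ended on the previous frame
    if start is not None:  # speech continues to the end of the stream
        segments.append((start * frame_shift_s, len(probs) * frame_shift_s))
    return segments
```

Calling `frames_to_segments(probs)` on a list of probabilities returns one `(start, end)` pair per uninterrupted run of frames at or above the threshold.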

The interesting bit

The ACAM model adapts ideas from visual attention (the “recurrent attention model” for image recognition) to the audio domain — a neat cross-modal transplant that the authors published in IEEE Signal Processing Letters. The bundled dataset is genuinely unusual: recorded on a Samsung Galaxy S8 in actual Korean environments, complete with crying babies, insect chirps, and mouse clicks as “bonus” noise sources.

Key highlights

Four classifier architectures in one toolkit, all using the same multi-resolution cochleagram (MRCG) frontend
Includes 120 minutes of annotated real-world speech with ground-truth labels (bus stop SNR: 5.6 dB, construction site: 2.05 dB — properly miserable)
Post-processing parameters exposed for tuning specific error types (false entrance/exit, missed speech, over-segmentation)
Python reimplementation available in a separate branch
Presented at ICASSP 2019
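
The post-processing highlight is easier to picture with an example. The sketch below is a generic two-pass smoother over binary frame labels, not the toolkit's actual post-processing: `max_gap` and `min_speech` are hypothetical parameters chosen for illustration. Filling short gaps targets over-segmentation and clipped speech; dropping short bursts targets spurious detections.

```python
from typing import Iterator, List, Tuple


def _runs(labels: List[int]) -> Iterator[Tuple[int, int, int]]:
    """Yield (value, start, end) for each run of equal labels; end is exclusive."""
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            yield labels[start], start, i
            start = i


def smooth_labels(labels: List[int], max_gap: int = 2, min_speech: int = 3) -> List[int]:
    """Smooth 0/1 frame labels. Both parameters are hypothetical, counted in frames."""
    out = list(labels)
    # Pass 1: fill short non-speech gaps that sit between two speech runs.
    for value, start, end in list(_runs(out)):
        interior = start > 0 and end < len(out)
        if value == 0 and interior and end - start <= max_gap:
            out[start:end] = [1] * (end - start)
    # Pass 2: remove speech runs that are too short to be plausible speech.
    for value, start, end in list(_runs(out)):
        if value == 1 and end - start < min_speech:
            out[start:end] = [0] * (end - start)
    return out
```

Raising `max_gap` merges more fragments at the cost of labelling short pauses as speech; raising `min_speech` suppresses more false alarms at the cost of missing very short utterances. That trade-off is the kind of error-specific tuning the highlight refers to.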

Caveats

Depends on MATLAB 2017b; the authors have noted that this dependency “will be depreciated” [sic] since at least 2018
TensorFlow 1.x requirement — you’ll need to resurrect old Python environments
MRCG feature extraction is flagged by the authors themselves as “somewhat long”; a TODO to replace it with spectrograms has sat unresolved for years
16 kHz sampling rate is mandatory; no resampling convenience provided
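
Given the 16 kHz requirement, a cheap guard before running anything is to read the sample rate from the file header. The following is a minimal sketch using Python's standard `wave` module; the function name is an assumption, it only handles PCM WAV files, and it does not resample (that still has to be done with a separate tool).

```python
import wave

REQUIRED_RATE_HZ = 16000  # the rate this page says the toolkit requires


def check_sample_rate(path: str, required_hz: int = REQUIRED_RATE_HZ) -> int:
    """Return the WAV file's sample rate; raise ValueError if it is not required_hz."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()  # read from the WAV header, no audio is decoded
    if rate != required_hz:
        raise ValueError(
            f"{path}: sample rate is {rate} Hz, expected {required_hz} Hz; resample first"
        )
    return rate
```

Running this over a folder of recordings before feature extraction surfaces mismatched files early instead of producing silently wrong features.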

Verdict

Worth a look if you’re reproducing the ACAM paper or need a small, messy real-world VAD dataset for benchmarking. Everyone else should probably start with something that doesn’t require a time-traveling Python environment.

Frequently asked

What is jtkim-kaist/VAD? MATLAB meets TensorFlow 1.x in a voice activity detection research artifact that ships its own noisy Korean street recordings.
Is VAD open source? Yes, jtkim-kaist/VAD is an open-source project tracked on heatdrop.
What language is VAD written in? jtkim-kaist/VAD is primarily written in MATLAB.
How popular is VAD? jtkim-kaist/VAD has 869 stars on GitHub.
Where can I find VAD? jtkim-kaist/VAD is on GitHub at https://github.com/jtkim-kaist/VAD.