anicolson/DeepXi

Speech enhancement by predicting a Greek letter nobody can pronounce

Deep Xi estimates a priori SNR with neural nets so you don't have to do the math yourself.

523 stars · MATLAB · Domain Apps · ML Frameworks · +0.2 ★/day over the last 7 days (trend: steady)

What it does

Deep Xi trains deep neural networks to estimate the a priori signal-to-noise ratio (SNR) from noisy speech magnitude spectra. That estimate then feeds into classic MMSE gain functions for speech enhancement, noise PSD estimation, or as a front-end for robust automatic speech recognition. It’s essentially a learned replacement for the decision-directed approach that speech people have been hand-tuning since the 1980s.
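To make the "estimate, then gain function" pipeline concrete, here is a minimal sketch in plain Python. It is not code from the repo: the Wiener gain stands in for the family of MMSE gain functions, `decision_directed_xi` is the textbook form of the hand-tuned estimator that a learned model would replace, and every function name and the `alpha=0.98` default are assumptions chosen for illustration.

```python
def wiener_gain(xi):
    """Wiener gain for a linear-scale (not dB) a priori SNR xi >= 0."""
    return xi / (1.0 + xi)


def decision_directed_xi(prev_clean_mag, noisy_mag, noise_psd, alpha=0.98):
    """Textbook decision-directed a priori SNR estimate for one frequency bin.

    Blends the previous frame's enhanced magnitude with the current
    a posteriori SNR; alpha is the smoothing weight people tune by hand.
    """
    gamma = noisy_mag ** 2 / noise_psd  # a posteriori SNR
    return (alpha * prev_clean_mag ** 2 / noise_psd
            + (1.0 - alpha) * max(gamma - 1.0, 0.0))


def enhance_frame(noisy_mags, xi_hat):
    """Scale each noisy magnitude bin by the gain implied by its SNR estimate."""
    return [wiener_gain(xi) * mag for mag, xi in zip(noisy_mags, xi_hat)]


# Illustrative values only: xi_hat would come from the network (or from
# decision_directed_xi) rather than being typed in by hand.
noisy_mags = [1.0, 0.5, 2.0]
xi_hat = [9.0, 1.0, 0.0]
enhanced = enhance_frame(noisy_mags, xi_hat)
```

High-SNR bins pass almost untouched and bins with an SNR estimate of zero are muted entirely, which is why the quality of the estimate matters more than the choice of gain function.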

The interesting bit

The target isn’t raw SNR. It’s a CDF-mapped version squeezed into [0,1] using statistics computed over the training set, then unmapped at inference. The README is admirably explicit that this improves SGD convergence. Several architectures are supported (MHANet, ResNet, ResLSTM, ResBiLSTM), with causal and non-causal variants, and pre-trained models are actually provided in the repo.
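The map-then-unmap trick is easier to see in code. The sketch below assumes a Gaussian CDF fitted per frequency bin to the a priori SNR in dB; the summary above only says "CDF-mapped" with training-set statistics, so the Gaussian form, the function names, and the sample values are all assumptions for illustration, not the repo's implementation.

```python
from statistics import NormalDist, fmean, pstdev


def fit_bin_stats(xi_db_samples):
    """Fit mean and standard deviation of the dB-scale SNR for one frequency bin."""
    return NormalDist(mu=fmean(xi_db_samples), sigma=pstdev(xi_db_samples))


def map_target(xi_db, dist):
    """Training target: push the dB-scale SNR through the CDF to land in [0, 1]."""
    return dist.cdf(xi_db)


def unmap_estimate(xi_bar, dist, eps=1e-6):
    """Inference: invert the CDF, then convert dB back to a linear SNR.

    The network output is clipped away from 0 and 1 because the inverse
    CDF is unbounded at those endpoints.
    """
    xi_bar = min(max(xi_bar, eps), 1.0 - eps)
    xi_db = dist.inv_cdf(xi_bar)
    return 10.0 ** (xi_db / 10.0)


# Made-up dB values standing in for one bin's training-set statistics.
dist = fit_bin_stats([-10.0, 0.0, 10.0, 20.0])
target = map_target(12.0, dist)          # what the network would be trained to output
recovered = unmap_estimate(target, dist) # linear SNR recovered at inference
```

The point of the mapping is that the network regresses a bounded, roughly uniformly spread value instead of a dB figure with a wide dynamic range.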

Key highlights

  • TensorFlow 2/Keras implementation with sequence masking for variable-length batches
  • Pre-trained MHANet and ResNet models available in the model/ directory
  • Trained on the dedicated Deep Xi dataset (linked, though you need to fetch it separately)
  • Objective results on DEMAND Voice Bank and a custom test set; non-causal ResNet 1.1n scores highest on most metrics
  • Also usable for ideal binary/ratio mask estimation and as an ASR front-end
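On the mask-estimation point: once you have an a priori SNR estimate, the two masks follow from standard definitions. This is a hedged sketch, not the repo's code; the function names, the square-root form of the ratio mask, and the 0 dB threshold are conventional choices assumed here for illustration.

```python
import math


def ideal_ratio_mask(xi):
    """Ratio mask from a linear a priori SNR, using the common square-root form."""
    return math.sqrt(xi / (1.0 + xi))


def ideal_binary_mask(xi, threshold_db=0.0):
    """Binary mask: keep a bin only if its SNR in dB exceeds the threshold."""
    if xi <= 0.0:
        return 0  # no speech energy estimated in this bin
    return 1 if 10.0 * math.log10(xi) > threshold_db else 0


xi_hat = [0.0, 0.5, 1.0, 2.0]  # illustrative linear-scale SNR estimates
irm = [ideal_ratio_mask(xi) for xi in xi_hat]
ibm = [ideal_binary_mask(xi) for xi in xi_hat]
```

With these values, only the last bin sits above 0 dB and survives the binary mask, while the ratio mask attenuates every bin in proportion to its estimated SNR.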

Caveats

  • The repo language is tagged MATLAB, but the implementation is TensorFlow 2/Keras—MATLAB appears limited to evaluation scripts (eval_example.m, eval_stats.m)
  • Results in papers used TensorFlow 1; current numbers are TF2/Keras re-runs, and the README notes this explicitly
  • The README is truncated mid-table for the Deep Xi Test Set results, so some numbers are cut off

Verdict

Worth a look if you’re doing speech enhancement research and want a well-documented baseline with actual pre-trained weights. Skip if you need a polished, end-to-end product pipeline: the code is research-grade, and you’ll be doing your own data wrangling.
