Is Multimodal-Emotion-Recognition open source?

Yes — maelfabien/Multimodal-Emotion-Recognition is open source, released under the Apache-2.0 license.

What language is Multimodal-Emotion-Recognition written in?

maelfabien/Multimodal-Emotion-Recognition is primarily written in Jupyter Notebook.

How popular is Multimodal-Emotion-Recognition?

maelfabien/Multimodal-Emotion-Recognition has 1.1k stars on GitHub.

Where can I find Multimodal-Emotion-Recognition?

maelfabien/Multimodal-Emotion-Recognition is on GitHub at https://github.com/maelfabien/Multimodal-Emotion-Recognition.

← all repositories

maelfabien/Multimodal-Emotion-Recognition

Job-interview emotion decoder reads your face, voice, and words

A French Employment Agency partnership produced a Flask app that scores candidate emotions from video, audio, and text simultaneously.

★1.1k stars Jupyter Notebook Domain Apps Computer Vision

View on GitHub ↗

Not currently ranked — collecting fresh signals.

star history

What it does

This project is a real-time web app that ingests job-candidate interviews through three channels—facial video, vocal audio, and written text—and runs each through a separate deep-learning pipeline. The results are fused into an ensemble prediction served via a Flask interface. It was built in partnership with the French Employment Agency, which tells you exactly where they expected the most value: high-volume hiring.

The interesting bit

The audio pipeline uses a Time-Distributed CNN that slides a rolling window across log-mel-spectrograms, feeding local features into LSTMs for temporal context. It is the kind of architecture that sounds over-engineered until you remember recruiters do not review spectrograms by hand.

Key highlights

Three independent modalities: text (CNN-LSTM with Word2Vec embeddings), audio (time-distributed CNN-LSTM on spectrograms), video (FER2013-trained facial model)
Pre-trained weights and processed datasets provided via Google Drive for each modality
Colab notebooks available for audio and video pipelines
Flask web app with live webcam or file upload support
Academic paper linked (Overleaf, read-only)

Caveats

The README is enthusiastic but thin on ensemble mechanics and actual accuracy numbers; the “Accuracy_Speech” chart exists but no value is quoted
FER2013 is explicitly called “quite challenging” with empty and mislabeled images
The web app code lives in a separate repository, not this one

Verdict

Worth a look if you are building multimodal classifiers and want a worked example of late-fusion architecture with downloadable weights. Skip it if you need production-grade inference code or reproducible benchmarks.

Frequently asked

What is maelfabien/Multimodal-Emotion-Recognition?: A French Employment Agency partnership produced a Flask app that scores candidate emotions from video, audio, and text simultaneously.
Is Multimodal-Emotion-Recognition open source?: Yes — maelfabien/Multimodal-Emotion-Recognition is open source, released under the Apache-2.0 license.
What language is Multimodal-Emotion-Recognition written in?: maelfabien/Multimodal-Emotion-Recognition is primarily written in Jupyter Notebook.
How popular is Multimodal-Emotion-Recognition?: maelfabien/Multimodal-Emotion-Recognition has 1.1k stars on GitHub.
Where can I find Multimodal-Emotion-Recognition?: maelfabien/Multimodal-Emotion-Recognition is on GitHub at https://github.com/maelfabien/Multimodal-Emotion-Recognition.