Judging a face by the jitter of its embeddings
SER-FIQ measures face image quality by asking: how much does this photo's neural embedding wobble when you randomly drop parts of the network?

What it does
SER-FIQ scores how suitable a face photo is for recognition, without needing human-labeled quality data. It runs several stochastic forward passes through a face recognition network with dropout enabled, then measures how much the resulting embeddings vary. Low variation means high quality; high variation means the network is uncertain, so the image is probably blurry, occluded, or otherwise poorly suited to recognition. The repository provides an ArcFace-based demonstration.
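To make the "embedding wobble" idea concrete, here is a minimal sketch of how a robustness-style score could be computed once the stochastic embeddings are in hand. It is not the repository's code: the function name `serfiq_style_score`, the use of the mean pairwise Euclidean distance, and the `2 * sigmoid(-distance)` squashing are assumptions chosen for illustration, and the exact normalization in the paper may differ.

```python
import numpy as np

def serfiq_style_score(embeddings: np.ndarray) -> float:
    """Illustrative quality score from stochastic embeddings (hypothetical helper).

    embeddings: array of shape (m, d), one row per dropout-enabled forward pass.
    Returns a value in (0, 1]; higher means the embeddings agree more closely.
    """
    # L2-normalise each embedding so distances are comparable across images.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    x = embeddings / np.clip(norms, 1e-12, None)

    # Euclidean distance between every pair of passes.
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Average over the distinct pairs (upper triangle, excluding the diagonal).
    rows, cols = np.triu_indices(x.shape[0], k=1)
    mean_dist = dists[rows, cols].mean()

    # 2 * sigmoid(-mean_dist): zero spread maps to 1, large spread tends to 0.
    return float(2.0 / (1.0 + np.exp(mean_dist)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=128)
    # Synthetic stand-ins for real network outputs: small vs. large jitter.
    stable = base + 0.01 * rng.normal(size=(20, 128))
    wobbly = base + 1.0 * rng.normal(size=(20, 128))
    print("stable:", serfiq_style_score(stable))
    print("wobbly:", serfiq_style_score(wobbly))
```

The synthetic arrays only stand in for real embeddings; in practice each row would come from one dropout-enabled pass of the recognition network over the same face image.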
The interesting bit
The clever move is tying quality assessment directly to the deployed recognition model. Because quality estimation and recognition share the same network, the quality score reflects the decision patterns the recognizer actually uses. The authors also flag an uncomfortable corollary: this tight coupling means any bias in the face recognition system carries straight over into the quality assessment.
Key highlights
- Unsupervised: no hand-labeled quality scores needed
- The same-model approach (quality computed on the deployed recognizer itself) outperforms six academic and industry baselines in cross-database tests (per the paper)
- Computationally cheap: only ~10% extra FLOPs if dropout is already in the last layer
- Includes analysis of demographic and non-demographic bias in quality scores
- Non-commercial license (CC BY-NC-SA 4.0)
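The low-overhead point is easier to see in code. The sketch below assumes a network whose only dropout sits directly before the final linear layer, so the expensive backbone runs once and only the last layer is repeated under different dropout masks. Everything here is hypothetical (`stochastic_embeddings`, the shapes, the default of 100 passes, the 0.5 drop rate); it illustrates the idea and is not taken from the repository.

```python
import numpy as np

def stochastic_embeddings(features, weights, m=100, drop_rate=0.5, rng=None):
    """Repeat only the dropout + final layer over cached backbone features.

    features: (d_in,) backbone output for one image, computed once.
    weights:  (d_in, d_out) weights of the final linear layer.
    Returns an (m, d_out) array, one embedding per dropout mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    keep = 1.0 - drop_rate

    # Draw all m dropout masks in one go instead of looping over passes.
    masks = rng.random((m, features.shape[0])) < keep

    # Inverted dropout: zero the dropped units and rescale the survivors.
    dropped = masks * features / keep

    # One batched matrix product stands in for m passes through the last layer.
    return dropped @ weights

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cached = rng.normal(size=512)        # pretend backbone output
    final_w = rng.normal(size=(512, 128))
    embs = stochastic_embeddings(cached, final_w, m=100, rng=rng)
    print(embs.shape)
```

Because the masks are batched, the repeated work is a single matrix product over the cached features, which is why the overhead stays small relative to a full forward pass in this setup.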
Caveats
- The provided code is explicitly labeled a “demonstration”; production use requires integrating the concept into your own pipeline
- Works best with a recognition network that was itself trained with dropout
- Model files must be downloaded separately from Google Drive
Verdict
Worth a look if you’re building or auditing face recognition systems and need a principled, unsupervised quality filter. Skip if you need a drop-in commercial solution or a fully packaged library.