Mozilla's bug triage, outsourced to a Python script
A machine learning platform that reads Bugzilla tickets so humans don't have to sort them by hand.

What it does
Bugbug trains classifiers on Mozilla’s bug and commit data to automate the tedious parts of software engineering: assigning bugs to the right developer, detecting regressions, picking which tests to run, and filtering spam. It plugs into Bugzilla and mozilla-central, with some GitHub issue support. Models cover everything from “is this actually a bug?” to “will this patch get backed out?”
The interesting bit
The project treats bug history as a replayable dataset — bug_snapshot.py lets you reconstruct a ticket’s state at any point in time, which matters when you’re training models that need to avoid peeking at the future. The “testselect” classifier also gets at a real infrastructure problem: running fewer tests without missing the one that breaks.
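To make the "no peeking at the future" point concrete, here is a minimal sketch of rolling a ticket back to an earlier date by undoing its change history. It is not the code in bug_snapshot.py: the function name state_at, the helper ts, and the shape of the history records (a timestamp plus a list of field changes with removed and added values) are assumptions made for illustration.

```python
from datetime import datetime, timezone


def state_at(bug, when):
    """Return a copy of `bug` as it looked at time `when`.

    Hypothetical data shape: bug["history"] is a list of events, each with
    a timezone-aware "when" and a list of "changes" recording the old
    ("removed") and new ("added") value of a field.
    """
    snapshot = {k: v for k, v in bug.items() if k != "history"}
    # Walk the history newest-first and undo every change made after `when`.
    for event in sorted(bug["history"], key=lambda e: e["when"], reverse=True):
        if event["when"] <= when:
            break
        for change in event["changes"]:
            snapshot[change["field"]] = change["removed"]
    return snapshot


def ts(day):
    """Illustrative helper: a UTC timestamp in January 2024."""
    return datetime(2024, 1, day, tzinfo=timezone.utc)


bug = {
    "id": 1,
    "status": "RESOLVED",
    "priority": "P1",
    "history": [
        {"when": ts(5), "changes": [{"field": "priority", "removed": "--", "added": "P1"}]},
        {"when": ts(9), "changes": [{"field": "status", "removed": "NEW", "added": "RESOLVED"}]},
    ],
}

print(state_at(bug, ts(6)))  # {'id': 1, 'status': 'NEW', 'priority': 'P1'}
```

A model trained on the snapshot from day 6 sees the priority that had already been set but not the later resolution, which is the kind of leakage a replayable history is meant to prevent.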
Key highlights
- 18 built-in classifiers, from assignee suggestion to uplift approval
- ~93% accuracy on the defect-vs-feature classifier (2,110 bugs, per README)
- Keras models wrapped to fit scikit-learn pipelines via bugbug/nn.py
- Training hooks into Mozilla’s Taskcluster CI with a PR keyword: Train on Taskcluster: <model>
- Requires Python 3.12+, uses uv for dependency sync
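The Keras-to-scikit-learn wrapping mentioned above generally comes down to an adapter that exposes fit, predict, get_params and set_params. The sketch below shows that pattern only; it is not the contents of bugbug/nn.py. SklearnStyleWrapper and MajorityModel are hypothetical names, and MajorityModel is a stand-in for a real Keras model so the example runs with the standard library alone.

```python
from collections import Counter


class MajorityModel:
    """Stand-in for a neural network: always predicts the most common label."""

    def __init__(self):
        self.counts = Counter()

    def train_on_batch(self, X, y):
        self.counts.update(y)

    def predict(self, X):
        label = self.counts.most_common(1)[0][0]
        return [label for _ in X]


class SklearnStyleWrapper:
    """Adapter giving a model factory the scikit-learn estimator interface."""

    def __init__(self, build_fn, epochs=1):
        self.build_fn = build_fn
        self.epochs = epochs

    def get_params(self, deep=True):
        return {"build_fn": self.build_fn, "epochs": self.epochs}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # Build a fresh model on every fit, as scikit-learn estimators do.
        self.model_ = self.build_fn()
        for _ in range(self.epochs):
            self.model_.train_on_batch(X, y)
        return self  # returning self allows chaining, per scikit-learn convention

    def predict(self, X):
        return self.model_.predict(X)


clf = SklearnStyleWrapper(MajorityModel, epochs=2)
clf.fit([[0], [1], [2]], ["defect", "defect", "enhancement"])
print(clf.predict([[3], [4]]))  # ['defect', 'defect']
```

Keeping constructor arguments as plain attributes is what lets tools that clone or tune estimators read and reset them through get_params and set_params.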
Caveats
- Hard-wired for Mozilla’s toolchain: Bugzilla, Mercurial, mozilla-central. GitHub support exists but looks secondary.
- Repository mining takes 7+ hours; the README suggests adding limit=1024 just to test changes. The libgit2 v1.0.0 dependency is flagged as “might be required” and is only in Debian experimental.
Verdict
Worth studying if you run triage or CI for a large project with messy historical data. Skip it if you’re looking for a drop-in solution for a small GitHub repo — the Mozilla-specific assumptions run deep, and the project says so itself.