Baidu's research attic: 30+ papers, one repo, mostly in Chinese
A grab-bag of PaddlePaddle implementations for CV, NLP, KG, and spatial-temporal mining papers, straight from Baidu's research teams.

What it does
This is Baidu’s clearinghouse for research code: implementations of conference papers and competition-winning models built on their PaddlePaddle framework. The repo spans four areas (computer vision, NLP, knowledge graphs, and spatial-temporal data mining), with each project living in its own subdirectory. Think of it as a curated flea market rather than a unified toolkit.
The interesting bit
The sheer breadth is the point. You get everything from a GNN-based image re-ranking system to a POI valuation algorithm for real-estate intelligence, plus dialogue models like PLATO that handle chit-chat, knowledge Q&A, and task-driven conversation in one go. Many entries are tied to specific competitions (AICITY, RSNA, WebVision) or Baidu’s own benchmarks (DuReader, DuIE, DuEE).
Key highlights
- CV: 14 projects including re-identification, anomaly detection, medical imaging (diabetic retinopathy, intracranial hemorrhage), and a Transformer for hemorrhage detection described as 20× more parameter-efficient
- NLP: 22 projects covering machine translation, multi-turn dialogue, text-to-SQL, and dense passage retrieval; heavy use of Baidu’s ERNIE pretrained models
- Knowledge graphs: 6 projects including CoKE for contextualized KG embeddings and SSAN for document-level relation extraction
- Spatial-temporal: 3 projects including POI generation and global road-network-based region partitioning
- Most entries link to arXiv or ACL Anthology papers; some are competition baselines with no paper listed
Caveats
- Documentation is mostly in Chinese; READMEs and code comments generally come without English translations
- No unified install or dependency management—each subproject is its own island
- Maintenance status is unclear; no recent commits or version tags visible in the provided sources
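Because each subproject is its own island, it helps to take inventory of dependency files before committing to one. Here is a minimal sketch, assuming a local clone and assuming subprojects declare dependencies in conventionally named files (`requirements.txt`, `setup.py`, `environment.yml`). The repo may not follow that convention, and `find_dependency_files` is a hypothetical helper, not something the repo ships.

```python
from pathlib import Path

# File names commonly used to declare Python dependencies.
# Assumption: individual subprojects may use none of these.
DEP_FILES = ("requirements.txt", "setup.py", "environment.yml")


def find_dependency_files(repo_root, max_depth=3):
    """Map each directory under repo_root to the dependency files it holds.

    Only directories at most max_depth levels below repo_root are reported.
    Keys are POSIX-style paths relative to repo_root.
    """
    root = Path(repo_root)
    found = {}
    for path in root.rglob("*"):
        if not path.is_file() or path.name not in DEP_FILES:
            continue
        rel = path.relative_to(root)
        # rel.parts includes the file name, so the directory depth is len - 1.
        if len(rel.parts) - 1 > max_depth:
            continue
        found.setdefault(rel.parent.as_posix(), []).append(path.name)
    return {directory: sorted(names) for directory, names in sorted(found.items())}
```

Calling `find_dependency_files("path/to/clone")` returns a dictionary keyed by subproject directory. Any subproject missing from the result declares nothing in these files, so its README is the only place left to look for install steps.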
Verdict
Worth a dig if you’re already committed to PaddlePaddle or need a specific Baidu paper’s implementation for reproduction. PyTorch or JAX shops should keep scrolling: there’s no abstraction layer here, just raw research code with a language barrier.