PaddlePaddle/Research

Baidu's research attic: 30+ papers, one repo, mostly in Chinese

A grab-bag of PaddlePaddle implementations for CV, NLP, KG, and spatial-temporal mining papers, straight from Baidu's research teams.

Research · Velocity (7d): +0.8 ★/day · Trend: steady

What it does

This is Baidu’s clearinghouse for research code: implementations of conference papers and competition-winning models built on the PaddlePaddle framework. The repo spans four areas (computer vision, NLP, knowledge graphs, and spatial-temporal data mining), with each project living in its own subdirectory. Think of it as a curated flea market rather than a unified toolkit.

The interesting bit

The sheer breadth is the point. You get everything from a GNN-based image re-ranking system to a POI valuation algorithm for real-estate intelligence, plus dialogue models like PLATO that handle chit-chat, knowledge Q&A, and task-driven conversation in one go. Many entries are tied to specific competitions (AICITY, RSNA, WebVision) or Baidu’s own benchmarks (DuReader, DuIE, DuEE).

Key highlights

  • CV: 14 projects including re-identification, anomaly detection, medical imaging (diabetic retinopathy, intracranial hemorrhage), and a 20× parameter-efficient Transformer for hemorrhage detection
  • NLP: 22 projects covering machine translation, multi-turn dialogue, text-to-SQL, and dense passage retrieval; heavy use of Baidu’s ERNIE pretraining model
  • Knowledge graphs: 6 projects including CoKE for contextualized KG embeddings and SSAN for document-level relation extraction
  • Spatial-temporal: 3 projects including POI generation and global road-network-based region partitioning
  • Most entries link to arXiv or ACL Anthology papers; some are competition baselines with no paper listed

Caveats

  • Documentation is entirely in Chinese; code comments and READMEs lack English translations
  • No unified install or dependency management—each subproject is its own island
  • Maintenance status is unclear; no recent commits or version tags were visible in the sources reviewed

Verdict

Worth a dig if you’re already committed to PaddlePaddle or need a specific Baidu paper’s implementation for reproduction. PyTorch or JAX shops should keep scrolling: there’s no abstraction layer here, just raw research code with a language barrier.
