abachaa/MedQuAD
Medical question-answering dataset of 47,457 QA pairs sourced from NIH websites for training and evaluating NLP/IR systems.

Velocity (7d): +0.2 ★/day
Trend: steady
MedQuAD is a medical question-answering collection covering 37 question types across diseases, drugs, and other medical entities. The dataset includes rich annotations such as UMLS concept identifiers, semantic types, question focus categories, and synonyms. A test collection with 2,479 manually judged answers is provided for benchmarking IR and QA system performance.
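As a rough illustration of how the annotations described above might be read, here is a minimal sketch that parses one XML record with Python's standard library. The element and attribute names (`Focus`, `FocusAnnotations`, `CUI`, `SemanticType`, `Synonym`, `QAPair`, `qtype`) and the sample record are assumptions made for illustration, not taken from the description; check them against the actual files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the layout and all values below are placeholders
# assumed for illustration, not real MedQuAD data.
SAMPLE = """<Document id="0000001" source="ExampleSource">
  <Focus>Example Condition</Focus>
  <FocusAnnotations>
    <Category>Disease</Category>
    <UMLS>
      <CUIs><CUI>C0000000</CUI></CUIs>
      <SemanticTypes><SemanticType>T047</SemanticType></SemanticTypes>
    </UMLS>
    <Synonyms><Synonym>Example Disorder</Synonym></Synonyms>
  </FocusAnnotations>
  <QAPairs>
    <QAPair pid="1">
      <Question qid="0000001-1" qtype="information">What is Example Condition?</Question>
      <Answer>Placeholder answer text.</Answer>
    </QAPair>
  </QAPairs>
</Document>"""


def parse_record(xml_text):
    """Return the focus, its annotations, and the QA pairs of one record."""
    root = ET.fromstring(xml_text)
    qa_pairs = []
    for pair in root.iter("QAPair"):
        question = pair.find("Question")
        if question is None:
            continue  # skip malformed pairs instead of failing
        qa_pairs.append({
            "qtype": question.get("qtype"),
            "question": (question.text or "").strip(),
            "answer": (pair.findtext("Answer") or "").strip(),
        })
    return {
        "focus": root.findtext("Focus", default=""),
        "category": root.findtext("FocusAnnotations/Category", default=""),
        "cuis": [e.text for e in root.iter("CUI")],
        "semantic_types": [e.text for e in root.iter("SemanticType")],
        "synonyms": [e.text for e in root.iter("Synonym")],
        "qa_pairs": qa_pairs,
    }


record = parse_record(SAMPLE)
print(record["focus"], record["cuis"], len(record["qa_pairs"]))
```

Flattening each record into a plain dictionary keeps the question type and the concept identifiers next to the question text, which makes it easy to filter by question type or focus category when building training or evaluation splits.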