Table 2 Publication numbers of 12 datasets

From: Developing a healthcare dataset information resource (DIR) based on Semantic Web

| Dataset | # of PDF-format articles in PMC | # used for method extraction after preprocessing | # that analyze the dataset |
|---|---|---|---|
| NHANES | 37,485 | 16,213 | 10,674 |
| SEER-Medicare | 2569 | 2276 | 1627 |
| Add Health | 1881 | 1477 | 1028 |
| HCUP | 1785 | 1398 | 993 |
| MDS | 1337 | 1053 | 584 |
| CPRD | 1014 | 735 | 477 |
| MarketScan | 985 | 920 | 614 |
| THIN | 733 | 678 | 434 |
| MIMIC | 237 | 206 | 152 |
| Premier | 165 | 158 | 95 |
| Clinformatics | 69 | 65 | 49 |
| Humedica | 22 | 22 | 9 |
| Total | 48,282 | 25,201 | 16,736 |