Table 2 Publication numbers of 12 datasets

From: Developing a healthcare dataset information resource (DIR) based on Semantic Web

| Dataset | # of PDF-format articles in PMC | # for method extraction after preprocessing | # that analyze the dataset |
|---|---|---|---|
| NHANES | 37,485 | 16,213 | 10,674 |
| SEER-Medicare | 2569 | 2276 | 1627 |
| Add Health | 1881 | 1477 | 1028 |
| HCUP | 1785 | 1398 | 993 |
| MDS | 1337 | 1053 | 584 |
| CPRD | 1014 | 735 | 477 |
| MarketScan | 985 | 920 | 614 |
| THIN | 733 | 678 | 434 |
| MIMIC | 237 | 206 | 152 |
| Premier | 165 | 158 | 95 |
| Clinformatics | 69 | 65 | 49 |
| Humedica | 22 | 22 | 9 |
| Total | 48,282 | 25,201 | 16,736 |
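
The Total row of Table 2 can be checked against the per-dataset counts with a few lines of arithmetic. The sketch below is only an illustration and is not part of the article's pipeline: the counts are copied from the table, while the names `counts`, `column_totals`, and `analysis_share` are hypothetical helpers introduced here.

```python
# Counts copied from Table 2, one tuple per dataset:
# (PDF articles in PMC, kept for method extraction, analyzing the dataset)
counts = {
    "NHANES": (37485, 16213, 10674),
    "SEER-Medicare": (2569, 2276, 1627),
    "Add Health": (1881, 1477, 1028),
    "HCUP": (1785, 1398, 993),
    "MDS": (1337, 1053, 584),
    "CPRD": (1014, 735, 477),
    "MarketScan": (985, 920, 614),
    "THIN": (733, 678, 434),
    "MIMIC": (237, 206, 152),
    "Premier": (165, 158, 95),
    "Clinformatics": (69, 65, 49),
    "Humedica": (22, 22, 9),
}

# Total row as reported in Table 2
reported_total = (48282, 25201, 16736)


def column_totals(rows):
    """Sum each of the three columns across all datasets."""
    return tuple(sum(column) for column in zip(*rows.values()))


def analysis_share(name):
    """Fraction of a dataset's PMC articles that end up in the third column."""
    pdf_articles, _, analyzing = counts[name]
    return analyzing / pdf_articles


if __name__ == "__main__":
    print("computed totals:", column_totals(counts))
    print("matches Total row:", column_totals(counts) == reported_total)
    for name in counts:
        print(f"{name}: {analysis_share(name):.1%} of PMC articles analyze the dataset")
```

Running the script prints the computed column sums next to the reported Total row, followed by the per-dataset share of PMC articles that reach the third column.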