Table 1 A list of the published datasets used in this study

From: A random forest based biomarker discovery and power analysis framework for diagnostics research

| RF mode | Dataset type | Sample number (N) | Feature number (p) | Outcome variable | PubMed ID | References |
|---|---|---|---|---|---|---|
| Regression | Metabolomics | 73 | 196 | Relative liver weight | 28185575 | [21] |
| Regression | Lipidomics | 40 | 219 | Infant milk amount | 28190990 | [35] |
| Classification | Metabolomics | 73 | 196 | Relative liver weight class (below or above the mean value) | 28185575 | [21] |
| Classification | Transcriptomics | 68 | 414 | Colorectal cancer (CRC) stages | 27176004 | [36] |
| Classification | Transcriptomics | 20 | 1386 | Primary sclerosing cholangitis (PSC) vs. ulcerative colitis (UC) | 32016358 | [37] |
| Classification | Transcriptomics | 40 | 25,697 | OB/OB vs. wild-type genotype mouse | 32646215 | [38] |

  1. For each of the RF models, two datasets were considered and the model outcome was compared with the published results.