Skip to main content
Fig. 3 | BMC Medical Genomics

Fig. 3

From: A random forest based biomarker discovery and power analysis framework for diagnostics research

Fig. 3

Validation model performance and power analysis of published experimentally derived data 1, regression mode. a Boxplots displaying the variance in the observed R-squared value of validation models trained using the stable features selected by each feature selection approach, across four outer-loop CV repeats. Values are shown for models trained using either the features selected by each approach in at least 5/100 iterations (Low Stringency) or a minimum of 90/100 iterations (High Stringency). b The three groups of correlated features identified by the power function are represented by the group member with the largest observed effect size. The effect size of each assessed variable is shown along the y axis and a series of sample sizes along the x axis. Power values determined for each effect/sample size combination using a simulated dataset with the same correlation structure as input data and displayed using variably sized/coloured rhombi

Back to article page