Figure 4 | BMC Medical Genomics

Figure 4

From: The removal of multiplicative, systematic bias allows integration of breast cancer gene expression datasets – improving meta-analysis and prediction of prognosis


Combining datasets or tumours and mean-centering significantly increases prognostic prediction. A, Before mean batch-centering. B, After mean batch-centering. The R² statistic (Cox proportional hazards model) is an assessment of the performance of the predictor generated using each combination of training datasets and the remaining test datasets, generated using supervised principal components analysis. Median values are used where a training dataset was used to assess more than one test dataset (up to 5). R² and p-value results for all possible combinations of training datasets and test datasets (1016) are given in the matrix in Additional File 6.
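
As a rough illustration of the mean batch-centering step compared in panels A and B, the sketch below subtracts each gene's within-dataset mean from every sample in that dataset. It is a minimal sketch rather than the authors' implementation: the function name `mean_center_batches`, the dictionary layout, and the toy values are all hypothetical, and it assumes expression values are already log-transformed, so that a multiplicative bias on the raw scale appears as an additive offset.

```python
from statistics import mean


def mean_center_batches(batches):
    """Subtract each gene's within-batch mean from every sample in that batch.

    batches: dict mapping a batch (dataset) name to a list of samples,
             where each sample is a list of expression values, one per gene.
             (Hypothetical layout, chosen only for this illustration.)
    """
    centered = {}
    for name, samples in batches.items():
        # Per-gene mean computed across the samples of this batch only.
        gene_means = [mean(column) for column in zip(*samples)]
        centered[name] = [
            [value - gene_mean for value, gene_mean in zip(sample, gene_means)]
            for sample in samples
        ]
    return centered


# Hypothetical toy data: two batches, two genes, with a constant batch offset.
toy = {
    "batch_a": [[1.0, 2.0], [3.0, 4.0]],
    "batch_b": [[11.0, 12.0], [13.0, 14.0]],
}
centered = mean_center_batches(toy)
```

In this toy case the two batches differ only by a constant offset, so after centering their values coincide and every gene has a mean of zero within each batch.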
