
Table 3 Effect of dataset composition on differential gene expression.

From: The removal of multiplicative, systematic bias allows integration of breast cancer gene expression datasets – improving meta-analysis and prediction of prognosis

 

SAM Common, top 1000

| Uneven comparisons | Between datasets: UC | Between datasets: MC | Between datasets: wMC | Between datasets: DWD | Across datasets: UC | Across datasets: MC | Across datasets: wMC | Across datasets: DWD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unamplified MCF7 (3) v MCF10A (3); Amplified MCF7 (3) v MCF10A (3) | 522 (0.031) (0.032) | 522 (0.031) (0.032) | - | 427 (0.029) (0.028) | 251 (0.023) (0.037) | 594 (0.025) (0.035) | - | 447 (0.023) (0.032) |
| Unamplified MCF7 (3) v MCF10A (3); Amplified MCF7 (3) v MCF10A (2) | 495 (0.031) (0.036) | 495 (0.031) (0.036) | 495 (0.031) (0.036) | 469 (0.03) (0.031) | 232 (0.026) (0.035) | 600 (0.024) (0.037) | 597 (0.026) (0.040) | 550 (0.028) (0.0) |
| Richardson et al. Non-basal (12) v basal (12); Farmer et al. Luminal A (12) v basal (12) | 394 (0.003) (0.019) | 394 (0.003) (0.019) | - | 389 (0.003) (0.019) | 368 (0.001) (0.019) | 708 (0.047) (0.02) | - | 695 (0.046) (0.014) |
| Richardson et al. Non-basal (7) v basal (19); Farmer et al. Luminal A (15) v basal (14) | 380 (0.019) (0.001) | 380 (0.019) (0.001) | 380 (0.019) (0.001) | 373 (0.001) (0.017) | 346 (0) (0) | 725 (0.003) (0.078) | 608 (0.002) (0.038) | 658 (0.005) (0.021) |
| Richardson et al. Non-basal (3) v basal (19); Farmer et al. Luminal A (15) v basal (3) | 283 (0.1) (0.194) | 283 (0.1) (0.194) | 283 (0.1) (0.194) | 258 (0.195) (0.099) | 290 (0) (0.027) | 480 (0.093) (0.9) | 684 (0.001) (0.789) | 506 (0.112) (0.9) |

  1. Sets of differentially expressed probesets comparing MCF7 and MCF10A replicates, or basal/basal-like and luminal/non-basal-like tumours, were identified for each experiment before and after mean batch-centering. Comparisons were performed both between and across datasets. SAM Common: for each column, two different pairwise comparisons were performed using SAM, and the top 1000 probesets were identified for each comparison; the number reported is the size of the intersection between the two sets. Before: comparison was performed prior to mean batch-centering. After: comparison was performed following mean batch-centering. Values in brackets are the FDRs for each set of top 1000 probesets. Weighted mean-centering results for datasets with even numbers of samples are not shown, as the values are identical to those for mean-centering. UC = uncorrected, MC = batch mean-centered, wMC = weighted mean-centered, DWD = distance-weighted discrimination.
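
The two operations described in the note above, mean batch-centering and counting the probesets shared by two top-ranked lists, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: SAM itself is not implemented, the per-probeset scores are hypothetical stand-ins for whatever statistic SAM would produce, and the function names (`batch_mean_center`, `top_n`, `common_count`) and probeset identifiers are assumptions made for the example.

```python
from statistics import mean


def batch_mean_center(batch):
    """Subtract each probeset's mean across the samples of one batch.

    `batch` maps a probeset id to a list of expression values,
    one value per sample in that batch.
    """
    centered = {}
    for probe, values in batch.items():
        centre = mean(values)
        centered[probe] = [value - centre for value in values]
    return centered


def top_n(scores, n):
    """Return the set of n probesets with the largest absolute score."""
    ranked = sorted(scores, key=lambda probe: abs(scores[probe]), reverse=True)
    return set(ranked[:n])


def common_count(scores_a, scores_b, n=1000):
    """Size of the intersection of the two top-n probeset sets."""
    return len(top_n(scores_a, n) & top_n(scores_b, n))


# Toy data: hypothetical probeset ids and scores, not values from the table.
scores_a = {"p1": 5.0, "p2": -4.0, "p3": 0.5, "p4": 3.0}
scores_b = {"p1": 4.5, "p2": 0.2, "p3": -3.5, "p4": 2.0}
overlap = common_count(scores_a, scores_b, n=2)

centered = batch_mean_center({"p1": [1.0, 2.0, 3.0]})
```

In the toy data only `p1` appears in both top-2 sets, so `overlap` is 1. In the table the same count would be taken over the top 1000 probesets of each pairwise comparison.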