
Table 1 Characteristics of the Combined Dataset.

From: Building prognostic models for breast cancer patients using clinical variables and hundreds of gene expression signatures

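| Characteristics | | Total N | Total % | Training N (~2/3) | Testing N (~1/3) | P-value* |
|---|---|---|---|---|---|---|
| Subjects | | 550 | | 359 | 191 | - |
| ER | + | 395 | 71.8% | 259 | 136 | 0.89 |
| | - | 155 | 28.2% | 100 | 55 | |
| Size | < 2 cm | 309 | 56.2% | 198 | 111 | 0.56 |
| | ≥ 2 cm | 241 | 43.8% | 161 | 80 | |
| HER2* | + | 110 | 20.0% | 73 | 37 | 0.88 |
| | - | 440 | 80.0% | 286 | 154 | |
| Grade | 1 | 98 | 17.8% | 63 | 35 | 0.45 |
| | 2 | 182 | 33.1% | 113 | 69 | |
| | 3 | 270 | 49.1% | 183 | 87 | |
| Published Dataset^ | Ivshina | 137 | 24.9% | 89 | 48 | 1 |
| | Loi | 42 | 7.6% | 28 | 14 | |
| | NKI | 141 | 25.6% | 92 | 49 | |
| | UNC | 33 | 6.0% | 22 | 11 | |
| | Wang | 197 | 35.8% | 128 | 69 | |
| Platform | Affymetrix | 376 | 68.4% | 245 | 131 | 0.99 |
| | Agilent | 174 | 31.6% | 114 | 60 | |
| Subtype (PAM50) | Luminal A | 156 | 28.4% | 98 | 58 | 0.92 |
| | Luminal B | 131 | 23.8% | 85 | 46 | |
| | HER2-enriched | 83 | 15.1% | 56 | 27 | |
| | Basal-like | 106 | 19.3% | 72 | 34 | |
| | Normal Breast-like | 74 | 13.5% | 48 | 26 | |
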
*HER2 status is based on ERBB2 mRNA levels. P-values were calculated using a Chi-square test.
^Compiled from Ivshina et al., 2006; Loi et al., 2007; van de Vijver et al., 2002; Wang et al., 2005; http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15393.
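
The footnote states that the P-values come from a Chi-square test comparing the training and testing columns. The following is a minimal sketch of such a test in plain Python, not the authors' code: the source does not name the software or say whether a continuity correction was used, so the function names `chi2_sf` and `chi_square_test` and the choice to apply Yates's correction to 2x2 tables by default are assumptions made for illustration. The upper-tail probability uses closed forms that hold only for 1 or an even number of degrees of freedom, which covers every row of the table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or even df only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        # exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("closed form implemented only for df = 1 or even df")


def chi_square_test(table, yates=None):
    """Pearson chi-square test of independence on a list-of-rows count table.

    Assumption: Yates's continuity correction is applied when the table is
    2x2 (df = 1) unless `yates` is set explicitly.
    Returns (statistic, degrees of freedom, p-value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if yates is None:
        yates = df == 1
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff * diff / expected
    return stat, df, chi2_sf(stat, df)


# Counts from the table, as [training, testing] per row.
er = [[259, 136], [100, 55]]               # ER + / -
grade = [[63, 35], [113, 69], [183, 87]]   # Grade 1 / 2 / 3

print(round(chi_square_test(er)[2], 2))     # about 0.89
print(round(chi_square_test(grade)[2], 2))  # about 0.45
```

Under these assumptions the ER and Grade rows come out at the tabulated 0.89 and 0.45 after rounding. Passing `yates=False` for the ER row gives a noticeably smaller P-value, which suggests, but does not confirm, that a continuity correction was used for the 2x2 comparisons.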