Table 1 Characteristics of the Combined Dataset.

From: Building prognostic models for breast cancer patients using clinical variables and hundreds of gene expression signatures

   

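| Characteristics |  | Total N | % | Training (~2/3), N | Testing (~1/3), N | P-value* |
|---|---|---|---|---|---|---|
| Subjects |  | 550 |  | 359 | 191 | - |
| ER | + | 395 | 71.8% | 259 | 136 | 0.89 |
|  | - | 155 | 28.2% | 100 | 55 |  |
| Size | < 2 cm | 309 | 56.2% | 198 | 111 | 0.56 |
|  | ≥ 2 cm | 241 | 43.8% | 161 | 80 |  |
| HER2* | + | 110 | 20.0% | 73 | 37 | 0.88 |
|  | - | 440 | 80.0% | 286 | 154 |  |
| Grade | 1 | 98 | 17.8% | 63 | 35 | 0.45 |
|  | 2 | 182 | 33.1% | 113 | 69 |  |
|  | 3 | 270 | 49.1% | 183 | 87 |  |
| Published Dataset^ | Ivshina | 137 | 24.9% | 89 | 48 | 1 |
|  | Loi | 42 | 7.6% | 28 | 14 |  |
|  | NKI | 141 | 25.6% | 92 | 49 |  |
|  | UNC | 33 | 6.0% | 22 | 11 |  |
|  | Wang | 197 | 35.8% | 128 | 69 |  |
| Platform | Affymetrix | 376 | 68.4% | 245 | 131 | 0.99 |
|  | Agilent | 174 | 31.6% | 114 | 60 |  |
| Subtype (PAM50) | Luminal A | 156 | 28.4% | 98 | 58 | 0.92 |
|  | Luminal B | 131 | 23.8% | 85 | 46 |  |
|  | HER2-enriched | 83 | 15.1% | 56 | 27 |  |
|  | Basal-like | 106 | 19.3% | 72 | 34 |  |
|  | Normal Breast-like | 74 | 13.5% | 48 | 26 |  |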

 
  1. *HER2 status is based on ERBB2 mRNA levels. P-values were calculated using a chi-square test comparing the training and testing sets.
  2. ^Compiled from Ivshina et al., 2006; Loi et al., 2007; van de Vijver et al., 2002; Wang et al., 2005; http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15393.
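
As an illustration of the chi-square comparison mentioned in the first footnote, the following is a minimal sketch for a two-level characteristic, using the ER row counts from the table (training 259 and 100, testing 136 and 55). The function name `chi_square_2x2` is hypothetical, and the use of Yates' continuity correction is an assumption: the table does not say whether a correction was applied, although with it the ER row gives a P-value that rounds to the reported 0.89.

```python
import math


def chi_square_2x2(table, yates=True):
    """Pearson chi-square test of independence for a 2x2 table.

    `table` is [[a, b], [c, d]]; returns (statistic, p_value).
    Assumption: Yates' continuity correction is applied by default;
    the source table does not state which variant was used.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Shrink each deviation by at most 0.5, never below zero.
                diff -= min(0.5, diff)
            chi2 += diff ** 2 / expected

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: ER+ and ER-; columns: training N and testing N (from Table 1).
er_counts = [[259, 136], [100, 55]]
statistic, p = chi_square_2x2(er_counts)
print(f"chi-square = {statistic:.4f}, P = {p:.2f}")  # P = 0.89
```

The sketch covers only two-level characteristics (one degree of freedom); rows with more levels, such as Grade or Subtype, would need the general chi-square distribution, for example from a statistics library.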