Human breast cancer associated fibroblasts exhibit subtype specific gene expression profiles

Background: Breast cancer is a heterogeneous disease for which prognosis and treatment strategies are largely governed by the receptor status (estrogen, progesterone and Her2) of the tumor cells. Gene expression profiling of whole breast tumors further stratifies breast cancer into several molecular subtypes which also co-segregate with the receptor status of the tumor cells. We postulated that cancer associated fibroblasts (CAFs) within the tumor stroma may exhibit subtype specific gene expression profiles and thus contribute to the biology of the disease in a subtype specific manner. Several studies have reported gene expression profile differences between CAFs and normal breast fibroblasts but in none of these studies were the results stratified based on tumor subtypes.
Methods: To address whether gene expression in breast cancer associated fibroblasts varies between breast cancer subtypes, we compared the gene expression profiles of early passage primary CAFs isolated from twenty human breast cancer samples representing three main subtypes: seven ER+, seven triple negative (TNBC) and six Her2+.
Results: We observed significant expression differences between CAFs derived from Her2+ breast cancer and CAFs from TNBC and ER+ cancers, particularly in pathways associated with cytoskeleton and integrin signaling. In the case of Her2+ breast cancer, the signaling pathways found to be selectively up-regulated in CAFs likely contribute to the enhanced migration of breast cancer cells in transwell assays and may contribute to the unfavorable prognosis of Her2+ breast cancer.
Conclusions: These data demonstrate that in addition to the distinct molecular profiles that characterize the neoplastic cells, CAF gene expression is also differentially regulated in distinct subtypes of breast cancer.


Background
Gene expression profiling of whole breast tumors has stratified breast cancer into several molecular subtypes that largely correlate with the expression status of three receptors in the tumor cells, namely estrogen (ER), progesterone (PR), and Her2-neu (Her2) [1,2]. The most common breast cancer subtype expresses either ER or PR but lacks Her2 expression. Breast cancers that do not express any of the 3 receptors, known as triple negative breast cancer (TNBC), and those that express Her2 (Her2+) are less common, comprising approximately 15% and 25% of all breast cancers respectively. Her2+ and TNBC have less favorable prognosis compared to ER + cancers [3,4]. How cancer cells acquire a specific molecular phenotype is uncertain. It has been postulated recently that the tumor stroma and the cancer cells may co-evolve to support the selection or enrichment of a specific cancer subtype [5].
Much of the earlier gene expression profile analyses of breast cancer were performed using RNA extracted from tumor samples comprised of at least 50% of tumor cells, with the tumor stromal cells being a minor but important component. As tumor cell survival and tumor progression are dependent on the tumor microenvironment, elucidating the symbiotic relationship between neoplastic cells and stromal cells is crucial to further our understanding of the pathogenesis of the disease [5][6][7][8]. This interdependency is reinforced by the recent identification of a stroma-derived gene signature that correlates with prognosis suggesting that the tumor stroma contributes significantly to the invasive and metastatic potential of tumor cells [9]. A unique breast cancer stroma signature has also been observed in women of African American descent compared to European American descent [10], while a stromal gene signature has been reported to predict response to chemotherapy [11]. These observations support the suggestion that intrinsic heterogeneities between the tumor stroma may correlate with patient-specific characteristics, prognosis, therapeutic response, and, perhaps, tumor subtypes. However, breast cancer subtype-specific differences have not yet been reported for the tumor stromal cells even though multiple studies have shown that the gene expression profiles of breast cancer associated fibroblasts (CAFs) are distinctly different from their normal counterparts. None of these prior studies had stratified their results based on tumor subtypes [12][13][14][15][16].
In this study, we isolated CAFs from twenty primary breast cancer samples representing three main subtypes (ER+ (n = 7), TNBC (n = 7), Her2+ (n = 6)) and performed gene expression profile analyses on RNA isolated from these early passage CAFs. Subtype-specific gene expression profile differences were observed that distinguished CAFs derived from Her2+ cancers from those derived from TNBC and ER+ cancers. Several genes, e.g. ITGA3, ITGA5, CFL1, and RHOA, that were found to be selectively up-regulated in CAFs derived from Her2+ but not ER+ or TNBC breast cancers are known to be involved in pathways associated with integrin and RhoA signaling, suggesting that CAFs may contribute to the invasiveness of Her2+ breast cancer [17]. Migration of the breast cancer cell line T47D was significantly enhanced by CAFs derived from Her2+ breast cancer compared with CAFs derived from ER+ or TNBC. Our findings suggest that CAFs might contribute to the biology of the disease in a subtype-specific manner. Our findings are also consistent with the recently proposed tumor-stroma co-evolution hypothesis [5].

Patients and clinical characteristics of study cohort
Women with primary operable breast cancer undergoing breast surgery at the Hospital of the University of Pennsylvania were asked to participate in our tissue banking protocol approved by the institutional review board. Informed consent was obtained from all participants. Our study cohort included 20 women diagnosed with breast cancer between 2008 and 2011. Breast tumors were stratified into three subgroups according to receptor expression determined by immunohistochemistry (IHC) as described previously [18]: 1) ER+ denotes breast cancer which expresses either ER or PR and lacks Her2 expression (n = 7); 2) TNBC denotes breast cancer that lacks expression of ER, PR, and Her2 (n = 7); and 3) Her2+ group (n = 6) denotes breast cancer which expresses Her2 as determined by IHC and/or fluorescence in situ hybridization with (n = 1) or without expression of ER or PR (n = 5). All data collection and analyses were adherent to Institutional Review Board approved protocols. Clinical characteristics, including age at diagnosis, race, histology, tumor size, tumor grade, and number of involved (+) axilla nodes were compared. Pair-wise comparison was done using two-tail t-test for age and tumor size, and Fisher's exact test for race (Caucasian vs. African-American), histology, tumor grade (II vs. III) and number of (+) axilla nodes (none vs. one or more).
Figure 1 qRT-PCR validation. qPCR was used to validate microarray results for 6 genes found to be significantly different in either Her2+ vs ER+, Her2+ vs TNBC or ER+ vs TNBC comparison in microarrays data. Expression for arrays and qPCR were normalized separately over average value across absolute expression for Her2+, ER+ and TNBC groups. Error bars represent standard error of mean for the group.
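The categorical comparisons described above (for example tumor grade II vs. III between two subtypes) reduce to a 2×2 table. The following is a minimal sketch of a two-sided Fisher's exact test using only the Python standard library; the function name and the example counts are hypothetical and are not the study data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts (not from the paper): rows = two subtypes,
# columns = grade II vs. grade III
p_value = fisher_exact_two_sided(5, 2, 1, 5)
```

With groups of six or seven tumors, an exact test of this kind is the usual choice because the large-sample approximation behind a chi-square test is unreliable at such counts.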

Tissue dissociation and cell culture
After our surgical pathologists completed gross examination and inking of the tumor specimen, fresh tumor tissue was taken from the center of the tumor without interfering with margin assessment as determined by the pathologists. The tissues were stored in ice cold DMEM/F12 medium supplemented with 10% fetal bovine serum (FBS), penicillin and streptomycin. The fresh tumor tissue was kept on ice at 4°C until processing, which took place within 6 hours of excision. If the tumor tissue weighed less than 0.5 gram (n = 5) (TB160-TB165), the tissue was mechanically dissociated by mincing with scalpel and scissors to 1-2 mm³ pieces in a 10 cm tissue culture plate. Fibroblast growth medium (DMEM supplemented with 10% FBS, penicillin and streptomycin) was then added. After several days, outgrowth of spindle shaped cells was observed. Tissue debris and non-adherent cells were removed and the medium was changed between days 2 and 4. For tissues (n = 14) weighing more than 0.5 gram (TB71-TB148), the tissue was minced as described above and then enzymatically dissociated in tissue digestion buffer containing collagenase I (Worthington), hyaluronidase (Sigma), and collagenase IV (Worthington) at 1 mg/ml of each enzyme in DMEM/F12 medium, at a 1:5 ratio of tumor to buffer (wt/vol), on a gyrating platform at 37°C for 30 min. The digestion was quenched by addition of fibroblast growth medium and filtered through a 70 μm cell strainer. Cells were pelleted at 1500 rpm for 10 min. Tissue debris and non-adherent cells were removed during the medium change between days 2 and 4. By 10-14 days, near confluent adherent spindle shaped cells were harvested using 0.25% trypsin in versene, washed and replated in fresh fibroblast growth medium. Medium was changed every 4-7 days. CAFs from early passages (passage 2-3) were harvested and the cell pellet was stored in RNAlater (Applied Biosystems) at −80°C until RNA was isolated.

RNA purification and microarrays
RNA purification was carried out using TRI Reagent® (Molecular Research Center) according to the manufacturer's recommendations. RNA quality was determined using the Bioanalyzer (Agilent). Only samples with RIN values > 7.5 were used for further studies. Equal amounts (400 ng) of total RNA were amplified as recommended by Illumina and hybridized to the HumanHT-12 v4 human whole genome bead arrays. Illumina BeadStudio v.3.0 software was used to export expression levels and detection p-values for each probe of each sample. Quality control of each array was performed using the median Spearman correlation computed against all other arrays. Arrays whose median correlation differed from the global correlation by more than 8 absolute deviations were marked as outliers and not used for further analysis (resulting in the removal of one TNBC sample, TB147 (Table 1)). The remaining 19 arrays were then quantile-normalized between each other and filtered to remove non-informative probes (probes with a detection p-value > 0.05 in all samples). Between-batch normalization was performed using the Distance Weighted Discrimination (DWD) approach [19] using 4 samples replicated in the 2 microarray batches. Average expression between replicates was used for data analysis. The data were submitted to the GEO database (http://www.ncbi.nlm.nih.gov/geo/) and are available under accession number GSE37614.
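The array quality-control rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the "global correlation" is taken to be the median of the per-array median correlations, and "absolute deviations" is taken to mean the median absolute deviation (MAD). The function names and the input layout (a dict mapping array name to probe intensities) are hypothetical.

```python
from statistics import median

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks
    return pearson(ranks(x), ranks(y))

def flag_outlier_arrays(arrays, n_dev=8.0):
    """arrays: dict of array name -> list of probe intensities (same probe order)."""
    names = list(arrays)
    # Median correlation of each array against all other arrays
    med_corr = {
        a: median(spearman(arrays[a], arrays[b]) for b in names if b != a)
        for a in names
    }
    # Assumption: global correlation = median of the per-array medians
    center = median(med_corr.values())
    # Assumption: "absolute deviation" = median absolute deviation
    mad = median(abs(v - center) for v in med_corr.values())
    return [a for a in names if abs(med_corr[a] - center) > n_dev * mad]
```

A rank-based correlation is a reasonable choice at this stage because the arrays have not yet been normalized, so intensity scales may differ between arrays while probe rankings should still agree.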

Flow cytometry analysis
Adherent early passage CAFs were harvested with 0.05% trypsin/versene, washed in standard FACS buffer containing (5 μl/test) Fc blocking antibodies as
List of samples divided into two batches (b1 and b2) including two samples from each subtype as an independent validation (testing) set as indicated.

Independent validation
We randomly selected two samples from each of the Her2+, ER+ and TNBC subtypes as an independent validation set (testing set, Table 1). One sample which was unique in its subtype classification, in that the CAF was derived from a Her2+ and ER+ breast cancer (TB148, Additional file 1: Table S1), was also added to the testing set in order to show how it would be classified based only on its gene expression profile. The training set used to select the genes that distinguish the 3 CAF subtypes included 3 Her2+, 5 ER+ and 4 TNBC samples and was analyzed with one way ANOVA to identify a list of significant genes, with p-value < 0.05 used as the significance threshold. Expression patterns of the significant genes were used for Principal Component Analysis. Projection of training and testing set samples on the first two principal components was used to visualize the relationship between samples.
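The validation logic (principal components are fitted on the training samples only, and testing samples are then projected onto them) can be illustrated with a small sketch. It assumes plain power iteration and computes only the first component for brevity, whereas the study plots the first two; the function names, the starting vector and the toy data are hypothetical.

```python
def first_principal_component(train, n_iter=100):
    """train: list of samples, each a list of expression values for the selected genes."""
    n, p = len(train), len(train[0])
    mean = [sum(s[j] for s in train) / n for j in range(p)]
    centered = [[s[j] - mean[j] for j in range(p)] for s in train]
    # Arbitrary non-uniform start vector; it must not be orthogonal to the component
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        # Apply X^T X to v without forming the gene-by-gene covariance matrix
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

def project(sample, mean, component):
    """Score of a sample on a component, centered with the TRAINING mean."""
    return sum((x - m) * c for x, m, c in zip(sample, mean, component))

# Toy training data (hypothetical): 3 samples x 2 genes
train_mean, pc1 = first_principal_component([[0.0, 0.0], [2.0, 1.0], [4.0, 2.0]])
test_score = project([6.0, 3.0], train_mean, pc1)
```

Centering the testing samples with the training mean, rather than their own, is what keeps the testing set independent of the fitted components.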

Differentially expressed genes
After the validation, a final list of significant genes differentially expressed between three classes of samples (Her2+, ER + and TNBC) was determined by using one way ANOVA on the full set of samples, except for the one Her2+/ER + sample (TB148). False discovery rate (FDR) was determined according to published protocol [22]. Significance for genes between each pair of groups was determined by Tukey post-hoc test. P-value <0.05 was set as a significance threshold.
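For a single gene, the one-way ANOVA used above compares the variation between the three subtype means with the variation within subtypes. A minimal sketch of the F statistic follows; the function name and the example values are hypothetical, and converting F to a p-value requires the F distribution (available, for example, through `scipy.stats.f_oneway`), which is not reimplemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of expression values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical log-expression values for one gene in three subtypes
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

A significant F only says that at least one subtype differs; the pairwise Tukey post-hoc test mentioned above is what attributes the difference to a specific pair of groups.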

Gene enrichment analysis
Identification of biological functions and pathways overrepresented in any gene list was done using DAVID [23] and Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Redwood City, CA). DAVID results were restricted to gene ontology (GO) terms, KEGG, and BIOCARTA pathways and Swiss-Prot keyword enrichments and filtered to satisfy FDR <5% and fold enrichment >2 criteria. Significance of IPA results was defined by Benjamini-Hochberg corrected for multiple testing p-value < 0.05.
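The Benjamini-Hochberg correction named above for the IPA results can be sketched in a few lines. This is a generic illustration of the procedure, not the software's internal implementation; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical pathway p-values
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]
```

A pathway would then be reported when its adjusted value falls below 0.05, matching the threshold stated above.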

Heatmap
Heatmap was generated for a list of the 44 significant genes (with a fold change > 2) that distinguish Her2+ CAFs from both ER + and TNBC derived CAFs. Genes were hierarchically clustered using Spearman correlation distance and complete linkage. Heatmap color intensities were proportional to a value calculated as a ratio between the gene expression in a single sample and the geometric mean expression of the gene across all samples.
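The value behind each heatmap cell, as described above, is the ratio of a gene's expression in one sample to that gene's geometric mean across all samples. A minimal sketch, with a hypothetical function name and made-up values:

```python
from math import exp, log

def heatmap_ratios(expression):
    """expression: positive expression values of one gene across all samples."""
    # Geometric mean computed on the log scale for numerical stability
    geo_mean = exp(sum(log(v) for v in expression) / len(expression))
    return [v / geo_mean for v in expression]

# Hypothetical expression values of one gene in three samples
ratios = heatmap_ratios([1.0, 4.0, 16.0])
```

Using the geometric rather than the arithmetic mean makes the ratios symmetric on a log scale, so a two-fold increase and a two-fold decrease sit at equal distances from the gene's center.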

qPCR validation
Expression of six genes, ITGA3, ITGA5, OXTR, WNT5B, BCAR1 and FZD1, as well as 3 endogenous controls (ec) RPL19, TBP and UBA5 were assessed by qRT-PCR in triplicates. Median Ct values for each gene were used for ΔΔCt analysis, where ΔCt was calculated against average Ct of the three endogenous controls and ΔΔCt calculated as difference between average ΔCt values of compared groups. Final fold change between a pair of groups was calculated as $2^{\Delta\Delta\mathrm{Ct}}$. Significance of the difference between two groups was tested by two-tail t-test on ΔCt values. For comparison with expression values from microarrays, absolute expression values $E$ corrected for loading bias were calculated for each gene $G$ as follows: $E = AE_{G} / \left(AE_{ec} / \mathrm{avg}(AE_{ec})\right)$, where absolute expression $AE_{G} = 2^{40 - \mathrm{Ct}}$, $AE_{ec}$ is an average AE between three endogenous controls and $\mathrm{avg}(AE_{ec})$ is an average of $AE_{ec}$ taken across all samples. Expression values were then normalized for microarray and qRT-PCR data separately over three group average absolute expression values.
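The calculations described above can be written out as a short sketch. Two points are assumptions rather than statements from the text: ΔΔCt is taken as the reference-group mean ΔCt minus the target-group mean ΔCt, so that 2 raised to ΔΔCt is above 1 when the target group expresses more, and all function names and Ct values are hypothetical.

```python
from statistics import mean

def delta_ct(ct_gene, ct_controls):
    """Delta Ct of a gene against the average Ct of the endogenous controls."""
    return ct_gene - mean(ct_controls)

def fold_change(dct_target, dct_reference):
    # Assumed sign convention: reference minus target, so higher expression
    # in the target group (lower delta Ct) gives a fold change above 1
    ddct = mean(dct_reference) - mean(dct_target)
    return 2 ** ddct

def corrected_expression(ct_gene, ct_controls_per_sample):
    """E = AE_G / (AE_ec / avg(AE_ec)) per sample, with AE = 2 ** (40 - Ct)."""
    ae_gene = [2 ** (40 - ct) for ct in ct_gene]
    # AE_ec: average absolute expression of the controls within each sample
    ae_ec = [mean(2 ** (40 - c) for c in cts) for cts in ct_controls_per_sample]
    avg_ae_ec = mean(ae_ec)  # average of AE_ec across all samples
    return [g / (e / avg_ae_ec) for g, e in zip(ae_gene, ae_ec)]

# Hypothetical values: two samples, three endogenous controls each
dct = delta_ct(25.0, [20.0, 21.0, 22.0])
fc = fold_change([4.0, 4.0], [6.0, 6.0])
e_values = corrected_expression([30.0, 30.0], [[20.0, 20.0, 20.0], [21.0, 21.0, 21.0]])
```

In the second function the sample with the lower control signal has its gene value scaled up, which is the loading-bias correction the text describes.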

Transwell migration assay
The migration of T47D (ATCC), a breast cancer cell line known to have low migratory properties [24], was evaluated in the presence or absence of CAFs derived from ER+, TNBC, and Her2+ breast cancer using a transwell assay. CAFs (1×10^4 cells) from each of the three subtypes were seeded in 100 μl of DMEM containing 1% serum in the lower well of a Transwell chamber (Costar, Inc.) with 8 μm pore size polycarbonate filters and left to attach for 90 min. As a control, medium containing no CAFs was placed in the lower well. T47D cells (1×10^4 cells) were then seeded onto the upper chamber in 1% serum medium. Transwell chambers were incubated for 48 hours at 37°C and 5% CO2. Membranes were stained with DAPI (Invitrogen) for 15 min, rinsed with PBS and fixed with 10% buffered formalin (Fisher Scientific, SF100-20) for 15 min before imaging. The number of T47D cells that migrated onto the underside of the membrane was counted in 5 fields using a Nikon TE2000 inverted microscope at 10× magnification and plotted. Statistical evaluation was performed using GraphPad Prism (GraphPad Software, Inc.).

Isolation of CAFs from fresh human breast cancer samples
The clinical characteristics of the study cohort are summarized in Table 2. Detailed clinical characteristics of each tumor are provided in Additional file 1: Table S1. No significant differences were noted among the three subgroups, except for tumor grade (Table 2). The morphology of CAFs isolated from the 3 different breast cancer subtypes was similar (Figure 1). Further phenotypic characterization using flow cytometry analysis demonstrated that >95% of these cells expressed fibroblast activation protein (FAP), a previously identified marker of cancer associated fibroblasts [25][26][27][28]. Moreover, >99% of the cells were negative for the epithelial cell adhesion molecule (EpCAM), a breast cancer epithelial cell surface marker [12]; CD31, also known as platelet endothelial cell adhesion molecule (PECAM-1), an endothelial cell marker; and CD45, a pan-leukocyte marker (Figure 2, lower panel). Moreover, these CAFs uniformly expressed vimentin and collagen by immunohistochemistry (data not shown).

Gene expression profile analyses of CAFs derived from TNBC, ER+ and Her2+ breast cancer
RNA isolated from the early passage CAFs was assayed for gene expression, and the samples were randomly assigned to two sample sets, namely training and testing sets (Table 1), to perform independent validation. Using one-way ANOVA on the training set (4 TNBC samples, 5 ER+ samples and 3 Her2+ samples), we identified 782 genes that were differentially expressed between TNBC, ER+ and Her2+ samples (p-value < 0.05). In order to visualize the relationships between the sample types, we performed unsupervised Principal Component Analysis using the 782 significant genes (Figure 2A). This type of plot reflects the similarities and differences between all samples in relation to the 782 significant genes. It should be noted that the first principal component plotted on the X axis accounts for 49% of the variation in the data and indicates that there are significant differences between the CAFs derived from the Her2+ cancers and both the TNBC and ER+ breast cancers, as these samples are equally separated from the Her2+ samples along the X axis. The second principal component plotted on the Y axis accounts for only 14% of the gene expression variation between all samples. It captures putative differences between the ER+ and TNBC samples and indicates that the expression profiles are much more similar between these two subtypes. We then determined whether the training set principal components could also distinguish the new Her2+, ER+ and TNBC patient samples, thus validating our initial observations. Figure 2A shows the separation of the 12 samples representing the 3 original sample types in the training set that we used to select the significant genes that defined this separation. Figure 2B confirms that these genes also identify the subtype differences in new samples analyzed as an independent validation set, which included two new Her2+ samples, two new ER+ samples and two new TNBC samples.
The new Her2+ samples clearly cluster with the Her2+ samples in the training set while the new ER+ and TNBC samples once again cluster with the ER+ and TNBC training set samples. Although the ER+ and TNBC derived CAFs appear to self-segregate along the 2nd principal component in the training set (Figure 2A), no significant differences in gene expression were detected between the ER+ and TNBC CAFs in the testing set (Figure 2B). This indicates that there is a high degree of gene expression similarity in the CAFs associated with the ER+ and TNBC cancer subtypes. It should also be noted that new sample TB148, which is both Her2+ and ER+, co-segregates with the Her2+ samples, which were all ER- (Figure 2B), indicating the presence of a gene expression profile more similar to the Her2+ CAFs than to the ER+ CAF sample group. This indicates a dominance of the Her2+ CAF gene expression signature over the ER+ CAF signature.

Genes
We also combined the expression data for all samples (except for the Her2+/ER + TB148) to take advantage of the larger sample size and ran one way ANOVA to define a final list of significant genes differentially expressed between Her2+, ER + and TNBC in the larger data set. We found 1829 differentially expressed genes with p-value < 0.05 and estimated false discovery rate of 28%. When the relationships between the different CAF subtypes were reassessed using Principal Component Analysis with the new gene set, we found the same cancer subtype specific differences as demonstrated on training subset (Figure 2A).
The numbers of significant genes identified by pairwise comparisons (Tukey post-hoc test) between the three classes of patient samples, i.e. Her2+ vs ER+, Her2+ vs. TNBC and ER+ vs TNBC samples, are presented in the Venn diagram in Figure 3. These results quantify the visual interpretation of Principal Component Analysis, demonstrating that while 1,800 genes were significantly differentially expressed between Her2+ and either ER+ or TNBC, only 118 genes were significantly different between ER+ and TNBC derived CAFs. Further studies with an increased number of samples for ER+ and TNBC derived CAFs will be required to identify genes that can discriminate those 2 classes, if they exist. A gene expression heat map for the 44 most changed unique genes (fold change > 2) which were common to the Her2+ vs ER+ and Her2+ vs TNBC comparisons is shown in Figure 4.

Functions and pathways over-represented in the list of genes that distinguish Her2+ from ER + and TNBC CAFs
We compared the two significant gene lists for Her2+ vs ER + and Her2+ vs TNBC to identify functions or pathways that might be over-represented among the differentially expressed genes. Results with DAVID software analyses [23] are shown in Additional file 2: Table S2 for the Her2+ vs ER + 1253 significant genes, and in Additional file 3: Table S3 for Her2+ vs TNBC 1035 significant genes. Enrichment of nine functional categories associated with cytoskeleton and extracellular matrix were found to be significant in both comparisons.
Ingenuity pathway analysis was done for a list of 615 genes common between Her2+ vs ER+ and Her2+ vs. TNBC comparisons. A list of significantly enriched canonical pathways is presented in Table 3. Pathways involving extracellular matrix/integrin signaling were found to be significantly up-regulated in CAFs derived from Her2+ cancer, further supporting the DAVID results. It should be noted that 92% (61 of the 66 unique) of the genes associated with the ingenuity pathways are upregulated in Her2+, supporting the hypothesis that those pathways are more active in CAFs derived from Her2+ breast cancer as compared to those derived from the ER+ and TNBC breast cancers.
Table 3 Canonical pathways upregulated in Her2+ compared to ER+ and TNBC samples

Q-RT-PCR validation of individual gene expression data in CAFs
To confirm differential gene expression levels in the three breast cancer subtypes, Her2+, ER+ and TNBC, we selected 6 genes (ITGA3, ITGA5, OXTR, WNT5B, BCAR1, FZD1) with significantly different levels of expression based on our microarray studies and validated their expression levels by qRT-PCR. Fold changes in expression based on the arrays ranged from 1.5 fold to 6.9 fold. Five of the 6 genes that were found to be expressed at higher levels in the Her2+ samples were also significantly different in the Her2+/ER+ qRT-PCR comparison; and 4 of those 5 genes that were significantly different in the Her2+/TNBC array comparison were also significantly different by qRT-PCR comparison (Figure 5 and Additional file 4: Table S4). Expression ratios by qRT-PCR were highly consistent with array values and overall somewhat higher by qRT-PCR, as expected. One gene, FZD1, which was expressed at lower levels in CAFs derived from Her2+ breast cancer by array analyses, was also significantly lower by qRT-PCR in the Her2+/TNBC comparison but was not significantly different in the ER+/TNBC comparison (P = 0.2), although fold change values were similar by qRT-PCR (TNBC/ER+ = 1.33 for microarrays and 1.39 for qPCR).

Her2 CAFs enhanced the migratory phenotype of breast cancer cells in vitro
To explore whether CAFs derived from various breast cancer subtypes can differentially enhance the migratory phenotype of breast cancer cells, we performed in vitro transwell assays comparing the migration of breast cancer cells cultured in the presence or absence of CAFs isolated from ER+, Her2+ and TNBC. The number of T47D cells that migrated onto the membrane surface facing the lower chamber was counted. Results were analyzed by the Kruskal-Wallis test. The level of statistical significance was taken as P < 0.05. As our gene expression profile results predicted, CAFs derived from Her2+ breast cancer significantly enhanced the migration of T47D (Figure 6).
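The Kruskal-Wallis test used above ranks all migrated-cell counts together and asks whether the rank sums differ between conditions. A minimal sketch of the H statistic with tie correction follows; the function name and the counts are hypothetical, and obtaining a p-value requires comparing H with a chi-square distribution with (number of groups − 1) degrees of freedom, which is not shown.

```python
def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups of counts."""
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        i = j + 1
    h = 12 / (n * (n + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    # Divide by the tie-correction factor (equals 1 when there are no ties)
    return h / (1 - tie_term / (n ** 3 - n))

# Hypothetical migrated-cell counts per field for three conditions
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

A rank-based test suits cell counts from a handful of microscope fields, where normality cannot be assumed.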

Discussion
Robust evidence is now available that underscores the role of CAFs in tumor progression [8,[28][29][30][31][32][33]. Previous gene expression profile analyses comparing CAFs and fibroblasts derived from matched normal adjacent breast tissues have demonstrated significant differences between the CAF and their normal counterparts but, to the best of our knowledge, no prior studies have addressed whether CAFs derived from various breast cancer subtypes harbor subtype specific gene expression signatures.
In this study we demonstrate for the first time that CAFs from several breast cancer subtypes exhibit subtype-specific gene expression profiles. Specifically, we show that the gene expression profile of CAFs derived from Her2+ breast cancers is significantly different from that of CAFs derived from ER+ or TNBC breast cancers. Heterogeneity among fibroblasts has been described in various organ sites including lung, skin, sclera and orbit [34]. Furthermore, Sugimoto and coworkers demonstrated, using immunohistochemical analyses, that the expression of various fibroblast markers is heterogeneous within the tumor stroma in mouse breast and pancreatic tumor models [35]. Several studies have generated gene expression profiles from breast cancer-associated fibroblasts but none of these studies have stratified their results based on tumor subtypes. Work by Allinen and coworkers evaluated gene expression profiles of breast cancer stromal cells which were isolated by negatively selecting out epithelial cells, lymphocytes and endothelial cells [12]. Work described by Singer et al. compared gene expression profiles of stromal fibroblasts derived from 10 invasive breast cancers with stromal fibroblasts derived from normal breast tissues of 10 women undergoing breast reduction surgery [16]. Their results demonstrated increased expression of tumor promotion-associated genes in the pooled CAFs. Work by Bauer et al. (2010) evaluated gene expression profiles of fibroblasts derived from 6 matched breast cancers and adjacent normal breast tissues [13] and found distinct differences between CAFs and normal fibroblasts, specifically in genes related to paracrine or intracellular signaling, transcriptional regulation, extracellular matrix and cell adhesion/migration. However, none of the above studies was designed to test subtype specific differences in CAFs, owing to their relatively small sample sizes. In addition, when tumor subtype data were reported, the less common breast cancer subtypes, i.e., Her2+ or TNBC cancer, were underrepresented.
Our results showed that CAFs derived from Her2+ breast cancers significantly up-regulated pathways associated with actin cytoskeleton and integrin signaling (Table 3). Integrins mediate cell attachment with extracellular matrix (ECM) to provide traction necessary for cell motility and invasion. These upregulated signaling pathways may have contributed to the elevated migratory phenotype of breast cancer cells (T47D) in our in vitro transwell assays (Figure 1).
The extracellular matrix and integrins collaborate to regulate gene expression associated with cell growth, differentiation and survival, all of which are deregulated during cancer progression and metastasis. A recent study using a three-dimensional squamous cell carcinoma (SCC)/fibroblast co-culture model elegantly demonstrated the role of three genes, integrin α3, integrin α5 and Rho, in promoting a fibroblast-led collective invasion of SCC cells into the extracellular matrix [17]. Interestingly, all three genes were significantly up-regulated in CAFs derived from Her2+ breast cancer, with integrin signaling as the second most enriched pathway (Table 3). Moreover, many of the genes and pathways downstream of integrin signaling are also significantly upregulated in Her2+ CAFs. These include focal adhesion kinase (FAK), Rac and Rho signaling pathways as well as several members of the mitogen-activated protein kinases (MAPKs), further underscoring the importance of integrin signaling in CAFs. In addition to the well-established role of integrins in migration and invasion, integrins can also regulate cell proliferation, including mammary gland proliferation [36], through integrin-linked kinase (ILK) [37], which was also noted to be significantly upregulated in Her2+ derived CAFs. These characteristic differences in CAFs derived from Her2+ breast cancer may contribute to the aggressiveness of this particular breast cancer subtype, which is known to have an increased propensity for local and distant recurrence [3]. In addition, the sites of distant metastasis appear to differ according to breast cancer subtype, with Her2+ breast cancer having a higher rate of brain, liver, and lung metastases than ER+ breast cancer [38]. The role of CAFs in contributing to a subtype-specific tropism for the various distant metastatic sites is unknown.
Gene expression profile differences between CAFs derived from ER + and TNBC breast cancer were less pronounced and we were unable to confirm them with independent validation set using the limited sample numbers ( Figure 2B). While it is possible that true differences may exist among these two subtypes, a larger number of samples would be required to find those differences with an acceptable false discovery rate.

Conclusions
Our results show that subtype specific changes exist in CAFs derived from breast cancer. In the case of Her2+ breast cancer, a more aggressive breast cancer subtype with known increased risk of local and distant recurrence, CAFs may augment the invasive properties of the tumor cells via pathways associated with cytoskeleton and integrin signaling. Our findings also provide molecular evidence supporting a recently proposed tumor-stroma co-evolution hypothesis, which suggests that the tumor microenvironment, e.g. CAFs, may adopt specific changes to optimize the survival/propagation of a specific tumor cell type [5]. Whether these programmatic differences in CAFs result from epigenetic changes or from heterogeneity within the CAF population, i.e. the proportion of resident fibroblasts vs. recruited fibroblasts or fibroblasts derived from epithelial mesenchymal transition, is unknown. In addition, whether CAFs contribute to tumor progression in a subtype specific manner is unknown. How CAFs and other components of the tumor microenvironment drive or are being driven by the tumor cells to promote the propagation and maintenance of a specific tumor subtype will be the subject of future work.

Additional files
Additional file 1: Table S1. Clinical Characteristics of Study Cohort.
Additional file 2: Table S2. Annotation categories enriched in the list of genes significantly differentially expressed in Her2+ compared to ER+ samples as determined by DAVID software. Cat=category, Term=enriched annotation term, Enr=enrichment, TN=enrichment of the Term in Her2+ vs. TNBC comparison, Sens=sensitivity in a form K/N(P%), where K=number of genes in the list, N=total known number of genes, P=K/N in percentage. P=Fisher exact p-value for enrichment, FDR=false discovery rate, ↑ = number of genes upregulated in Her2+, ↓ = number of genes downregulated in Her2+, SP.KW = SwissProt keyword, KEGG=KEGG pathway, GO=gene ontology, BP=biological process, FM=molecular function, CC=cellular component.
Additional file 3: Table S3. Annotation categories enriched in the list of genes significantly differentially expressed in Her2+ compared to TNBC samples as determined by DAVID software. Cat=category, Term=enriched annotation term, Enr=enrichment, ER+=enrichment of the Term in Her2+ vs. ER+ comparison, Sens=sensitivity in a form K/N(P%), where K=number of genes in the list, N=total known number of genes, P=K/N in percentage. P=Fisher exact p-value for enrichment, FDR=false discovery rate, ↑ = number of genes upregulated in Her2+, ↓ = number of genes downregulated in Her2+, SP.KW = SwissProt keyword, GO=gene ontology, BP=biological process, FM=molecular function, CC=cellular component.
Additional file 4: Table S4. Fold changes and p-values obtained by qRT-PCR validation experiment for 6 genes found to be significantly different in either Her2+ vs ER+, Her2+ vs TNBC or ER+ vs TNBC comparison in microarrays data. FC=fold change, P=significance by t-test. Visual comparison of expression values between microarrays and qRT-PCR are presented in Figure 6.