This study confirms that a population-based cohort study such as NOWAC provides the opportunity to use high-throughput technology (e.g., microarray analysis) to explore biologic variation in gene expression related to both endogenous and exogenous sex hormones.
According to the gene-wise analysis, hormone concentrations did not show a profound influence on gene expression. This result is not surprising, given the low variability that is present in a study group representing the general postmenopausal population. Conversely, all categories of HT use produced differentially expressed genes when compared with non-users. This finding is attributable to the wider range of hormone concentrations between the groups in this analysis. Intake of exogenous E2, particularly by systemic administration, increases endogenous plasma E2 and suppresses plasma FSH toward premenopausal levels. Apart from the direct hormonal effects, a probable cause is the supply of synthetic medical substances (e.g., tibolone, progestogens and their metabolites) to the blood. The overlap between the gene sets shown in Figure 2 was probably caused by the overlap of subjects; all of the women using systemic E2 alone are included among the women using E2 or E2/P systemic, who are in turn included among the HT users. These overlapping genes also seem to be the most stable, as they remain significant even when the composition of the sample group is changed. The minimal information on gene function for the two E2 gene sets in DAVID could be due to the nature of gene-wise analysis, which assumes that genes are expressed independently of each other.
The gene set enrichment analysis showed a fair amount of overlap between P4 and E2, a plausible result considering the positive correlation between the two hormones (r = 0.43, p < 0.01). Among the 58 subjects present in both the E2 and the P4 analyses, 49 were concordantly in the low or high group for both hormones. Hence, it may be difficult to disentangle gene expression associated with E2 and P4. Still, there are some differences. For example, the "oestrogen up-regulated" gene set (Frasor/KEGG) was only significant for E2, and although the total "oestrogen regulated" gene set (Frasor/KEGG) was significant for both E2 and P4, there was no overlap between the core genes up-regulated in the high group. In general, we found a much larger overlap of core genes up-regulated in the low than in the high group for the gene sets that were significant for both hormones.
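As a small illustration of the concordance figure above, the proportion can be computed directly from the counts reported in the text. This is only a sketch; the variable names are illustrative and are not taken from the study's analysis code.

```python
# Counts taken from the text above; variable names are illustrative only.
n_both = 58        # subjects present in both the E2 and the P4 analyses
n_concordant = 49  # subjects in the same (low or high) group for both hormones

concordance = n_concordant / n_both   # fraction of concordant subjects
discordant = n_both - n_concordant    # subjects grouped differently for E2 and P4

print(f"Concordant: {concordance:.1%}; discordant subjects: {discordant}")
```

With roughly 85% of the shared subjects falling in the same group for both hormones, the high/low comparisons for E2 and P4 are largely comparisons of the same women, which is why the two signatures are hard to separate.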
The "E2 or E2/P systemic" gene set turned out to be a more reliable oestrogen signature than the "E2 alone" gene set, probably because of the inclusion of oral high-dose E2 users (n = 7) and/or the generally larger group of users in the "E2 or E2/P systemic" category. Interestingly, as opposed to most of the other significant gene sets, the majority of the genes in this gene set were up-regulated in the high E2 group. There was a high, although not complete, concordance in the direction of gene expression between endogenous and exogenous E2 for this gene set. Opposing directions for some genes may have been due to the progestogen content in several of the products in this HT category, or possibly differential feedback mechanisms between endogenous and exogenous hormones. Further research may reveal the functions and regulation of these core genes.
In addition to the core genes, interesting single genes in the "E2 or E2/P systemic" gene set included PNRC2, a coactivator of nuclear receptors such as the ESRRs (oestrogen related receptors), and CSMD2, which has been previously found to be differentially expressed in HT users. Among the remaining genes approaching, but not quite reaching, core gene status (Figure 4) are also VPS37A (No. 8), RNF139 (No. 9) and CNOT7 (No. 12), indicating that these overlapping genes from Figure 2 are worthy of further research into their association with sex hormones. Noteworthy in the "E2 alone" gene set is the TSC22D1 gene, whose protein product may play a role in resistance toward tamoxifen treatment in breast cancer patients.
The tibolone and thyroxine gene sets did not meet our significance criteria for any of the hormones. One might have expected some association with FSH, but the number of users in these two categories was probably too small to generate reliably specific expression sets.
None of the gene sets were differentially expressed between high and low levels of FSH, SHBG or T. Compared with the wide variety of target tissues and the acknowledged effects of steroid hormones, FSH and SHBG would be expected to have a more limited association with gene expression. The biological effect of FSH is essentially the stimulation of gonadal E2 and P4 synthesis, and in postmenopausal women FSH has lost its gonadotropic potency. Although it has been suggested that SHBG possesses some signalling properties, it is mainly a transport protein. Given also the moderate variation in FSH and SHBG levels across the study population, a difference in gene expression might be difficult to detect. Testosterone is not a major hormone in women. Although it is a potent steroid, the differences in gene expression relative to low levels of T are probably not detectable in a setting with high background variability.
Seven gene sets related to immune responses or cells active in the immune system were differentially expressed between the high and low E2 concentration groups. Additionally, two gene sets associated with exercise (stress response and inflammatory response) and the proto-oncogene gene set could be viewed as immune system related. Sex hormones have been found to influence the immune system through steroid receptors in white blood cells. In general, female sex hormones are viewed as suppressors of the immune response. It has been shown that plasma levels of both interleukin 6 (IL6) and interleukin 2 (IL2) increase after menopause (i.e., with decreasing levels of E2) and that HT opposes this effect. Although neither IL6 nor IL2 was among the 16 185 probes in our data set, the higher expression of the respective receptors, IL6R and IL2R, at low E2 concentrations is consistent with a suppressive effect of E2. Other interesting core genes include the heat shock proteins (HSPs) in the "stress response from exercise" gene set. The HSPs function as intracellular chaperones for other proteins (integrity and folding), and some have been found to play a role in the rapid non-genomic effects of steroid hormones, which is interesting in light of the rapid responses seen in these genes following exercise. FOS is a high-influence core gene for both E2 and P4. In fact, all of the FOS-containing gene sets were differentially expressed. However, in contrast to the findings of Frasor et al., FOS was up-regulated in the low E2 group, together with EPB41L3 and AP1G1. By contrast, CXCL12, the steroid 21-hydroxylase CYP21A2 and PDZK1 were congruously up-regulated in the high E2 group. According to Kendall et al., FOS is up-regulated by oestrogen deprivation, which supports our results, while SGK3 and TAGLN are down-regulated, which contradicts our results.
These contradictions in gene expression direction may arise from methodological differences or from regulatory and feedback mechanisms similar to the above-mentioned discordance for the "E2 or E2/P systemic" gene set.
Although the differential expression of the tibolone gene set lacked statistical significance, the network mapping suggested that further research is warranted. Interesting single genes included COMT, a central enzyme in the metabolism of oestrogens, and SOAT1, an intracellular protein that forms cholesterol esters from cholesterol, thereby possibly contributing to atherosclerotic plaques. SOAT1 was up-regulated among tibolone users, in accord with the known increased risk of stroke associated with tibolone use. A larger data set would contain a larger group of tibolone users and provide a more solid basis for finding tibolone-associated genes.
Strengths and limitations
The NOWAC study subjects were randomly drawn from the Central Population Register and are representative of the population in which future microarray-based diagnostic and/or prognostic tests for breast cancer will be applied. Our ability to detect subtle effects in a dataset with a high degree of random variation is reassuring.
Among the limitations of this study is the lack of information regarding the relative proportions of peripheral blood cell types. If differences in hormone concentrations or HT use are associated with the numbers of particular type(s) of peripheral blood cells, this may have influenced our results. Research into the influence of sex hormones on leukocyte cell count reveals conflicting results. Although the women were healthy enough to visit a physician's office, we had limited information regarding disease and immune system status beyond what could be inferred from the self-reported drug use. However, a systematic difference in disease prevalence between hormone concentration levels is unlikely.
Our FDR cut-off of <0.25 may exceed conventional limits, as FDR ≤ 0.10 is generally considered acceptable. A higher FDR can be accepted, however, at least when analysing gene sets extracted from previous publications and thereby supported by research. Also, we were not looking for single genes in the gene-wise analyses but for groups of genes that may explain known effects. For example, among the 33 genes in the "E2 or E2/P systemic" gene set, 9 had FDR ≤ 0.10, but only two of these genes were among the core genes differentially expressed between the high and low E2 groups. Hence, if we had used the FDR ≤ 0.10 cut-off, we might have overlooked this oestrogen signature.
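The effect of the two cut-offs can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg adjustment, which may differ from the FDR procedure actually used in this study, and the p-values are hypothetical, not data from NOWAC.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for eight genes (illustration only).
p_values = [0.001, 0.008, 0.02, 0.04, 0.09, 0.15, 0.30, 0.60]
q_values = benjamini_hochberg(p_values)

strict = [i for i, q in enumerate(q_values) if q <= 0.10]   # conventional cut-off
lenient = [i for i, q in enumerate(q_values) if q < 0.25]   # cut-off used here

print("FDR <= 0.10:", strict)
print("FDR <  0.25:", lenient)
```

In this toy example the stricter threshold retains a subset of the genes retained by the more lenient one, which mirrors the trade-off described above: the lenient cut-off admits more false positives but is less likely to discard genes that belong to a coherent signature.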
Our results are based on a snapshot measurement; we have only one blood sample from each woman and can infer nothing about intra-individual variation or variation over time. However, previous reports have shown low intra-individual variation in gene expression compared with inter-individual variation [42, 43].
The study design prevented an extensively standardised blood sampling protocol with regard to fasting, blood sample handling, transport, etc. However, the main source of technical variation in this data set was associated with the performance of the assay and not with pre-analytical processing.
The gene set enrichment analyses were adjusted for age and/or BMI. We found no significant differences between the compared categories with respect to fasting and smoking. However, residual confounding may have influenced the differences found between high and low concentrations of E2 and P4.
Differentially expressed genes have not been validated using an independent data set, and our results should be interpreted accordingly.