An empirical Bayes model for gene expression and methylation profiles in antiestrogen resistant breast cancer
© Jeong et al; licensee BioMed Central Ltd. 2010
Received: 22 June 2010
Accepted: 25 November 2010
Published: 25 November 2010
The nuclear transcription factor estrogen receptor alpha (ER-alpha) is the target of several antiestrogen therapeutic agents for breast cancer. However, many ER-alpha positive patients either do not respond to these treatments from the beginning or stop responding after a period of treatment. Because altered gene transcription is associated with drug resistance, and evidence is emerging for the role of DNA methylation in transcriptional regulation, understanding these relationships can facilitate the development of approaches that re-sensitize breast cancer cells to treatment by restoring DNA methylation patterns.
We constructed a hierarchical empirical Bayes model to investigate the simultaneous change of gene expression and promoter DNA methylation profiles among wild type (WT) and OHT/ICI resistant MCF7 breast cancer cell lines.
We found that, compared with the WT cell lines, almost all of the genes in OHT or ICI resistant cell lines either show no methylation change or are hypomethylated. Moreover, the correlations between gene expression and methylation are quite heterogeneous across genes, suggesting the involvement of other factors in regulating transcription. Analysis of our results in combination with H3K4me2 data on OHT resistant cell lines suggests a clear interplay between DNA methylation and H3K4me2 in the regulation of gene expression. Among hypomethylated genes with altered gene expression, most (~80%) are up-regulated, consistent with the current view of the relationship between promoter methylation and gene expression.
We developed an empirical Bayes model to study the association between DNA methylation in the promoter region and gene expression. Our approach generates both global (across all genes) and local (individual gene) views of the interplay. It provides important insight for future efforts to develop therapeutic agents that re-sensitize breast cancer cells to treatment.
The term epigenetics in general refers to heritable patterns of gene expression that are mechanistically regulated through processes other than alteration of the primary DNA sequence [1, 2]. Epigenetics has implications both for our understanding of gene regulation in complex organisms such as mammals and for clinical investigation of various diseases such as cancer [3, 4]. It is now clear that epigenetic events can occur at both the DNA level (i.e. DNA methylation) and the chromatin level (i.e. histone modifications), resulting in an intricate process of interactions that ultimately leads to the alteration of gene expression [5–7].
DNA methylation is a process that adds a methyl group to the cytosine ring via a covalent bond, using S-adenosyl-methionine as the methyl donor and DNA methyltransferases (DNMTs) as the catalytic enzymes. In mammals, DNA methylation is most common on cytosines that precede a guanosine (the CpG dinucleotide). Two features characterize the distribution of CpG dinucleotides in the genome. First, the overall frequency of CpG dinucleotides is substantially lower than one would expect from probabilistic calculations, which is likely due to a depletion process induced by methylation over time. Second, the distribution of CpG dinucleotides in the genome is highly asymmetric, with a high concentration in DNA segments 200 bp to several kb in length called "CpG islands", which reside in the promoter region and first exon of approximately 60% of genes. A striking feature that distinguishes CpG islands from CpG dinucleotides elsewhere is that under normal conditions, CpG islands generally lack DNA methylation, whereas other CpG dinucleotides are typically methylated (approximately 80%). While the relationship between CpG island methylation and gene silencing is well established, the mechanisms underlying this phenomenon are less clear but are thought to include physical blocking of transcription factor binding [9, 10] and/or recruitment of transcriptional repressors to the methylated sites.
A more complete understanding of DNA methylation in carcinogenesis is beginning to emerge. A general observation is that the level and pattern of DNA methylation in cancer cells are the opposite of those in their normal counterparts. The cancer methylome is characterized by global hypomethylation of DNA, which is linked primarily to repeated DNA sequences becoming hypomethylated. Hypomethylation may contribute to carcinogenesis by promoting tumor formation or progression in a number of possible ways, including affecting transposable element activation, DNA/chromosomal rearrangements, tumor suppressor gene or oncogene copy number, and/or altered chromosome conformation. In contrast to normal cells, increased methylation of CpG islands is a common occurrence in cancer, and is associated with epigenetic silencing during all phases of the cancer process, including tumor initiation, progression and drug resistance. Aberrant CpG island methylation is associated with silencing of genes involved in control of the cell cycle, apoptosis and drug sensitivity, as well as tumor suppressor genes.
Although the above phenomena are well documented in all cancers and recognized as playing an important role in almost every aspect of carcinogenesis, the mechanistic nature of the relationship between methylation and regulation of gene expression remains incompletely understood, including the heterogeneity of the relationship among genes, the interaction of methylation at different sites and the involvement of other epigenetic events.
In the clinical setting, a critical issue for cancer treatment is acquired drug resistance, where patients initially respond to chemotherapy but cease to respond after repeated exposure to the same drug. Epigenetic alterations, such as DNA methylation, are likely to play an important role in acquired drug resistance, as suggested by several studies [12–15], though much work is yet to be done to gain a clear insight into this phenomenon. Based on our experience in studies of hormone-therapy resistance in breast cancer, antiestrogen resistance is accompanied by dramatic alterations in the expression level of many genes, and alteration of DNA methylation may be one of the causes.
In this article, we focused on understanding the association between CpG island methylation and gene expression in breast cancer. In particular, we attempted to gain a better understanding of differences in DNA methylation and gene expression between hormone-therapy-sensitive and -resistant cell lines. We considered two breast cancer cell lines that are resistant to tamoxifen and fulvestrant, respectively. These are two clinically important therapeutic agents that target estrogen receptor alpha (ER-alpha), a nuclear receptor that primarily mediates genomic regulation of gene transcription and non-genomic activation of various kinase pathways. It is well known that ER-alpha is a key protein implicated in the majority of breast cancers. Although both tamoxifen and fulvestrant are antagonists of ER-alpha, their mechanisms of action differ markedly. Tamoxifen competes with E2 (the ligand that stimulates ER-alpha), blocking E2 binding to ER-alpha. In spite of this antagonistic action, tamoxifen-bound ER-alpha is capable of regulating gene transcription through genomic/non-genomic actions. Fulvestrant, on the other hand, directly inhibits the process through which ER-alpha executes its genomic regulation function, rapidly inducing cytoplasmic aggregation and ER-alpha degradation.
Because their mechanisms of action differ, the transition to a resistant state under constant exposure to these agents likely involves both similar and distinct molecular alterations. The aim of this study is to identify, at both the individual gene level and the genome level, the regulation status of DNA methylation and gene expression by comparing drug-resistant cell lines to drug-sensitive cell lines. This study provides important insight into the search for potential targets for epigenetic therapy to re-sensitize tumor cells to hormone therapy or chemotherapy. Toward this goal, we developed an empirical Bayes statistical model to integrate gene expression and DNA methylation data. Advantages of such a model include (i) consideration of probe-to-probe variation, (ii) easily interpretable confidence in the detections and (iii) straightforward false discovery rate (FDR) control or estimation [18, 19].
The Human Genome U133A 2.0 Array was used for gene expression analysis. We restricted our analysis to probes with at least two "present" calls among four replicates. Differential methylation hybridization (DMH) was done using customized 60-mer oligonucleotide microarrays, which contain ~44,000 CpG-rich fragments from ~12,000 promoters of defined genes. Microarray Analysis Suite (MAS) version 5.0 was used for preprocessing. Experimental details were described previously. The data discussed in this paper have been deposited in NCBI's Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series accession numbers GSE5840 (gene expression) and GSE25519 (methylation).
Thus, the posterior distribution of β_i | D_i is N(K, K*), where K and K* denote the posterior mean and covariance, respectively. Inference is based on the posterior distribution above.
More details about statistical modeling are provided in Additional file 2.
The Expectation-Maximization (EM) algorithm [23, 24] is widely used to obtain maximum likelihood estimates when there are unobserved variables. The EM algorithm alternates between two steps, Expectation and Maximization, until convergence. In the E-step, the expectation of the complete-data log likelihood, conditional on the data and the current values of the parameters, is calculated. In the M-step, the parameters are updated to the values that maximize the expectation from the E-step. Here we briefly describe the EM algorithm as applied to our case. More details about the E-step and M-step are given in Additional file 2.
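In the E-step, we compute Q(θ; θ^(k-1)), the expectation of the complete-data log likelihood function given the data and the current parameter value θ^(k-1), where θ is the parameter vector.
In the M-step, we update θ to the values that maximize the target function Q(θ; θ^(k-1)) given in the E-step.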
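As a rough illustration of how the two steps alternate, the sketch below runs EM on a deliberately simplified one-way hierarchical normal model. It is not the model of this article, whose specification is given in Additional file 2. The toy model (gene-level means drawn from N(m, tau2), replicate measurements drawn from N(mean, sigma2)), the function name `em_hierarchical_normal`, and all parameter names are assumptions introduced only for this example.

```python
import numpy as np

def em_hierarchical_normal(y, n_iter=200):
    """EM for a toy model (an assumption, not the article's model):
         mu_i ~ N(m, tau2),   y_ij | mu_i ~ N(mu_i, sigma2).
    y is a (genes x replicates) array. Returns the estimates of
    (m, tau2, sigma2) plus the posterior mean and variance of each mu_i."""
    n_genes, n_rep = y.shape
    ybar = y.mean(axis=1)
    # Crude starting values; any positive variances would do.
    m, tau2, sigma2 = ybar.mean(), ybar.var() + 1e-6, y.var() + 1e-6

    def posterior(m, tau2, sigma2):
        # Normal-normal conjugacy: posterior of the latent mu_i given y_i.
        v = 1.0 / (1.0 / tau2 + n_rep / sigma2)
        b = v * (m / tau2 + n_rep * ybar / sigma2)
        return b, v

    for _ in range(n_iter):
        # E-step: posterior moments of the latent mu_i at current parameters.
        b, v = posterior(m, tau2, sigma2)
        # M-step: maximize the expected complete-data log likelihood.
        m = b.mean()
        tau2 = np.mean((b - m) ** 2) + v
        sigma2 = np.mean((y - b[:, None]) ** 2) + v

    b, v = posterior(m, tau2, sigma2)
    return m, tau2, sigma2, b, v
```

The posterior means `b` returned at the end are shrunken toward the global mean `m`, which is the usual empirical Bayes behavior: the hyperparameters are estimated from all genes and then plugged into each gene's posterior.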
Inference on Relationship between Gene Expression and Methylation
which can be easily calculated through a linear transformation of (μ_i1, μ_i2, η_i1, η_i2)^t.
To characterize the correlation of gene expression and DNA methylation for each gene, we first divide the two-dimensional sample space of the expression and methylation differences into nine categories by applying two thresholds to each of the expression and methylation dimensions. The nine categories represent the combinations of three levels of alteration in gene expression and DNA methylation: up-regulation, no change and down-regulation. For instance, the north-east region corresponds to "up-regulation in both expression and DNA methylation". The thresholds are chosen to be ± C * σ, where σ is the standard deviation, across all genes, of the posterior mean of the expression or methylation difference. In our application, we chose C = 1.5. We then calculate for each gene the posterior probability of each of the nine regions, which characterizes the correlation of gene expression and DNA methylation for that gene in a probabilistic manner. Based on these probabilities, we assign each gene to one of the nine categories. See the Results section for details.
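The nine-region calculation can be sketched as follows, assuming each gene's joint posterior for the (expression, methylation) difference is bivariate normal with a known mean vector and covariance matrix. The function names, the Monte Carlo approximation of the region probabilities, and the label "HR" for hypermethylation are assumptions made for this example; the labels UP, DN, NG, HO and NM follow the category names used in the Results.

```python
import numpy as np

EXPR_LEVELS = ("DN", "NG", "UP")   # down-regulated, no change, up-regulated
METH_LEVELS = ("HO", "NM", "HR")   # hypo-, no change, hyper- ("HR" is a made-up label)

def thresholds(post_means, C=1.5):
    """post_means: (genes x 2) posterior means of the expression and
    methylation differences. Returns C * SD for each dimension."""
    return C * post_means.std(axis=0)

def region_probabilities(mean, cov, thr_expr, thr_meth, n_draws=20000, seed=0):
    """Monte Carlo estimate of the posterior probability of each of the
    3 x 3 regions (rows: expression level, columns: methylation level)."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    # digitize maps each draw to 0 (below -thr), 1 (between), 2 (above +thr)
    e = np.digitize(draws[:, 0], [-thr_expr, thr_expr])
    m = np.digitize(draws[:, 1], [-thr_meth, thr_meth])
    probs = np.zeros((3, 3))
    np.add.at(probs, (e, m), 1.0)
    return probs / n_draws

def assign_category(probs):
    """Maximum probability rule: pick the region with the largest probability."""
    e, m = np.unravel_index(np.argmax(probs), probs.shape)
    return EXPR_LEVELS[e], METH_LEVELS[m]
```

Sampling is used here only to keep the sketch short and dependency-free; an exact bivariate normal rectangle probability could be substituted without changing the assignment rule.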
Results and Discussion
Association between Gene Expression and Methylation Status
The main output from our model is a joint posterior distribution of the difference of expression and methylation levels between drug-resistant cell lines and WT for each gene. Such a distribution provides us with a probabilistic measure on the strength of the association for each gene. On the other hand, the center (or the mean) of the prior distribution of the difference of expression and methylation provides us with a global view of the association across genes. In the following discussion, up and down regulation is always in reference to the WT.
Gene assignment to nine categories
Not surprisingly, the category NG/NM contains most of the genes. The other notable feature is that very few genes are hypermethylated, similar to what was observed in the original report of the experiments. While the reason for this is not clear, one possibility is that hypomethylation and up-regulation of the corresponding gene(s) may provide the drug-resistant cells with a survival and growth advantage. Among the hypomethylated genes with altered expression, the majority are up-regulated, consistent with what is known regarding promoter methylation and gene expression. The hypomethylated, down-regulated genes suggest that mechanisms other than DNA methylation, such as repressive histone methylation, are also involved in regulating expression [25–31].
It is well known that the interplay of histone modification and DNA methylation affects transcriptional regulation [25–29]. To examine the involvement of histone methylation in the association between alterations of DNA methylation and gene expression, we analyzed in-house histone methylation data generated by chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq). The experimental protocol followed procedures reported previously [32, 33]. We focus our discussion on dimethylation of lysine 4 of histone H3 (H3K4me2) in OHT-resistant MCF7 cell lines. Our data include 26,443 genes with two replicates. We first compared H3K4me2 levels in OHT between genes with DNA hypomethylation and those without alteration in DNA methylation, i.e. the third row versus the second row in Table 1 (the maximum probability rule is used to assign genes to a category). The fold change is 1.10 (95% CI: 1.00-1.20). Therefore, DNA hypomethylation is associated with enhanced H3K4me2 in this setting. This observation is intuitively appealing, as both DNA hypomethylation and H3K4me2 have been found to be related to gene activation [27, 28, 30].
We next compared the H3K4me2 levels for (i) genes in the UP/HO category versus those in the DN/HO category and (ii) genes in the UP/NM category versus those in the DN/NM category (Table 1). The fold changes are 1.56 (95% CI: 0.89-2.72) and 1.24 (95% CI: 0.82-1.88), respectively. Although both confidence intervals include 1, the direction of these estimates is consistent with previous findings [27, 28, 30, 31] that H3K4me2 is associated with transcriptional activation. Moreover, the H3K4me2 change appears larger in genes with DNA hypomethylation than in those without DNA methylation change, suggesting an interaction between H3K4me2 and DNA methylation in regulating gene expression.
Gene Ontology Analysis
Connective Tissue Development & Function
Immune Cell Trafficking
Nervous System Development & Function
Cell-To-Cell Signaling and Interaction
Hematological System Development & Function
Cellular Assembly and Organization
Cellular Function and Maintenance
In this article, we developed an empirical Bayes model to study the association between altered DNA methylation in the promoter region and gene expression by comparing WT with OHT and ICI resistant MCF7 breast cancer cell lines. Our statistical model incorporates various sources of variation and generates a probabilistic characterization of this association. The model structure also allows a natural incorporation of other epigenetic processes to investigate their regulatory roles in acquired antiestrogen resistance.
Our models are characterized by a hierarchical structure that has been shown to be more efficient and stable than analyzing each gene separately. It also allows one to estimate the correlation between gene expression and DNA methylation at the level of individual genes. However, our models induce a marginally positive correlation between probes of the same gene, which might not hold for all genes and all microarray platforms. A small simulation study (data not shown) suggests that inference on the gene-level quantities μ_il and η_il is relatively robust when probes are actually negatively correlated. Finally, given the complexity of our model, it is not possible to use standard diagnostic tools to check model assumptions. Nevertheless, it is still possible to examine posterior quantities of the latent variables, conditional on the data and on the parameters at their estimated values. See Additional files 5, 6, 7, 8, 9 and 10 for details. Consistent with the original publication of the data, our results showed that almost all DNA methylation alterations were in the direction of reduction when resistant cell lines were compared with wild type, suggesting a homogeneous pattern of DNA methylation change during the acquisition of drug resistance. Furthermore, the OHT and ICI cell lines shared similar association patterns, yet each also held unique ones. We note that a proportion of genes are hypomethylated with down-regulation of gene expression, suggesting the involvement of other genetic and epigenetic factors in the regulation process.
Although there exists a weak correlation between DNA methylation at promoter regions and gene expression for the three cell lines studied, the correlation between methylation and gene expression alterations, when comparing OHT/ICI to WT at the global level, is essentially 0. This implies that the relation between alterations in DNA methylation at promoter regions and gene expression is gene-specific, likely due to the involvement of other factors.
This study was supported by the National Institutes of Health [U54 CA113001-06] and the Department of Defense [BC030400].
- Baylin SB, Herman JG: DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet. 2000, 16: 168-174. 10.1016/S0168-9525(99)01971-X.
- Bird A: DNA methylation patterns and epigenetic memory. Gene Dev. 2002, 16: 6-21. 10.1101/gad.947102.
- Herman JG: Hypermethylation of tumor suppressor genes in cancer. Semin Cancer Biol. 1999, 9: 359-367. 10.1006/scbi.1999.0138.
- Jones PA, Laird PW: Cancer-epigenetics comes of age. Nat Genet. 1999, 21: 163-167. 10.1038/5947.
- Herman JG, Baylin SB: Gene silencing in cancer in association with promoter hypermethylation. New Engl J Med. 2003, 349: 2042-2054. 10.1056/NEJMra023075.
- Hinshelwood RA, Clark SJ: Breast cancer epigenetics: normal human mammary epithelial cells as a model system. J Mol Med. 2008, 86: 1315-1328. 10.1007/s00109-008-0386-3.
- Yuan G, Ma P, Zhong W, Liu JS: Statistical assessment of the global regulatory role of histone acetylation in Saccharomyces cerevisiae. Gen Biol. 2006, 7: R70. 10.1186/gb-2006-7-8-r70.
- Rideout WM III, Coetzee GA, Olumi AF, Jones PA: 5-Methylcytosine as an endogenous mutagen in the human LDL receptor and p53 genes. Science. 1990, 249: 1288-1290. 10.1126/science.1697983.
- Iguchi-Ariga SM, Schaffner W: CpG methylation of the cAMP-responsive enhancer/promoter sequence TGACGTCA abolishes specific factor binding as well as transcriptional activation. Gene Dev. 1989, 3: 612-619. 10.1101/gad.3.5.612.
- Molloy PL, Watt F: DNA methylation and specific protein-DNA interactions. Philos Trans R Soc Lond B. 1990, 326: 267-275. 10.1098/rstb.1990.0010.
- Clouaire T, Stancheva I: Methyl-CpG binding proteins: specialized transcriptional repressors or structural components of chromatin?. Cell Mol Life Sci. 2008, 65: 1509-1522. 10.1007/s00018-008-7324-y.
- Li M, Balch C, Montgomery JS, Jeong M, Chung JH, Yan P, Huang TH, Kim S, Nephew KP: Integrated analysis of DNA methylation and gene expression reveals specific signaling pathways associated with platinum resistance in ovarian cancer. BMC Med Genom. 2009, 2: 34.
- Ottaviano YL, Issa JP, Parl FF, Smith HS, Baylin SB, Davidson NE: Methylation of the estrogen receptor gene CpG island marks loss of estrogen receptor expression in human breast cancer cells. Cancer Res. 1994, 54: 2552-2555.
- Das PM, Singal R: DNA methylation and cancer. J Clin Oncol. 2004, 22: 4632-4642. 10.1200/JCO.2004.07.151.
- Dwivedi RS, Qiu YY, Devine J, Mirkin BL: Role of DNA methylation in acquired drug resistance in neuroblastoma tumors. Proc Indian Nat Sci Acad. 2003, 69: 111-120.
- Anderson E: The role of estrogen and progesterone receptors in human mammary development and tumorigenesis. Breast Cancer Res. 2002, 4: 197-201. 10.1186/bcr452.
- Howell A, Abram P: Clinical development of fulvestrant ('Faslodex'). Cancer Treat Rev. 2005, 31: S3-9. 10.1016/j.ctrv.2005.08.010.
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Met. 1995, 57: 289-300.
- Newton MA, Noueiry A, Sarkar D, Ahlquist P: Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics. 2004, 5: 155-176. 10.1093/biostatistics/5.2.155.
- Fan M, Yan PS, Hartman FC, Chen L, Paik H, Oyer SL, Salisbury JD, Cheng AS, Li L, Abbosh PH, Huang TH, Nephew KP: Diverse gene expression and DNA methylation profiles correlate with differential adaptation of breast cancer cells to the antiestrogens Tamoxifen and Fulvestrant. Cancer Res. 2006, 66: 11954-11966. 10.1158/0008-5472.CAN-06-1666.
- Smith AFM: A general Bayesian linear model. J R Stat Soc B. 1973, 35: 67-75.
- Wang CS, Rutledge JJ, Gianola D: Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet Sel Evol. 1994, 26: 91-115. 10.1186/1297-9686-26-2-91.
- McLachlan GJ, Krishnan T: The EM Algorithm and Extensions. 2007, Wiley.
- Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B Met. 1977, 39: 1-38.
- Cedar H, Bergman Y: Linking DNA methylation and histone modification: patterns and paradigms. Nat Rev Genet. 2009, 10: 295-304. 10.1038/nrg2540.
- Jones PA, Baylin SB: The epigenomics of cancer. Cell. 2007, 128: 683-692. 10.1016/j.cell.2007.01.029.
- Zhang Y, Reinberg D: Transcription regulation by histone methylation: interplay between different covalent modifications of the core histone tails. Gene Dev. 2001, 15: 2343-2360. 10.1101/gad.927301.
- Strahl BD, Ohba R, Cook RG, Allis CD: Methylation of histone H3 at lysine 4 is highly conserved and correlates with transcriptionally active nuclei in tetrahymena. Proc Natl Acad Sci. 1999, 96: 14967-14972. 10.1073/pnas.96.26.14967.
- Rea S, Eisenhaber F, O'Carroll D, Strahl BD, Sun Z, Schmid M, Opravil S, Mechtler K, Ponting CP, Allis CD, Jenuwein T: Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature. 2000, 406: 593-599. 10.1038/35020506.
- Li B, Carey M, Workman J: The role of chromatin during transcription. Cell. 2007, 128: 707-719. 10.1016/j.cell.2007.01.015.
- Lee MG, Villa R, Trojer P, Norman J, Yan KP, Reinberg D, Di Croce L, Shiekhattar R: Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination. Science. 2007, 318: 447-450. 10.1126/science.1149042.
- Lee TI, Johnstone SE, Young RA: Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat Protoc. 2006, 1: 729-748. 10.1038/nprot.2006.98.
- Feng W, Liu Y, Wu J, Nephew KP, Huang TH, Li L: A Poisson mixture model to identify changes in RNA polymerase II binding quantity using high-throughput sequencing technology. BMC Genom. 2008, 9: S2-S23.
- Ji H, Liu S: Analyzing omics data using hierarchical models. Nat Biotechnol. 2010, 28: 337-340. 10.1038/nbt.1619.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1755-8794/3/55/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.