This article has Open Peer Review reports available.
A genome-wide DNA methylation study in colorectal carcinoma
© Kibriya et al; licensee BioMed Central Ltd. 2011
Received: 23 March 2011
Accepted: 23 June 2011
Published: 23 June 2011
We performed a genome-wide scan of 27,578 CpG loci covering 14,475 genes to identify differentially methylated loci (DML) in colorectal carcinoma (CRC).
We used Illumina's Infinium methylation assay in paired DNA samples extracted from 24 fresh frozen CRC tissues and their corresponding normal colon tissues from 24 consecutive diagnosed patients at a tertiary medical center.
We found a total of 627 DML in CRC covering 513 genes, of which 535 are novel DML covering 465 genes. We also validated the Illumina Infinium methylation data for top-ranking genes by non-bisulfite conversion q-PCR-based methyl profiler assay in a subset of the same samples. We also carried out integration of genome-wide copy number and expression microarray along with methylation profiling to see the functional effect of methylation. Gene Set Enrichment Analysis (GSEA) showed that among the major "gene sets" that are hypermethylated in CRC are the sets: "inhibition of adenylate cyclase activity by G-protein signaling", "Rac guanyl-nucleotide exchange factor activity", "regulation of retinoic acid receptor signaling pathway" and "estrogen receptor activity". Two-level nested cross validation showed that DML-based predictive models may offer reasonable sensitivity (around 89%), specificity (around 95%), positive predictive value (around 95%) and negative predictive value (around 89%), suggesting that these markers may have potential clinical application.
Our genome-wide methylation study in CRC clearly supports most of the previous findings; additionally we found a large number of novel DML in CRC tissue. If confirmed in future studies, these findings may lead to identification of genomic markers for potential clinical application.
Colorectal carcinoma (CRC) is one of the most common human malignancies worldwide, and an increasing incidence of CRC in Asia has been reported [1, 2]. CRC cells develop several genetic and epigenetic alterations in cancer-related genes to achieve malignant status . Promoter hypermethylation coupled with loss of heterozygosity at the same locus results in loss of gene function in many tumor cells . Alterations in DNA methylation in cancer, in general, have been known for 25 years, including hypomethylation of oncogenes and hypermethylation of tumor suppressor genes . Identification of specific DNA methylation markers would be helpful for understanding pathogenetic mechanisms, as well as for developing new therapeutic strategies. So far most of the studies addressing DNA methylation and cancer have followed the candidate gene approach, or addressed a handful of genes, or have used cell lines [6–8]. Recently Ang et al. used Illumina's GoldenGate array covering 1,505 loci and found a total of 202 loci covering 132 genes to be differentially methylated in CRC . Attempts are being made to classify CRC by methylation patterns that correlate with prognosis [10–15]. A recent study suggests that there may be a significant difference in DNA methylation profiles between cancer cell lines and original tumor tissue emphasizing the need to be cautious in using cell lines as a tumor model for molecular studies of cancer .
To our knowledge, there is no published study from Southeast Asia addressing these molecular features in CRC to better understand the underlying pathology. There is epidemiologic evidence of a link between ethnicity, certain food habits (more red meat, less vegetables) and CRC . With very few exceptions in the tribal areas, the Bangladeshi population is relatively homogenous ethnically and has a more or less similar pattern of food habit. In this study we have used Illumina's Infinium methylation assay to study the methylation status in 27,578 CpG sites covering 14,475 genes in paired CRC and surrounding healthy tissue from Bangladeshi patients with CRC to identify differentially methylated loci involved in CRC.
The samples were collected from surgically removed colonic specimens received by the department of Pathology, Bangabandhu Sheikh Mujib Medical University (BSMMU), Dhaka, Bangladesh during the period of December 2009 to March 2010. All samples were collected by one surgical pathology fellow (MR) from the operating room immediately after the surgical resection. We considered the consecutive 24 eligible cases with histologically confirmed diagnosis of CRC. Histopathology was done independently by two histopathologists (MK & MR), and there was concordance in all 24 cases. For each patient, one sample was collected from the tumor mass, and another sample was taken from the resected unaffected part of the colon about 5-10 cm away from the tumor mass. Thus, from each individual we obtained a pair of tumor and normal tissues. From each site, the tissue was collected as fresh frozen and also in RNA-stabilizing buffer. The samples were shipped on dry ice to the molecular genomics lab at The University of Chicago for subsequent DNA extraction and methylation assay. Patient characteristics are shown in Additional File 1 Table S1. For each patient, we also abstracted key demographic and clinical data and tumor characteristics from hospital medical records. Written informed consent was obtained from all participants. The research protocol was approved by the "Ethical Review Committee, Bangabandhu Sheikh Mujib Medical University", Dhaka, Bangladesh (BSMMU/2010/10096) and by the "Biological Sciences Division, University of Chicago Hospital Institutional Review Board", Chicago, IL, USA (10-264-E).
DNA extraction and quality control
DNA was extracted from fresh frozen tissue using Puregene Core kit (Qiagen, Maryland, USA). The average 260/280 ratio was 1.85. Electropherogram from Agilent BioAnalyzer with Agilent DNA 12000 chips showed the fragment size to be >10000 bp (Additional File 2 Figure S1).
RNA extraction and quality control
RNA was extracted from RNA Later preserved colonic tissue using Ribopure tissue kit (Ambion, USA, Cat# AM1924). Quality was checked on Agilent BioAnalyzer. RNA from two patients showed poor quality and that was also reflected on the microarray data.
Genome-wide methylation assay
The Infinium Methylation-27Assay was used to detect 27,578 CpG sites genome-wide, spanning 14,495 genes. The CpG sites were located within the proximal promoter regions of genes, with the distance to transcription start site (TSS) ranging from 0 to 1499 bp averaged at 389 ± 341 bp. For bisulfite conversion, EZ DNA methylation kit (Zymo Research, USA) was used. Paired samples (CRC and corresponding normal) were processed on the same chip, and all the 48 samples were processed on 4 chips (12 samples/chip) at the same time to avoid batch effect. The Illumina protocol was followed for the methylation assay. A Tecan Evo robot was used for automated sample processing and the chips were scanned on a single BeadArray reader (S-428). Control panel in the BeadStudio analytical software showed excellent intensity for staining (above 15,000), clear clustering for the hybridization probes, good target removal intensity (<400) and satisfactory bisulfite conversion.
Validation of Infinium methylation platform by q-PCR array
We used commercially available custom Methyl profiler PCR array from Qiagen-SABiosciences, which does not require bisulfite conversion. Manufacturer's protocol (http://www.sabiosciences.com/dna_methylation_custom_PCRarray.php) was followed for the assay and ABI7900 RT-PCR instrument was used to read the plates. Basically, the assay relies on differential cleavage of target sequences by two different restriction endonucleases - methyl specific (MS) and methyl dependent (MD), whose activity require either the presence or absence of methylated cytosines respectively in their recognition sequences. Details of the assay are published elsewhere . Using the standard ΔΔCt method the proportion of hypermethylated and intermediately methylated DNA was calculated  using the manufacturer supplied Excel macro spreadsheet.
High density SNP assay
We used Illumina 610 Quad BeadChip (Illumina Inc.) to obtain the copy number data from a total of 620,901markers (592,532 SNPs and 28,369 CNV probes).
Genome-wide gene expression microarray
We used HT12 v4 BeadChip (Illumina Inc.) for gene expression. The chip contains a total of 47,231 probes covering 31,335 genes. Paired samples were processed in same chip (12 samples/chip) and all 48 samples were processed in a single batch using 4 chips to minimize batch effect.
To compare the continuous variables (e.g. number of detected loci/samples or average signal intensity/average β value etc. among the two groups), we used one-way analysis of variance (ANOVA).
Genome-wide Methylation data analysis
where Yijklm represents the m-th observation on the i-th Tissue j-th Sex k-th Location l-th CaseID, μ is the common effect for the whole experiment, εijklm represents the random error present in the m-th observation on the i-th Tissue j-th Sex k-th Location l-th CaseID. The errors εijklm are assumed to be normally and independently distributed with mean 0 and standard deviation δ for all measurements.
In GO Enrichment analysis, we tested if the genes found to be differentially methylated fell into a Gene Ontology category more often than expected by chance. We used chi-square test to compare "number of significant genes from a given category/total number of significant genes" vs. "number of genes on chip in that category/total number of genes on the microarray chip". Negative log of the p-value for this test was used as the enrichment score. Therefore, a GO group with a high enrichment score represents a lead functional group. The enrichment scores were analyzed in a hierarchical visualization and in tabular form.
In addition to looking at differential methylation at the level of individual CpG loci, we also examined the differential methylation of "gene sets" using the Gene Set Enrichment Analysis (GSEA) . Given an a priori defined set of genes S (sharing the same GO category), the goal of GSEA was to determine whether the members of S were randomly distributed throughout the ranked list or primarily found at the top or bottom. Considering the fact that GSEA can look at single variable (unadjusted β-value), we also used GO-ANOVA which offers adjustments for other factors such as "person-to-person" variation, "tissue type" variation etc.
where Y represents the methylation status of a GO-category, μ is the common effect or average methylation of the GO-category, T is the tissue-to-tissue (tumor/healthy) effect, P is the patient-to-patient effect, G is the gene-to-gene effect (differential methylation of genes within the GO-category independent of tissue types), S(T*P) is the sample-to-sample effect (this is a random effect, and nested in tissue and patient) and ε represents the random error.
For the one-level cross validation, the data was first divided into 10 random partitions. In each iteration, 10% of samples were held out for testing while the remaining 90% samples were used to fit the parameters of the model. We also used a 6 × 10 two-level nested cross-validation . In the outer cross-validation, with random 1/6-th of the samples (n = 8) were held out as test samples, and the remaining 40 samples were used in an inner 10-fold cross-validation (1/10-th of these samples (n = 4) were held out at each iteration, and it was repeated 10 times) to determine the optimal predictor variables and other classifier parameters. The model that performed the best on the inner cross-validation was applied to the 8 test samples that were held-out in the outer cross-validation. This was repeated 6 times. Thus we had a total of 10 × 6 or 60 permutations for inner cross-validation and 6 for the outer cross-validation. The inner cross-validation was performed in order to select predictor variables and optimal model parameters, and the outer cross-validation was used to produce overall accuracy estimates for the classifier. Initially we tested several classification methods: (a) K-Nearest Neighbor (KNN) with Euclidean distance measure and 1-neighbor, (b) K-Nearest Neighbor (KNN) with Euclidean distance measure and 3-neighbor (c) nearest centroid with equal prior probability and (d) linear discriminent analysis with equal prior probability. Based on the results (normalized correct rate), we finally used KNN with Euclidean distance measure and 3-neighbor as the classifier, and regarding the number of variables, we tested 1 through 50 variables. For automated variable selection, we used 2-way ANOVA where tissue type and case ID# were used as ANOVA factors. One of the loci with maximum delta-β (DAB2IP) was forced into the model.
Genome-wide Copy number (CN) analysis
BeadStudio normalized intensity values were imported into PARTEK genomic suit . Intensity data from the normal tissue was used as reference for generating copy number data for each marker. Standard PCA and sample histogram were generated as part of QC. Genomic segmentation was done with a setting of minimum of 10 markers, p-value threshold of 0.001 for two neighboring regions having significantly differing means. A segment was considered as amplification if the mean CN was ≥2.5 and deletion if the mean was ≤1.5. The phenotype (CRC or normal) was tested for association with amplification/deletion status of the sample using Pearson's Chi-square test.
Genome-wide Gene Expression analysis
In BeadStudio, quantile normalization was used for the intensity data. PCA detected the arrays from the same two patients as outlier which showed poor RNA quality on Agilent BioAnalyzer. Microarray data from those two patients were excluded from the analysis. Differential gene expression analysis was done using the same mixed model multi-way ANOVA  as in case of methylation analysis. We report genes to be differentially expressed only if that shows at least 1.3 fold change in either direction at FDR 0.05. We used this 1.3 fold as cut-off based on the power calculation from our data. Given the sample size, we had 80% power to detect 90% of the truly differentially genes at 1.3 fold.
Correlation between methylation/gene expression and methylation/copy number
To investigate the effect of DNA methylation on gene expression, we used Spearman's rank to test correlation between the beta-value of a methylation locus and the log2-transformed normalized expression value of a gene within a maximum distance of 2 Kb from that methylation locus. Correlated methylation and expression data were taken from the same samples. In the same way, to see the effect of copy number on gene expression, we also used Spearman's rank to test correlation between the average copy number of a genomic segment and the log2-transformed normalized expression value of a gene overlapping with that genomic segment. Copy number and methylation data were also from the same samples.
Our study was conducted on 24 patients (17 male, 7 female) with CRC with a mean age of 45.5 years (SD 16.8) (Additional File 1 Table S1). There were a total of 27,578 loci covering 14,495 genes (average 1.9 CpG loci per gene) that were studied for methylation status per sample. On average, about 27,511 loci (95% CI 27,488 - 27,534) were detected in each sample at p < 0.05 level. A locus was said to be detected at p < 0.05 level if the mean signal intensity from multiple probes for that CpG locus was significantly higher (at the level of p < 0.05) than the negative control on the same chip. Mean number of loci detected at p < 0.05 in tumor and normal tissue was similar: 27483.88 (SD 114.75) vs. 27507.25 (SD 65.84) (p = 0.41). There was a very strong correlation (r2 = 0.9932) of the total signal intensity (methylated and unmethylated) of the 27,578 loci between the 24 normal tissues and corresponding 24 tumor tissues suggesting uniform amplification and hybridization for all samples. However, when the average β of tumor tissue samples were plotted against that of corresponding normal tissue samples, there were clearly a number of loci that were differentially methylated in CRC tissues compared to normal tissues. The data discussed in the publication have been deposited in NCBI's Gene Expression Omnibus  and will be accessible through GEO Series accession number GSE29490 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE29490).
Sources of variation in the methylation data
Differential methylation of colon tissue in male and female subjects
When we compared all 27,578 loci in females with those in males, there were a total of 568 loci differentially methylated at FDR 0.01 level. Interestingly, 551 of them (97.0%) were in the X-chromosome, 2 were in the Y-chromosome and 15 were distributed in the autosomes. This finding is explained by the X-inactivation process, in which one of the two copies of genes on the X chromosome in females is silenced. Considering this fact, we excluded all the sex chromosome markers (n = 1092, of which 1085 in X-chromosome and 7 in Y-chromosome) from subsequent analysis for differential methylation in CRC compared to normal. The significance of different sources of variation in the methylation data in autosomes only is shown in Figure 1B.
Differential methylation in colorectal carcinoma tissue compared to adjacent normal colon tissue
In a total set of 48 samples (tumor and corresponding adjacent normal tissue from 24 patients with CRC), we looked at genome-wide differential methylation in CRC tissue compared to normal tissue. There were a total of 26,486 CpG loci in the autosomes covering 13,890 genes. Here we present the analysis of β-value calculated from background normalized signal intensity. No further normalization of derived β-value was used. In the methylation module of BeadStudio, using the in-built Illumina custom model, we found a total of 875 significantly differentially methylated autosomal loci in CRC tissue compared to normal colonic tissue, of which 275 were hypomethylated (DiffScore = < -30 and delta-β = <-0.2) and 600 were hypermethylated (DiffScore > = 30 and delta β> = 0.2). Univariate and unpaired analysis was used for this.
In fact, there was also a very good overlap between the univariate analysis of BeadStudio Methylation module and the mixed model multi way ANOVA analysis. A total of 832 loci were common between the lists generated by univariate Illumina custom model (n = 875) and the list generated by mixed model multi way ANOVA analysis (n = 852) as mentioned above.
If we looked at the multiple loci near a single gene, usually all the loci showed a similar trend. For example, there were six loci for the gene ESR1, and all were hypermethylated in CRC. We also took the average β values of multiple markers from the same gene and looked for differentially methylated genes in CRC (not shown here), and the result was almost similar to what we see with probe level analysis.
GO Enrichment Analyses of the lists of differentially methylated genes in colorectal carcinoma tissue compared to adjacent normal colon tissue
Differential methylation profile at "gene set" level in colorectal carcinoma
Comparisons to colorectal carcinoma signatures from prior studies
We compared our conservative list of 513 differentially methylated genes to lists obtained by a number of previous studies. In a 2010 review paper, Kim et al.  compiled a comprehensive list of differentially methylated genes in CRC tissue and other clinical samples from patients with CRC. These were mainly candidate gene approach based studies. There were a total of 59 unique genes reported to have differential methylation in CRC tissue. Out of those 59 genes, 40 of them were also studied in our present study. In fact there were 245 loci on the chip we used that covered these 40 genes. It may be noted that 17 of those 40 genes (42.5%) were also found in our conservative list of 513 differentially methylated genes. In addition to comparing the gene lists, we also looked at the differential methylation pattern of these 245 loci covering those 40 genes in our setting. A volcano plot (Additional File 3 Figure S2A) clearly shows that most of these loci were differentially methylated, but the number actually depends on the strictness of the criteria used to define differential methylation. In fact 32 of them (80%) were significant at FDR 0.05 level without considering the delta β. Recently Ang et al. published a study on CRC using a genome wide approach using Illumina's GoldenGate methylation panel of 1505 CpG loci . They reported a total of 202 differentially methylated loci in CRC covering 132 genes . Of those 132 genes, 37 were common to our list. But again, if we look at the methylation data, then we see that in our assay there were a total of 376 loci covering those genes and 263 of those loci (70%) were also differentially methylated at FDR 0.05 in our data set (Additional File 3 Figure S2B). Therefore, it is important to take the selection criteria of a list into account while comparing the gene list. Hence, our genome-wide methylation assay not only clearly supports most of the previous findings from the literature, but in addition to that we found a large number of novel differentially methylated loci in CRC tissue compared to surrounding healthy colon tissue. The complete list of differentially methylated loci is presented as additional material (Additional File 1 Table S3) that shows the loci that have been previously reported as well as the novel loci found in present study. Kim et al.  also compiled a list of genes that were reported in the literature to be differentially methylated in other clinical samples (serum/plasma or stool) from CRC patients. The authors reported a total of 19 unique genes (there were 129 loci in the chip we used that covered 18 of these 19 genes) and interestingly 9 (50%) of those genes were also found in our conservative list of 513 differentially methylated genes in CRC. In fact, 63 of the loci covering 14 of those 18 genes (77.8%) were significantly differentially methylated at FDR 0.05 level.
In addition to looking only at CRC methylation signatures in the literature, we also compared our list to that of hypo- and hyper-methylated genes in cancer as a whole. In a 2009 review, Pogribny et al. compiled a list of 38 unique genes that are reported to be hypomethylated in different human cancers . Only four of those (ESR1, HSPE2, TCL1 and TNFRSF8) were common in our list, and we found that all four of them were found to be hypermethylated to some degree in CRC in our study. In contrast these four genes were reported to be hypomethylated in different cancers - endometrial carcinoma, prostate cancer, T-cell Lymphocytic leukemia and in Hodgkin lymphoma respectively.
In a2007 review, Esteller et al.  compiled a list of 47 genes that were reported to be hypermethylated in different human cancers. Eight of those (CDH13, CDKN2A, ESR1, TMEFF2, GATA4, SFRP1, TP73 and SOCS3) were common in our list and all were hypermethylated in CRC in our study.
DNA repair genes are known to be important for the pathogenesis of carcinoma in general. In a review, Ronen and Glickman compiled a list of 261 DNA repair genes . It was interesting to note that none of these 261 genes were common to our list of 513 genes with differential methylation in CRC.
Uses of methylation signatures
We attempted to identify some models for differentiating CRC samples from normal samples based on methylation status. The models were identified using a 2-level nested cross validation method. Though an independent sample set was unavailable for this study, this method provided a means to estimate the accuracy of the models that may be expected in an independent set of samples. However we agree that statistical model can not replace the need of validation in an independent set of real samples.
Results from two-level nested cross-validation
No. of variables in KNN model
Expected test characteristics in another set of data
We also attempted to identify methylation signatures that could differentiate histopathological findings in CRC. Although the present study was not designed to address this issue, we analyzed the 24 CRC samples using different phenotypes, including tumor stage, tumor grade, differentiation of the tumor, tumor infiltration by lymphocytes, extracellular mucin and signet ring cell. Irrespective of the histopathological diagnosis (adenocarcinoma or mucinous adenocarcinoma) and age at diagnosis (= <45 yrs vs. >45 yrs), a total of 14 loci were significantly differentially methylated in CRC of the proximal colon than those of the distal colon. Unsupervised hierarchical clustering of those loci in the CRC samples is shown in the Additional File 4 Figure S3. GO-Enrichment analysis of these 14 genes showed significant enrichment of genes related to "gut morphogenesis".
Results from q-PCR validation
Does the differential methylation status correlate with chromosomal abnormalities and differential gene expression in CRC?
There are only a few studies addressing the genome-wide methylation in colorectal carcinoma [9, 31–33]. Very recently Kim et al  (Feb 2011) and Oster et al  (Mar 2011) used the same commercially available Infinium methylation 27 arrays in CRC and identified differentially methylated sites in Korean and European population respectively. Oster's study used carcinoma and normal tissue from different individuals for methylation analysis, whereas ours used paired tumor and adjacent normal tissue from the same patient. This allowed us to eliminate inter-individual variation in our methylation analysis, which may be one reason why our study detected a larger number of differentially methylated genes in carcinoma. Kim's study compared the methylation in paired samples like ours, but they looked at gene expression in a different set of individuals. While they looked for effect of methylation on gene expression, they could not find statistically significant difference in the mRNA expression level between promoter hypermethylation group and hypomethylation group, whereas we were able to calculate correlation coefficients using paired data in every gene and found several significant correlations. The only other genome-wide methylation study in CRC addressing promoter CpG loci using commercially available array, used much lower density array (only 1505 CpG loci) .
Our study in south-east Asian population suggests that, in comparison to the normal colonic mucosa, the corresponding CRC tissue shows a large number of differentially methylated loci within the CpG islands close to the transcript start site of genes, indicating the role of DNA methylation in the pathogenesis of colorectal carcinoma. The results from our study not only confirm the findings from many previous candidate gene approach based studies, but we also report a large number of novel loci that show differential methylation in CRC.
We noticed the influence of sex on genome-wide methylation that is explained by the X-inactivation process, in which one of the two copies of genes on the X-chromosome in females is silenced. A similar finding was also recently reported by Liu et al. .
Laird et al.  has recently focused on the different statistical issues for methylation data. We applied different normalization methods and found considerable overlap between the results. Use of stringent criteria for selecting differentially methylated loci and the considerable overlap between the results from different analyses, the 2-level cross validation and finally the q-PCR validation in subset suggest that we detected the truly differentially methylated loci in CRC.
Recently Irizarry et al. used a Comprehensive High-throughput Array for Relative Methylation (CHARM) assay to show that most methylation alterations in CRC occur up to 2000 bp away from the CpG islands themselves . Because of the design of the chip used in the present study, we did not have the opportunity to look at the differential methylation at loci > 1500 bp away from the TSS. However, similar to results from Irizarry et al. , we also found that the hypomethylated loci were slightly more upstream than the hypermethylated loci.
The cross-validation results are very encouraging as a potential biomarker, but we have cross-validated only in colon tissues and not in circulating plasma DNA. In the future we would like to test the markers in an independent sample set of circulating plasma or serum DNA in CRC patients and healthy individuals. Recently He et al.  selected three methylation markers from the published literature and tested the practical use of those markers in peripheral blood sample from CRC patients. They found a sensitivity of 81% and a specificity of 90%. We had the advantage of profiling a very large number of CpG loci in paired CRC and normal colonic mucosa tissue, and our 2-level cross validation suggested that the four markers could be used as biomarkers with slightly better test characteristics.
Tanaka et al.  have recently applied an analytical strategy known as structural equation modeling to understanding methylation in CRC. Using a large database of over 800 samples, the authors were able to construct causality pathways of KRAS and BRAF mutations, as well as various phenotypes, on methylation of specific genes. This strategy was not feasible for our current study because of our smaller sample size and because we had not obtained information on KRAS and BRAF mutations. Nonetheless, it will be valuable for our planned future study with an expanded cohort.
Illumina's methylation assay has been compared to other platforms by others and has shown dependable results with the correlation ranging from 0.8 to 0.9 [32, 33, 38]. We also have validated the methylation data form Infinium methylation for 12 of the highly differentially methylated genes in our study and also found similar high correlations with Methyl Profiler assay (see Additional File 1 Table-S4 and Additional File 5 Figure S4). In another study, reproducibility tests of Infinium methylation platform was reported to have correlation greater than 0.98 between technical replicates . We are aware of the fact that Illumina's Methylation27 assay detects the methylation status of on average ~2 CpG sites per gene for most genes. However, for the genes for which there were multiple CpG loci on the array (e.g. ESR1 or DAB2IP), we found all of the loci to be differentially methylated in the same direction. We also validated Illumina's platform in the top-ranking genes by methyl profiler PCR array which is (a) not dependent on bisulfite conversion and also (b) provides an overall methylation status of the target region as opposed to single loci. This paper was focused mainly to look at DML in CRC. However, we have also explored the link between chromosomal abnormalities (copy number), methylation and gene expression. Regulation of gene expression is complex and is not dependant only on methylation status or copy number status. Using integration of molecular cytogenetics, genome-wide copy number and expression microarray profiling, Camps et al have demonstrated the effect of copy number on gene expression in CRC . To our knowledge, our study is the first one to comprehensively look at the genome-wide methylation, copy number and gene expression - all three together in primary CRC tissue. In our study, expression of a small proportion of genes was found to be correlated to methylation and another small proportion was correlated to copy number changes seen in CRC. Although methylation status of many loci could not explain the functional relevance to gene expression, these promoters methylation may be used clinically as biomarkers.
Our genome-wide methylation study in CRC clearly supports most of the previous findings from the literature, and in addition to that we found a large number of novel DML in CRC tissue, some of which may be used for clinical application. Further study is warranted to confirm these findings.
This work was supported by the National Institutes of Health grants U01 CA122171, P30 CA 014599, P42ES010349, R01CA102484, and R01CA107431.
- Soon MS, Soon A, Lin TY, Lin OS: Distribution of colon neoplasia in Chinese patients: implications for endoscopic screening strategies. Eur J Gastroenterol Hepatol. 2008, 20 (7): 642-647. 10.1097/MEG.0b013e3282f6482b.View ArticlePubMedGoogle Scholar
- Sung JJ, Lau JY, Goh KL, Leung WK: Increasing incidence of colorectal cancer in Asia: implications for screening. Lancet Oncol. 2005, 6 (11): 871-876. 10.1016/S1470-2045(05)70422-8.View ArticlePubMedGoogle Scholar
- Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med. 2004, 10 (8): 789-799. 10.1038/nm1087.View ArticlePubMedGoogle Scholar
- Jones PA, Baylin SB: The fundamental role of epigenetic events in cancer. Nat Rev Genet. 2002, 3 (6): 415-428.PubMedGoogle Scholar
- Feinberg AP, Tycko B: The history of cancer epigenetics. Nat Rev Cancer. 2004, 4 (2): 143-153. 10.1038/nrc1279.View ArticlePubMedGoogle Scholar
- Ogino S, Kawasaki T, Nosho K, Ohnishi M, Suemoto Y, Kirkner GJ, Fuchs CS: LINE-1 hypomethylation is inversely associated with microsatellite instability and CpG island methylator phenotype in colorectal cancer. Int J Cancer. 2008, 122 (12): 2767-2773. 10.1002/ijc.23470.View ArticlePubMedPubMed CentralGoogle Scholar
- Tanaka H, Deng G, Matsuzaki K, Kakar S, Kim GE, Miura S, Sleisenger MH, Kim YS: BRAF mutation, CpG island methylator phenotype and microsatellite instability occur more frequently and concordantly in mucinous than non-mucinous colorectal cancer. Int J Cancer. 2006, 118 (11): 2765-2771. 10.1002/ijc.21701.View ArticlePubMedGoogle Scholar
- Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, Faasse MA, Kang GH, Widschwendter M, Weener D, Buchanan D, et al: CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer. Nat Genet. 2006, 38 (7): 787-793. 10.1038/ng1834.View ArticlePubMedGoogle Scholar
- Ang PW, Loh M, Liem N, Lim PL, Grieu F, Vaithilingam A, Platell C, Yong WP, Iacopetta B, Soong R: Comprehensive profiling of DNA methylation in colorectal cancer reveals subgroups with distinct clinicopathological and molecular features. BMC Cancer. 2010, 10: 227-10.1186/1471-2407-10-227.View ArticlePubMedPubMed CentralGoogle Scholar
- Aoyagi H, Iida S, Uetake H, Ishikawa T, Takagi Y, Kobayashi H, Higuchi T, Yasuno M, Enomoto M, Sugihara K: Effect of classification based on combination of mutation and methylation in colorectal cancer prognosis. Oncol Rep. 2011Google Scholar
- Chang E, Park DI, Kim YJ, Kim BK, Park JH, Kim HJ, Cho YK, Sohn CI, Jeon WK, Kim BI, et al: Detection of colorectal neoplasm using promoter methylation of ITGA4, SFRP2, and p16 in stool samples: a preliminary report in Korean patients. Hepatogastroenterology. 2010, 57 (101): 720-727.PubMedGoogle Scholar
- Kaneda A, Yagi K: Two groups of DNA methylation markers to classify colorectal cancer into three epigenotypes. Cancer Sci. 2011, 102 (1): 18-24. 10.1111/j.1349-7006.2010.01712.x.View ArticlePubMedGoogle Scholar
- Ogino S, Cantor M, Kawasaki T, Brahmandam M, Kirkner GJ, Weisenberger DJ, Campan M, Laird PW, Loda M, Fuchs CS: CpG island methylator phenotype (CIMP) of colorectal cancer is best characterised by quantitative DNA methylation analysis and prospective cohort studies. Gut. 2006, 55 (7): 1000-1006. 10.1136/gut.2005.082933.View ArticlePubMedPubMed CentralGoogle Scholar
- Shima K, Nosho K, Baba Y, Cantor M, Meyerhardt JA, Giovannucci EL, Fuchs CS, Ogino S: Prognostic significance of CDKN2A (p16) promoter methylation and loss of expression in 902 colorectal cancers: Cohort study and literature review. Int J Cancer. 2011, 128 (5): 1080-1094. 10.1002/ijc.25432.View ArticlePubMedPubMed CentralGoogle Scholar
- Yagi K, Akagi K, Hayashi H, Nagae G, Tsuji S, Isagawa T, Midorikawa Y, Nishimura Y, Sakamoto H, Seto Y, et al: Three DNA methylation epigenotypes in human colorectal cancer. Clin Cancer Res. 2010, 16 (1): 21-33. 10.1158/1078-0432.CCR-09-2006.View ArticlePubMedGoogle Scholar
- Houshdaran S, Hawley S, Palmer C, Campan M, Olsen MN, Ventura AP, Knudsen BS, Drescher CW, Urban ND, Brown PO, et al: DNA methylation profiles of ovarian epithelial carcinoma tumors and cell lines. PLoS One. 2010, 5 (2): e9359-10.1371/journal.pone.0009359.View ArticlePubMedPubMed CentralGoogle Scholar
- Reedy J, Wirfalt E, Flood A, Mitrou PN, Krebs-Smith SM, Kipnis V, Midthune D, Leitzmann M, Hollenbeck A, Schatzkin A, et al: Comparing 3 dietary pattern methods--cluster analysis, factor analysis, and index analysis--With colorectal cancer risk: The NIH-AARP Diet and Health Study. Am J Epidemiol. 2010, 171 (4): 479-487. 10.1093/aje/kwp393.View ArticlePubMedGoogle Scholar
- Ordway JM, Bedell JA, Citek RW, Nunberg A, Garrido A, Kendall R, Stevens JR, Cao D, Doerge RW, Korshunova Y, et al: Comprehensive DNA methylation profiling in a human cancer genome identifies novel epigenetic targets. Carcinogenesis. 2006, 27 (12): 2409-2423. 10.1093/carcin/bgl161.View ArticlePubMedGoogle Scholar
- Downey T: Analysis of a multifactor microarray study using Partek genomics solution. Methods Enzymol. 2006, 411: 256-270.View ArticlePubMedGoogle Scholar
- Eisenhart C: The assumptions underlying the analysis of variance. Biometrics. 1947, 3 (1): 1-22. 10.2307/3001534.View ArticlePubMedGoogle Scholar
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102 (43): 15545-15550. 10.1073/pnas.0506580102.View ArticlePubMedPubMed CentralGoogle Scholar
- Tibshirani RJ, Efron B: Pre-validation and inference in microarrays. Stat Appl Genet Mol Biol. 2002, 1: Article1.PubMedGoogle Scholar
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30 (1): 207-210. 10.1093/nar/30.1.207.View ArticlePubMedPubMed CentralGoogle Scholar
- Xie D, Gore C, Liu J, Pong RC, Mason R, Hao G, Long M, Kabbani W, Yu L, Zhang H, et al: Role of DAB2IP in modulating epithelial-to-mesenchymal transition and prostate cancer metastasis. Proc Natl Acad Sci USA. 2010, 107 (6): 2485-2490. 10.1073/pnas.0908133107.View ArticlePubMedPubMed CentralGoogle Scholar
- Xie D, Gore C, Zhou J, Pong RC, Zhang H, Yu L, Vessella RL, Min W, Hsieh JT: DAB2IP coordinates both PI3K-Akt and ASK1 pathways for cell survival and apoptosis. Proc Natl Acad Sci USA. 2009, 106 (47): 19878-19883.View ArticlePubMedPubMed CentralGoogle Scholar
- Garmy-Susini B, Avraamides CJ, Schmid MC, Foubert P, Ellies LG, Barnes L, Feral C, Papayannopoulou T, Lowy A, Blair SL, et al: Integrin alpha4beta1 signaling is required for lymphangiogenesis and tumor metastasis. Cancer Res. 2010, 70 (8): 3042-3051. 10.1158/0008-5472.CAN-09-3761.View ArticlePubMedPubMed CentralGoogle Scholar
- Kim MS, Lee J, Sidransky D: DNA methylation markers in colorectal cancer. Cancer Metastasis Rev. 2010, 29 (1): 181-206. 10.1007/s10555-010-9207-6.View ArticlePubMedGoogle Scholar
- Pogribny IP, Beland FA: DNA hypomethylation in the origin and pathogenesis of human diseases. Cell Mol Life Sci. 2009, 66 (14): 2249-2261. 10.1007/s00018-009-0015-5.View ArticlePubMedGoogle Scholar
- Esteller M: Cancer epigenomics: DNA methylomes and histone-modification maps. Nat Rev Genet. 2007, 8 (4): 286-298. 10.1038/nrg2005.View ArticlePubMedGoogle Scholar
- Ronen A, Glickman BW: Human DNA repair genes. Environ Mol Mutagen. 2001, 37 (3): 241-283. 10.1002/em.1033.View ArticlePubMedGoogle Scholar
- Irizarry RA, Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K, Rongione M, Webster M, et al: The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat Genet. 2009, 41 (2): 178-186. 10.1038/ng.298.View ArticlePubMedPubMed CentralGoogle Scholar
- Kim YH, Lee HC, Kim SY, Yeom YI, Ryu KJ, Min BH, Kim DH, Son HJ, Rhee PL, Kim JJ, et al: Epigenomic Analysis of Aberrantly Methylated Genes in Colorectal Cancer Identifies Genes Commonly Affected by Epigenetic Alterations. Ann Surg Oncol. 2011Google Scholar
- Oster B, Thorsen K, Lamy P, Wojdacz TK, Hansen LL, Birkenkamp-Demtroder K, Sorensen KD, Laurberg S, Orntoft TF, Andersen CL: Identification and validation of highly frequent CpG island hypermethylation in colorectal adenomas and carcinomas. Int J Cancer. 2011Google Scholar
- Liu J, Morgan M, Hutchison K, Calhoun VD: A study of the influence of sex on genome wide methylation. PLoS One. 2010, 5 (4): e10028-10.1371/journal.pone.0010028.View ArticlePubMedPubMed CentralGoogle Scholar
- Laird PW: Principles and challenges of genome-wide DNA methylation analysis. Nat Rev Genet. 2010, 11 (3): 191-203.View ArticlePubMedGoogle Scholar
- He Q, Chen HY, Bai EQ, Luo YX, Fu RJ, He YS, Jiang J, Wang HQ: Development of a multiplex MethyLight assay for the detection of multigene methylation in human colorectal cancer. Cancer Genet Cytogenet. 2010, 202 (1): 1-10. 10.1016/j.cancergencyto.2010.05.018.View ArticlePubMedGoogle Scholar
- Tanaka N, Huttenhower C, Nosho K, Baba Y, Shima K, Quackenbush J, Haigis KM, Giovannucci E, Fuchs CS, Ogino S: Novel application of structural equation modeling to correlation structure analysis of CpG island methylation in colorectal cancer. Am J Pathol. 2010, 177 (6): 2731-2740. 10.2353/ajpath.2010.100361.View ArticlePubMedPubMed CentralGoogle Scholar
- Bock C, Tomazou EM, Brinkman AB, Muller F, Simmer F, Gu H, Jager N, Gnirke A, Stunnenberg HG, Meissner A: Quantitative comparison of genome-wide DNA methylation mapping technologies. Nat Biotechnol. 2010, 28 (10): 1106-1114. 10.1038/nbt.1681.View ArticlePubMedPubMed CentralGoogle Scholar
- Weisenberger, D. J. Comprehensive DNA methylation analysis on the Illumina Infinium assay platform (Illumina, San Diego, California, USA, 2008). [http://www.illumina.com/Documents/products/appnotes/appnote_infinium_methylation.pdf]
- Camps J, Nguyen QT, Padilla-Nash HM, Knutsen T, McNeil NE, Wangsa D, Hummon AB, Grade M, Ried T, Difilippantonio MJ: Integrative genomics reveals mechanisms of copy number alterations responsible for transcriptional deregulation in colorectal cancer. Genes Chromosomes Cancer. 2009, 48 (11): 1002-1017. 10.1002/gcc.20699.View ArticlePubMedPubMed CentralGoogle Scholar
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1755-8794/4/50/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.