In this study, we performed a genome-wide analysis of DNA copy number and gene expression changes in gastric cancer to identify genes whose expression is deregulated by altered copy number and to find potential molecular markers with biological roles in gastric carcinogenesis. Using oligo-based aCGH, gene expression microarrays, and bioinformatics methods, we identified genes that were differentially expressed in association with copy number variations. We also showed that copy number profiles differ between gastric cancer TNM stages (T1-2 vs. T3-4 and N0 vs. N1-3) and between histological subtypes (PD vs. MD), suggesting that the identified copy number regions may harbor valuable biomarkers for diagnosis and for selecting therapy modalities for different gastric cancer subtypes.
Overall, we identified recurrent copy number gains in 15 chromosomal regions and losses in 5 chromosomal regions, consistent with previously published studies [11–20]. Notably, gain at 8p11-q24 was detected at the highest frequency (70%), followed by gain at 20q11-q13 (63%). Taken together, we speculate that the identified CNVs, especially gain at 8q11-q24 and the candidate genes it contains (SULF1, PRKDC, LAPTM4B, GRINA, FAM91A1, GPR172A, PPM2C, MCM4, ENY2, RAD21, SIAHBP1, SLC25A32, PTDSS1, ATP6V1C1, INTS8, and others) (Figure 5), may play an important biological role in the pathogenesis of gastric cancer. Indeed, a detailed genomic analysis of chromosome 8q in gastroesophageal junction (GEJ) adenocarcinomas revealed that genes other than MYC (ANXA13, MTSS1, FAM84B, C8orf17, and PTK2) are involved in the 8q amplification and the pathology of GEJ adenocarcinomas.
In addition, this is the first report that the expression of MCM4, PRKDC, and UBE2V2 at 8q11.21, and of YWHAZ, ANKRD46, ZNF706, and GRHL2 at 8q22.3, is co-regulated and concordantly up-regulated in gastric cancer samples with amplification at the corresponding region. MCM4 is one of the highly conserved mini-chromosome maintenance (MCM) proteins, which are essential for the initiation of eukaryotic genome replication, and it is highly expressed in esophageal cancer and cervical squamous cell carcinoma [21, 22]. Negative DNA-PKcs (DNA-dependent protein kinase catalytic subunit, also known as PRKDC) expression has been reported in about 20% (114/564) of human gastric cancers and is associated with gastric cancer progression and poor patient survival, especially in stage I gastric cancer patients [23, 24]. In contrast, DNA-PKcs is positively expressed in 36.8% (82/223) of nasopharyngeal carcinoma tissues, where it is associated with a low 5-year overall survival rate. In our study, PRKDC was up-regulated (at least a two-fold change in gene expression level) in 64% (16/25) of gastric cancer samples. Because reports of its expression in human cancers conflict, further studies are needed to clarify the mechanism for PRKDC. hMMS2 (methyl methanesulfonate sensitive 2, S. cerevisiae, homolog of; also known as UBE2V2) has been reported to serve a redundant role in human PCNA polyubiquitination. Therefore, we speculate that these overexpressed genes located at 8q11.21 may act in concert in the pathogenesis of gastric cancer. Indeed, a recent study has also shown that genes located adjacent to EGFR at 7p11 or SMAD4 at 18q21 were closely associated with one another and may play a role in the pathogenesis of advanced gastric carcinoma. Although YWHAZ at 8q22.3 has been considered a potent antiapoptotic gene, we cannot exclude the possibility that other candidate genes are also present in the region.
Gastric cancers of different TNM stages or histological subtypes display diverse copy number aberrations. In our study, the MD type tended to be distinguished by gains of C20orf11 at 20q13.33. A higher frequency of 20q amplifications has been reported in intestinal gastric cancer. A previous study has also shown that copy number gains at 20q are significantly more frequent in cell lines derived from tumors of the well-differentiated type. Genetic divergence was also revealed between the T1-2 and T3-4 stages. We found that 4p16.1, 4p14, 5q21.1, 9q21.13, 10q22.1, and 14q24.2 showed copy number gains in the T1-2 stages and copy number losses in the T3-4 stages. Two regions, 9q22.31 and 22q12.2, both had significant losses in the T3-4 stages. In addition, 9p23 and 15q24.1 were more commonly gained in N0 and lost in N1-3 gastric cancers. This is the first detailed report of DNA copy number profiles for different gastric cancer TNM stages and histological subtypes. Taken together, these findings provide new insight into pathological classification, which may help in estimating prognosis and in personalizing therapy for different gastric cancer subtypes.
We also discovered 163 genes whose expression was deregulated in association with copy number variation. Comparison with other recent studies showed that 12 of these genes, POLR1C (6p21.1), LANCL2 (7p12.1-p11.1), CCT6A (7p12.1-p11.1), MRPS17 (7p12.1-p11.1), SMURF1 (7q21.3-q22.1), COPS6 (7q21.3-q22.1), SQLE (8q11.1-q24.3), RRBP1 (20p12.1-p11.23), SNX5 (20p12.1-p11.23), ID1 (20q11.21-q12), PI3 (20q12-q13.2), and PARD6B (20q12-q13.2), had also been reported in at least one of the previously published studies [7–10]. Novel genes with no previous reports of association with gastric cancer included SIAHBP1, ATP6V1C1, SLC25A32, ZFAND1, MCM4, XPO5, PLOD3, PSMA7, EIF3S6, TPD52, NSMCE2, MRPS18A, STK3, and MAD2L1BP. Moreover, 17 of the identified genes (CLDN4, SRI, MYC, PRKDC, SLPI, LAPTM4B, MYBL2, YWHAB, YWHAZ, MCM3, SERPINE1, SLC29A1, ID1, CDK6, EIF2C2, PTK2, and GSTA1) have previously been implicated in gastric cancer, and six of the genes (MYC, SBDS, CHCHD7, TOP1, COX6C, and CDK6) are included in the Cancer Gene Census.
Based on previous studies [30–32], we applied a 1.3-fold cut-off to select genes with altered expression. We then performed Pearson correlation analysis between copy number and expression for the 163 selected genes to further prioritize them. Of the genes analyzed, 133 (81.6%) showed statistically significant correlations between DNA copy number and gene expression (Additional file 2: Table S4). PI3 showed the largest gene expression fold change (FC = 9.8), but its Pearson correlation coefficient was only 0.18. This discrepancy indicates that selecting genes by expression fold change alone is not a robust way to identify copy number correlated genes.
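The discrepancy between fold change and correlation can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the sample values are invented toy data, the 0.3 log2-ratio threshold for calling a gain is an assumption, and the fold change is assumed here to be the mean expression in samples with a gain divided by the mean expression in samples without one. Only the 1.3-fold cut-off comes from the text.

```python
import math

FC_CUTOFF = 1.3        # fold-change cut-off stated in the text
GAIN_THRESHOLD = 0.3   # hypothetical log2-ratio threshold for calling a gain


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def fold_change(copy_number, expression, gain_threshold=GAIN_THRESHOLD):
    """Mean expression in gained samples divided by mean in the rest.

    This definition is an assumption made for illustration only.
    """
    gained = [e for c, e in zip(copy_number, expression) if c > gain_threshold]
    others = [e for c, e in zip(copy_number, expression) if c <= gain_threshold]
    if not gained or not others:
        raise ValueError("both groups must contain at least one sample")
    return (sum(gained) / len(gained)) / (sum(others) / len(others))


# Toy data for one hypothetical gene across six samples:
# aCGH log2 ratios and linear-scale expression values.
toy_copy_number = [0.0, 0.1, 0.4, 0.5, 0.9, 1.0]
toy_expression = [1.0, 1.2, 30.0, 1.0, 1.1, 0.9]

toy_fc = fold_change(toy_copy_number, toy_expression)
toy_r = pearson_r(toy_copy_number, toy_expression)

# A single outlier sample inflates the fold change well past the cut-off,
# while expression does not track copy number across the samples.
print(f"fold change = {toy_fc:.2f}, passes cut-off: {toy_fc >= FC_CUTOFF}")
print(f"Pearson r   = {toy_r:.2f}")
```

In this toy case one highly expressing sample is enough to pass the fold-change filter even though the correlation with copy number is weak, which is the pattern described for PI3. The sketch does not test the correlation for statistical significance, which would have to be done separately.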
To validate the microarray results, four genes (C20orf11, XPO5, PUF60, and PLOD3) were selected for qRT-PCR. The C20orf11 gene displayed copy number correlated overexpression in MD type gastric cancer according to both the microarray and the qRT-PCR analysis (Figure 3). To our knowledge, no previous report regarding a possible tumor association of C20orf11 has been published. Twa1 (two hybrid-associated protein 1 with RanBPM), also known as C20orf11, is well conserved through evolution and localizes to the nucleus. Interestingly, Twa1 possesses the LisH-CTLH motif, which is found in proteins involved in microtubule dynamics, cell migration, nucleokinesis, and chromosome segregation. One study indicated that both Twa1 and hMuskelin form a protein complex with RanBPM. XPO5 (Exportin-5) has been shown to be key to miRNA biogenesis and may help coordinate nuclear and cytoplasmic processing steps. Exportin-5 controls Dicer1 expression post-transcriptionally, and alterations in miRNA expression can strongly influence cellular physiology. A recent study has shown that a genetic defect in XPO5 traps pre-miRNAs in the nucleus of cancer cells, reduces miRNA processing, and diminishes miRNA-target inhibition. Importantly, the restoration of XPO5 function reverses the impaired export of pre-miRNAs and has tumor-suppressor features in a subset of cancers with microsatellite instability (MSI+). In our study, XPO5 exhibited copy number associated overexpression in gastric cancer. PUF60 (poly-U binding splicing factor 60 kDa, also known as FIR and SIAHBP1) has been reported to regulate c-myc transcription through the general transcription factor TFIIH. Another study has shown that deficiency of the glycosyltransferase activities of LH3 (lysyl hydroxylase 3, also known as PLOD3), especially in the extracellular space, causes growth arrest. Because of the limited number of samples, the correlations between copy number alteration and gene expression level measured by qRT-PCR were generally lower than those from the microarray data. Overall, the qRT-PCR analysis validated the microarray results and highlighted some interesting genes as potential target genes.