A novel DNA damage repair gene-related prognostic model for evaluating the prognosis and tumor microenvironment infiltration of esophageal squamous cell carcinoma



This study aimed to investigate the potential prognostic value of DNA damage repair genes (DDRGs) in esophageal squamous cell carcinoma (ESCC) and their relationship with immune-related characteristics.


We analyzed DDRGs of the Gene Expression Omnibus database (GSE53625). Subsequently, the GSE53625 cohort was used to construct a prognostic model based on least absolute shrinkage and selection operator regression, and Cox regression analysis was used to construct a nomogram. The immunological analysis algorithms explored the differences between the potential mechanism, tumor immune activity, and immunosuppressive genes in the high- and low-risk groups. Of the prognosis model-related DDRGs, we selected PPP2R2A for further investigation. Functional experiments were conducted to evaluate the effect on ESCC cells in vitro.


A 5-DDRG (ERCC5, POLK, PPP2R2A, TNP1 and ZNF350) prediction signature was established for ESCC, stratifying patients into two risk groups. Multivariate Cox regression analysis showed that the 5-DDRG signature was an independent predictor of overall survival. Immune cells such as CD4 T cells and monocytes displayed lower infiltration levels in the high-risk group. Additionally, the immune, ESTIMATE, and stromal scores in the high-risk group were all considerably higher than those in the low-risk group. Functionally, knockdown of PPP2R2A significantly suppressed cell proliferation, migration and invasion in two ESCC cell lines (ECA109 and TE1).


The clustered subtypes and prognostic model of DDRGs could effectively predict the prognosis and immune activity of ESCC patients.


Esophageal cancer is one of the most common malignant tumors and seriously threatens human health and well-being [1]. In Asia, esophageal squamous cell carcinoma (ESCC) is the main histologic type of esophageal cancer [2]. Although we have made great progress in multimodality therapies for ESCC, such as surgery, chemoradiotherapy, and targeted drug therapy, the unsatisfactory clinical outcomes of ESCC have not improved appreciably [3]. Immunotherapy has a promising future, and progress has been made in the combined management of ESCC patients [4]. However, resistance to immunotherapy may elicit low response rates with subsequent tumor recurrence and metastasis. Hence, exploration of the underlying biological mechanisms of tumorigenesis and identification of new targets of immunotherapies to improve the prognosis of ESCC are urgently needed.

Various environmental and endogenous risk factors, such as ionizing radiation (IR), alkylating agents, antimetabolites and other chemical factors, can trigger DNA damage [5, 6]. DNA damage repair (DDR) pathways are a network of cellular signaling pathways that recognize and repair DNA damage. The interaction of these DDR pathways can prevent gene distortion and accumulation of DNA damage and ensure the integrity of the genome [7]. Dysfunction of the DDR process promotes cell aging, apoptosis, and tumorigenesis [8, 9]. Lawrence et al. indicated that the balance between DNA damage and DNA repair ability may facilitate genome distortion in malignant cells. Distinguishing these tumor cells from normal cells can improve the treatment response of cancer [10]. Additionally, DDR gene (DDRG) mutations usually result in a high somatic mutation load in malignant cells, and these variations in turn trigger the production of tumor-specific neoantigens [11, 12]. DDRG research can broaden therapy options by exploring the characteristics of gene drivers for cancer patients in translational medicine. McLaughlin et al. found that DDRG mutations could alter inflammation-related signaling pathways, which could reshape the tumor immune status [13]. More importantly, DDRG mutations are emerging as potential biomarkers for predicting tumor prognosis and immunotherapeutic response. Song et al. revealed that DDRG mutations were indicative of a favorable prognosis in colorectal cancer [14]. Chae et al. found that a DDRG signature was predictive of patient prognosis in glioma cells [11]. In addition, studies have shown that high DDRG mutation loads are closely related to CD4+ and CD8+ tumor-infiltrating lymphocytes (TILs), and the co-mutations of homologous recombination repair and mismatch repair (HRR-MMR) in the DDR pathways are potential biomarkers for immune checkpoint inhibitor (ICI) therapy[15]. 
Although these findings have highlighted the importance of the DDRG signature for tumor prognosis and immunotherapeutic response, the prognosis and immune features of DDRGs in ESCC remain unclear.

To address this issue, we investigated the DDRG signature of ESCC patients and utilized the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases to construct a clustering subtype and risk score model. We first examined the predictive ability of the DDRG signature for ESCC patients. Then, we evaluated the relationship between the DDRG signature and immune-related characteristics. Finally, we conducted cell experiments to further verify the expression of DDRGs (ERCC5, POLK, PPP2R2A, TNP1 and ZNF350). Our results demonstrated that the DDRG signature had significant advantages in predicting prognosis and distinguishing hot tumors to advance the individualization of immunotherapy.

Materials and methods

Data acquisition

The expression profiles and clinical data of GSE53625, based on the GPL18109 platform, were obtained from the GEO database and used as the training cohort. In addition, as an external validation cohort, we extracted RNA-FPKM data and clinical data of TCGA-ESCC from the TCGA database. ESCC samples with complete clinical outcome time and status were retained in this study. DDRGs were identified from the Molecular Signatures Database (MSigDB) and previously published literature [16].

Establishment and validation of the prognostic model

The prognostic DDRGs with P < 0.05 in the GSE53625 cohort were screened by univariate Cox regression analysis using the “survminer” and “survival” R packages [17, 18]. Subsequently, the least absolute shrinkage and selection operator (LASSO) regression algorithm with 10-fold cross-validation was performed to construct a risk model using the “glmnet” R package. The risk score was determined from the LASSO regression coefficients according to the formula: \(\text{risk score} = \sum_{i=1}^{n} (\text{expression of gene}_i \times \text{coefficient of gene}_i)\), where \(n\) is the number of genes in the model. ESCC patients were divided into two risk groups (high- and low-risk groups).

We analyzed the association between the risk score and clinical characteristics, such as age, gender, grade, T stage, N stage, M stage, and TNM stage, using the “ComplexHeatmap” R package. Kaplan‒Meier curves of OS were generated between the two risk groups. Receiver operator characteristic (ROC) curve analysis (including 1-, 3-, and 5-year survival) was plotted to estimate the predictive efficacy of the risk model using the “timeROC” R package[19]. External validation was conducted in the TCGA-ESCC cohort to test the stability of the risk score application.

Independent prognostic analysis and nomogram construction

Univariate and multivariate Cox regression analyses were conducted to analyze the risk score and clinical characteristics of the GSE53625 cohort, and the P value, HR and 95% CI of each enrolled variable were displayed using the “forestplot” R package. Subsequently, the “rms” R package was used to generate a nomogram predictive of OS based on independent prognostic criteria[20]. A nomogram showed the intuitive results of the risk score. Calibration curves were drawn to evaluate the predictive accuracy of survival probabilities (including 1-, 3-, and 5-year survival).

Analysis of biological property and pathway enrichment

Utilizing the “clusterProfiler” R package, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) [21,22,23] analyses of significantly differentially expressed DDRGs were carried out to assess the biological functions and signaling mechanisms. Moreover, gene set enrichment analysis (GSEA) was conducted to determine different biological functions among the two risk groups of patients. Gene set items with normalized P < 0.05 and FDR < 0.25 were regarded as statistically significant.

Assessment of the immune microenvironment

The “tidyverse” R package was used to explore the correlation of tumor-infiltrating immune cells and immunosuppressive genes with 5 prognostic model-related DDRGs, and the results are displayed in a heatmap. Comparisons of the StromalScore, ImmuneScore, and ESTIMATEScore among the two risk groups in the training cohort were calculated using the “ESTIMATE” R package.

Cell culture and transfection

Human esophageal epithelial cells (HEEC) and ESCC cells (ECA109 and TE1) were obtained from the Scientific Research Center of the Fourth Hospital of Hebei Medical University (Shijiazhuang, China). Cells were cultured in RPMI 1640 medium (Gibco) supplemented with 10% fetal bovine serum (FBS) and placed in a 37 °C, 5% CO2 incubator. Small interfering RNA targeting PPP2R2A (si-PPP2R2A) and negative control siRNA (si-NC), supplied by RIBOBIO (Guangzhou, China), were used for ESCC cell transfection. Following the manufacturer’s instructions, Lipofectamine® 2000 (Invitrogen, USA) was used for transfection. After 48 h, the biological functions of transfected ESCC cells were evaluated.

Quantitative real-time PCR analysis

Total cell RNA was extracted using TRIzol reagent (Thermo Fisher Scientific), and a RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific) was used to reverse-transcribe the RNA to cDNA. MonAmp™ SYBR® Green qPCR Mix was used to perform qRT-PCR to measure the expression of the survival-related DDRGs in esophageal cells. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was analyzed as an internal control. The \(2^{-\Delta\Delta CT}\) method was utilized to analyze survival-related DDRG expression. The primer sequences are listed in Table 1.
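The \(2^{-\Delta\Delta CT}\) calculation can be illustrated with a minimal Python sketch. The function name and the Ct values below are hypothetical and are not taken from this study; the sketch only assumes that GAPDH serves as the reference gene and that HEEC cells (or si-NC cells) serve as the control condition.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    ct_ref_* are the Ct values of the internal control (for example GAPDH).
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized sample against the normalized control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (illustration only, not measured data).
fold_change = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(f"relative expression: {fold_change:.2f}")
```

A fold change below 1 indicates lower expression in the sample than in the control, and a value above 1 indicates higher expression.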

Table 1 The primer sequences of five DDRGs used for qRT-PCR.

Cell counting kit-8 (CCK-8) proliferation assay

The proliferation ability of ESCC cells was evaluated using the CCK-8 kit (MedChemExpress, Princeton, USA). After 48 h of transfection, transfected ECA109 and TE1 cells were inoculated into 96-well plates at a density of 4000 cells per well. At 24, 48, 72, and 96 h, the transfected cells were incubated with 10 µl CCK-8 solution for 2 h. The absorbance of each well was measured at a wavelength of 450 nm using a microplate reader.

Wound-healing assay

Transfected ECA109 and TE1 cells were seeded into a 6-well plate. When these cells reached 80% confluence, they were scratched with sterile 200-µl pipette tips. Subsequently, serum-free medium was added to the plates with living cells for 24 h, and the wound healing distance was analyzed under a microscope and photographed.

Transwell assay

A total of 2.0 × 10^5 cells were placed in the Matrigel-coated upper chamber and cultured in serum-free medium. Then, 600 µl of medium with 20% FBS was added to the lower chamber. After 24 h, the cells that had migrated through the Matrigel were fixed with paraformaldehyde and stained with crystal violet for further analysis. The stained cells on the lower surface of the filter were counted and imaged using a light microscope.

Statistical analysis

Student’s t test was employed to assess the difference between two groups with regard to ImmuneScore, StromalScore, and ESTIMATEScore. We examined the correlation between two variables using Spearman’s correlation analysis. Survival differences were determined using the Kaplan‒Meier method and compared by log-rank tests. R software (version 4.1.2) was used for all bioinformatics analyses and statistical analyses and to generate the corresponding figures. P less than 0.05 denoted statistically significant differences.
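As an illustration of the survival comparison described above, the following Python sketch implements a Kaplan‒Meier estimator and a two-group log-rank test using only the standard library. It is a simplified stand-in for the R “survival” package used in this study, not the authors' code; the function names and the toy follow-up data are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        # Observed minus expected deaths in group A at this event time.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up times (months) for two risk groups.
high_risk = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
low_risk = ([10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
chi2, p = logrank_test(*high_risk, *low_risk)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```

In practice, an established package should be preferred; the sketch is meant only to make the underlying calculation explicit.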


Identification of prognostic DDRGs

The flowchart summarizes the procedures of the present work (Fig. 1). Compared to normal samples, abnormal expression levels of DDRGs were observed in ESCC samples (Table 2). The analysis included the GSE53625 cohort and TCGA-ESCC cohort. We analyzed differentially expressed DDRGs in the GSE53625 cohort (179 ESCC vs. 179 normal samples) and used the Wilcoxon test for subsequent analysis. In total, 135 differentially expressed DDRGs were identified for investigation (|logFC|>1, FDR < 0.05). Seventy DDRGs were downregulated, and 65 were upregulated. The top 30 downregulated and upregulated DDRGs are visualized in Fig. 2A. In addition, prognostic analysis was performed for DDRGs in the GSE53625 cohort. The network diagram of prognostic DDRGs indicated that ERCC2, CCNE1, CCR5, USP3, and POLN were favorable factors for OS, and others were risk factors for OS (Fig. 2B).
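The screening thresholds above (|logFC| > 1, FDR < 0.05) can be sketched in Python as follows. The text does not state how the FDR was computed, so the Benjamini-Hochberg adjustment used here is an assumption; the gene names, logFC values, and P values are hypothetical placeholders rather than results from GSE53625.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |logFC| > fc_cutoff and FDR < fdr_cutoff."""
    fdr = benjamini_hochberg(p_values)
    selected = []
    for gene, lfc, q in zip(genes, log_fc, fdr):
        if abs(lfc) > fc_cutoff and q < fdr_cutoff:
            selected.append((gene, "up" if lfc > 0 else "down"))
    return selected


# Hypothetical inputs (illustration only).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log_fc = [1.5, -2.0, 0.5, -1.2]
p_values = [0.01, 0.04, 0.03, 0.005]
print(select_degs(genes, log_fc, p_values))
```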

Fig. 1 The flow chart of the research process

Table 2 The differentially expressed DDRGs in normal and ESCC samples
Fig. 2 The genetic landscape of DDRGs of ESCC in the GSE53625 cohort. A The differentially expressed DDRGs between ESCC and normal samples. B The gene interaction network of DDRGs in ESCC. The circle color represents each gene's role (purple circle: risk factor; green circle: favorable factor). The lines between genes indicate their association (red or pink line: positive association; blue line: negative association). DDRGs, DNA damage repair genes; ESCC, esophageal squamous cell carcinoma

Identification of the DDR-related prognosis model

Based on the GSE53625 cohort, univariate Cox regression was performed to determine DDRGs affecting survival. A total of 5 prognostic DDRGs (P < 0.05, |HR|>1) were included in the LASSO Cox regression model analysis, including ERCC5, POLK, PPP2R2A, TNP1 and ZNF350 (Additional file 1: Figure S1A–B). A risk model was established based on the expression of the 5 DDRGs and their corresponding coefficients: risk score = (0.31853) * ERCC5 + (0.28215) * POLK + (− 0.3054) * PPP2R2A + (0.1808) * TNP1 + (0.2069) * ZNF350 (Additional file 1: Figure S1C). Each patient was assigned to the high- or low-risk group according to the optimum cutoff value obtained by means of the “survminer” R package. Fifty genes and their 5 co-expressed DDRGs are depicted in a Sankey plot. The results showed that there was a positive regulatory relationship between the 50 genes and their co-expressed DDRGs (Additional file 1: Figure S1D).
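The risk model above can be written as a short Python sketch. The coefficients are those reported in the text; the expression values and the cutoff in the usage example are hypothetical, since the optimum cutoff obtained from “survminer” is not given here.

```python
# LASSO coefficients of the 5-DDRG signature, as reported in the text.
COEFFICIENTS = {
    "ERCC5": 0.31853,
    "POLK": 0.28215,
    "PPP2R2A": -0.3054,
    "TNP1": 0.1808,
    "ZNF350": 0.2069,
}


def risk_score(expression):
    """Weighted sum of gene expression values for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def assign_group(score, cutoff):
    """Label a patient as high- or low-risk relative to a chosen cutoff."""
    return "high" if score > cutoff else "low"


# Hypothetical expression values and cutoff (illustration only).
patient = {"ERCC5": 1.0, "POLK": 1.0, "PPP2R2A": 1.0, "TNP1": 1.0, "ZNF350": 1.0}
score = risk_score(patient)
print(f"risk score = {score:.5f}, group = {assign_group(score, cutoff=0.5)}")
```

Because the PPP2R2A coefficient is negative, higher PPP2R2A expression lowers the score, whereas higher expression of the other four genes raises it.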

Correlation analysis of the prognostic model with clinical outcomes

As shown in Fig. 3A–B, a higher percentage of death events was observed in the high-risk group. The heatmap showed that ERCC5, POLK, PPP2R2A, TNP1 and ZNF350 expression levels had significant correlations with risk scores (Fig. 3C). The survival rate of the high-risk patients was significantly poorer than that of the low-risk patients (P < 0.001, Fig. 3D), which is consistent with previous reports. The area under the curve (AUC) values for 1-, 2-, and 3-year prognostic prediction were 0.675, 0.677, and 0.678, respectively (Fig. 3E). There was a significant survival difference between patients in the two risk groups (P < 0.001, Fig. 3F). In addition, the results showed that the correlation between the risk score and other clinical characteristics was not statistically significant (all P > 0.05, Table 3).

Fig. 3 Generation of a DDR-related prognostic model for ESCC in the training cohort. A, B Distribution of risk scores and survival status of ESCC patients. C Heat map showing the association of risk score and 5 DDRGs of ESCC patients. D Kaplan‒Meier analysis of OS in the high- and low-risk groups. E ROC curves of the DDR-related prognostic model illustrating the prediction efficiency for 1-, 2-, and 3-year survival. F Distribution of clinicopathological features in the two risk groups. DDR DNA damage repair, ESCC Esophageal squamous cell carcinoma, DDRGs DNA damage repair genes, OS Overall survival, ROC Receiver operator characteristic

Table 3 Clinicopathological characteristics of the GSE53625 cohort in prognostic model

Verification of the prognostic model

The TCGA-ESCC cohort was analyzed as an external validation cohort to confirm the predictive values of the developed prognostic model. Based on the median risk score in the GSE53625 cohort, 80 patients in the TCGA-ESCC cohort were divided into the high-risk group (44 patients) and low-risk group (36 patients). The distribution characteristics of ESCC patients in the TCGA-ESCC cohort are shown in Fig. 4A–B. A heatmap showed that the expression level varied systematically among prognostic model-related DDRGs (Fig. 4C). Kaplan‒Meier survival curves demonstrated that the prognostic model could remarkably distinguish the clinical outcomes (P = 0.034, Fig. 4D). In the TCGA-ESCC cohort, ROC curve analysis showed that the AUCs of the 1-, 2-, and 3-year OS were 0.627, 0.792, and 0.671, respectively (Fig. 4E). In addition, the heatmap displayed the relationship between the risk score and clinical characteristics (Fig. 4F).

Fig. 4 Generation of a DDR-related prognostic model for ESCC in the validation cohort. A, B Distribution of risk scores and survival status of ESCC patients. C Heat map showing the association of risk score and 5 DDRGs of ESCC patients. D Kaplan‒Meier analysis of OS in the high- and low-risk groups. E ROC curves of the DDR-related prognostic model illustrating the prediction efficiency for 1-, 2-, and 3-year survival. F Distribution of clinicopathological features in the two risk groups. DDR DNA damage repair, ESCC Esophageal squamous cell carcinoma, DDRGs DNA damage repair genes, OS Overall survival, ROC Receiver operator characteristic

Construction of a prognostic nomogram

To evaluate whether the 5-DDRG prognostic model could be an independent predictor affecting ESCC clinical outcomes, we established a prognostic nomogram combining risk scores with clinical characteristics. The univariate and multivariate Cox regression analyses of the GSE53625 cohort indicated that patient age (P = 0.004, HR = 1.838), TNM stage (P < 0.001, HR = 2.439) and the constructed risk score (P < 0.001, HR = 2.684) were independent prognostic indicators in ESCC patients (Fig. 5A‒B). Based on the results above, a nomogram was established to predict survival at 1, 3 and 5 years (Fig. 5C). A prognostic nomogram was constructed based on the proportion of contribution to the death risk, as shown in Fig. 5D. The AUC of the nomogram for predicting OS was 0.763, which was higher than the values of the risk score and other clinical characteristics (Fig. 5E). Decision curve analysis (DCA) confirmed that this model had the highest net benefit, suggesting that this model can be effectively applied for clinical decision-making (Fig. 5F).

Fig. 5 Construction of a nomogram predicting the OS of ESCC patients. A, B Univariate and multivariate Cox regression analysis for the prognostic signature. C The calibration plots of the nomogram for predicting the 1-, 3-, and 5-year survival of ESCC patients. D Construction of a nomogram with risk score and clinicopathological features. E The ROC curves of the nomogram and clinicopathological features. F Decision curve analysis. OS Overall survival, ESCC Esophageal squamous cell carcinoma, ROC Receiver operator characteristic

Gene set enrichment analyses of the prognostic model

GO and KEGG functional enrichment analyses of the differentially expressed DDRGs between the two risk groups in the GSE53625 cohort were performed to investigate the underlying molecular heterogeneity. For GO analysis, immune-related biological processes (B-cell receptor, positive regulation of DNA binding, humoral immune response mediated by circulating immunoglobulin, T-cell cytokine production and immune response regulating cell surface receptor signaling pathway) were significantly activated in the high-risk group (Additional file 2: Figure S2A). In contrast, the activities of some immune depletion-related pathways (such as pathways related to neutrophil-mediated immunity, inflammatory response to antigenic stimulus, and positive regulation of intrinsic apoptotic signaling pathways) were significantly enriched in the low-risk group (Additional file 2: Figure S2B). The KEGG pathway enrichment results revealed that the genes in the high-risk group were significantly enriched in immune reactivity pathways, such as the TGF-β signaling pathway and antigen processing and presentation (Additional file 2: Figure S2C). Pathways related to linoleic glutathione metabolism and ribosomes converged in the low-risk group (Additional file 2: Figure S2D).

Immune cell infiltration analysis in the prognostic model

We compared the composition of 22 infiltrating immune cells among the two risk groups in the GSE53625 cohort. The results showed that patients in the high-risk group had higher infiltration levels of activated NK cells and resting mast cells than those in the low-risk group, whereas dendritic cell and macrophage infiltration showed no difference (Fig. 6A). The high-risk group showed a higher StromalScore, ImmuneScore and ESTIMATEScore, which revealed that high-risk patients had higher immunogenicity than the low-risk group (Fig. 6B–D). Additionally, the scatter diagrams further verified the correlation of immune cells (activated NK cells, resting mast cells, activated mast cells, naive CD4 T cells and monocytes) and the risk score (Fig. 6E–I).

Fig. 6 Immunological profiling of the different prognostic model risk groups. A Differences in the distribution of tumor-infiltrating immune cells between the risk groups. B–D The StromalScore, ImmuneScore and ESTIMATEScore in the high- and low-risk groups. E‒I Correlation between the risk score and tumor-infiltrating immune cells. *P < 0.05, **P < 0.01, and ***P < 0.001

We assessed the impact of immunosuppressive genes on ESCC, and the expression of immunosuppressive genes in the two risk groups was analyzed (Fig. 7A‒B). The results indicated that patients in the high-risk group had higher expression of BTLA, CTLA4, HAVCR2, IL10RB, PDCD1LG2, and TIGIT than patients in the low-risk group. These findings revealed that patients with higher risk may benefit from immunotherapies that target these immunosuppressive genes.

Fig. 7 Correlation of the DDR-related prognostic model with the immunosuppressive genes of ESCC patients. A Correlation between risk groups and immunosuppressive genes. B The expression of BTLA, CTLA4, HAVCR2, IL10RB, PDCD1LG2, and TIGIT in the high- and low-risk groups. *P < 0.05, **P < 0.01, and ***P < 0.001

Validation of DDRGs in the prognostic model

Quantitative real-time polymerase chain reaction (qRT-PCR) analysis was performed to validate the expression of the prognostic model-related DDRGs in ESCC cells (Fig. 8A‒E). The results indicated that the expression of POLK was downregulated in ESCC cells compared to normal esophageal epithelial cells (HEECs). Additionally, ESCC cells had higher expression levels of ERCC5 and PPP2R2A than HEEC cells.

Fig. 8 The expression levels of the five DDRGs in cells. A‒E qRT-PCR analysis showing the expression levels of ERCC5, POLK, PPP2R2A, TNP1 and ZNF350 in normal esophageal epithelial cells and ESCC cells. DDRGs DNA damage repair genes, ESCC Esophageal squamous cell carcinoma, qRT-PCR Quantitative real-time polymerase chain reaction, ns Not significant; *P < 0.05, **P < 0.01, ***P < 0.001

The tumorigenic role of PPP2R2A in ESCC in vitro

We performed qRT‒PCR to analyze the expression level of PPP2R2A after transfection in ESCC cell lines using targeted siRNA (Fig. 9A). We found that PPP2R2A expression in ECA109 and TE1 cells using si-PPP2R2A was markedly decreased compared with that in the si-NC group. CCK-8 assays revealed that knockdown of PPP2R2A significantly inhibited the proliferation ability of ECA109 and TE1 cells (Fig. 9B, C). In addition, the results of wound healing and transwell assays demonstrated that knockdown of PPP2R2A suppressed ECA109 and TE1 cell migration as well as cell invasion (Fig. 9D‒G). Taken together, our findings suggest that PPP2R2A is involved in the tumorigenesis of ESCC and enhances the proliferation, migration and invasion of ECA109 and TE1 cells in vitro.

Fig. 9 The biological behaviors of PPP2R2A in ESCC cells. A Verification of PPP2R2A knockdown efficiency. B, C Proliferation curves assessed by CCK-8 assay for PPP2R2A knockdown in ESCC cells. D‒G Wound healing and transwell assays were performed to confirm the biological function of PPP2R2A in ESCC cells. ESCC Esophageal squamous cell carcinoma, ns Not significant; *P < 0.05, **P < 0.01, ***P < 0.001


ESCC is one of the most common malignant tumors with a high recurrence rate and mortality. Aggressive multimodal therapy has remarkably improved outcomes in patients with ESCC; nevertheless, the treatment outcomes are still unsatisfactory. Primary and acquired therapy resistance remains a major obstacle. A comprehensive analysis of the genetic features is critical for the therapeutic evaluation and prognosis of ESCC. A single gene explaining the molecular signatures of cancer may be farfetched and influenced by confounders. Additionally, some obstacles hamper the prediction accuracy of therapeutic evaluation and prognosis. DDR can maintain the integrity of the genome and homeostasis of cells. In contrast, dysfunction of the DDR process can result in genetic information mutations, eventually leading to malignant transformation. However, the genetic features based on DDRGs in ESCC have not yet been clarified. Herein, we constructed a DDRG molecular signature and comprehensively elucidated its role in the therapeutic evaluation and prognosis of ESCC.

The DDR pathway includes direct repair, mismatch repair, base excision repair, homologous recombination, double-strand break repair, and nonhomologous end-joining, which can accurately repair DDR mutations and maintain genetic stability. DDRGs are often inactivated during the stage of cancer initiation and progression, which suggests that cancer cells exhibit poor DDR capability. Consequently, the accumulation of DNA damage in tumor cells increases dramatically and upregulates mutagenic abnormal proteins[24, 25]. Moreover, these abnormal proteins may act as antigens to drive oncogenesis, which increases the probability of occurrence and development of tumors[26]. It has been reported that different DDRG signatures are closely related to the immune response along with the prognosis of cancer patients. A study in 2021 preliminarily explored the effect of the DDRG PKMYT1 on immunity and prognosis in various malignancies[27]. The results showed that PKMYT1 exerted a vital effect on tumor immunity and progression. Because PKMYT1 acts as an individual gene rather than a DDRG signature, these findings may still have certain limitations. Another study established DDRG signatures in cervical squamous cell carcinoma, and Zhou et al. found that a DDRG signature was associated with prognosis and could act as a biomarker for immunotherapies[28]. Precision immunotherapy based on a DDRG signature should be explored, and its role in the clinical response and prognosis of immunotherapy should be identified.

We obtained tumor sequencing data from publicly available databases (GEO database and TCGA database), which presented the transcriptome profiles of ESCC. In previous studies, these databases have been used to assess the genetic landscape and identify novel biomarkers to predict the prognosis of patients with ESCC[29, 30]. In this study, we screened genes with prognostic significance to construct a prognostic model consisting of 5 genes (ERCC5, POLK, PPP2R2A, TNP1 and ZNF350) in the GSE53625 cohort and validated it in the independent TCGA-ESCC cohort. The patients were classified into high- and low-risk groups based on risk scores, which indicated different individual abilities of DDR. The patients in the high-risk group demonstrated a poorer survival time than those in the low-risk group. This prognostic model exhibited high predictive accuracy for ESCC survival, especially for 3-year survival (AUC = 0.678). Although this might result from the limited number of our patients, it is important to further explore the factors for ESCC prognosis. More critically, this risk model showed reasonable consistency in the TCGA-ESCC cohort.

The abovementioned 5 genes have attracted extensive attention in various malignancies and were selected for experimental validation in the present study. A recent study demonstrated that ERCC5 (a key component of the nucleotide excision repair pathway) was remarkably associated with susceptibility to tumors[31]. Li et al. reported that ERCC5 was significantly associated with the response to cisplatin-based chemotherapy in non-small cell lung cancer[32]. Another study reported that the expression level of ERCC5 was significantly increased in hepatocellular carcinoma, and high ERCC5 expression conferred poor prognosis[33]. POLK is associated with cancer cell proliferation and participates in platinum-chemotherapy tolerance in lung cancer[34]. Considering the higher fold change in the expression level of PPP2R2A, PPP2R2A was selected for further functional assays. Our results revealed that knockdown of PPP2R2A remarkably suppressed cell proliferation, migration and invasion in ECA109 and TE1 cell lines. In fact, PPP2R2A refers to a large family of heterotrimeric Ser/Thr phosphatases and can be considered a tumor suppressor gene that regulates tumor growth. Studies have found that PPP2R2A suppresses gastric cancer cell proliferation, invasion, and epithelial-mesenchymal transition (EMT)[35]. TNP1, also known as transition protein 1, has been confirmed to be expressed in murine Leydig tumor cell lines[36]. In addition, a previous study reported that overexpression of ZNF350 in colon cancer cells significantly increased their proliferative and migratory abilities[37]. In summary, these results suggest that the DDRG signature might be exploited as a viable ESCC prognostic indicator. Additionally, we explored the critical signature of DDRGs in the hope of providing evidence for immunotherapy. A previous study reported that DDR can reshape the tumor microenvironment[38]. 
In this study, we found that the infiltration levels of naive CD4 T cells, activated mast cells and monocytes were significantly higher in the low-risk group. Naive CD4 T cells are a type of lymphocyte, and previous studies have investigated their prognostic value in tumors[39, 40]. Yang et al.[41] and Hara et al.[42] reported that increased naive CD4 T cells can predict favorable survival in resectable NSCLC. Mast cells produce a unique set of antitumorigenic immune mediators, which has been proven in a tumor model[43]. Plotkin et al. revealed that human mast cells can release large amounts of GM-CSF[44]. GM-CSF has been demonstrated to suppress tumor cell proliferation and has been applied in clinical antitumor therapies[45]. Additionally, Fereydouni et al. presented evidence that mast cells can be polarized to regulate their hyperinflammatory and antitumor effects as potential target cells for precision immunotherapy[46]. Interestingly, activated NK cells were lower in the low-risk group. Activated NK cells are a phenotype of NK cells, and studies have demonstrated that a lower proportion of activated NK cells promotes the level of tumor infiltration to favor the formation of the immune microenvironment[47]. Immunosuppressive gene analysis revealed that the DDRG signature was positively correlated with BTLA, CTLA4, HAVCR2, IL10RB, PDCD1LG2, and TIGIT, particularly PDCD1LG2, which may provide clues for acting as a potential immune target to enhance antitumor effects in ESCC.

This study has several limitations. First, consensus clustering, the development of a prognostic model, and validation were performed based on data sources. The potential for selection bias is inevitable. Independent validation data from multiple centers and a larger sample size are required to validate these findings. Second, the number of DDRGs was limited, and additional DDRGs identified in new studies have not been included in the present study. Third, the correlation between the DDRG signature and immunotherapy response in ESCC was not evaluated directly because public database information from ESCC patients receiving immunotherapy was not available. Further studies will be performed to collect clinical and experimental data and investigate the mechanisms of DDR molecular subtypes and the DDR prognostic model on the prognosis of ESCC patients receiving immunotherapy.


In summary, our study comprehensively explored the predictive value of the DDRG signature for the survival of ESCC patients. Additionally, immune-related analysis revealed that the DDRG signature could distinguish levels of immune activity, which may help identify patients likely to benefit from immunotherapy. This is the first report of a prognostic and immune analysis based on a DDR gene signature for ESCC, which might facilitate individualized management in the era of precision immunotherapy.

Availability of data and materials

The datasets analyzed during the current study are available in the GEO database (persistent accession number: GSE53625) and the TCGA database (persistent accession number: TCGA-ESCC). All data generated or analyzed during this study are included in this published article and its supplementary information files.


  1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108.

  2. He Z, Ke Y. Precision screening for esophageal squamous cell carcinoma in China. Chin J Cancer Res. 2020;32(6):673–82.

  3. Colvin H, Mizushima T, Eguchi H, Takiguchi S, Doki Y, Mori M. Gastroenterological surgery in Japan: the past, the present and the future. Ann Gastroenterol Surg. 2017;1(1):5–10.

  4. Pectasides E. Immune checkpoint blockade in esophageal squamous cell carcinoma: is it ready for prime time? J Thorac Dis. 2018;10(3):1276–9.

  5. Brozmanova J, Dudas A, Henriques JA. Repair of oxidative DNA damage: an important factor reducing cancer risk. Minireview. Neoplasma. 2001;48(2):85–93.

  6. Gidron Y, Russ K, Tissarchondou H, Warner J. The relation between psychological factors and DNA-damage: a critical review. Biol Psychol. 2006;72(3):291–304.

  7. Scarbrough PM, Weber RP, Iversen ES, Brhane Y, Amos CI, Kraft P, Hung RJ, Sellers TA, Witte JS, Pharoah P, et al. A cross-cancer genetic association analysis of the DNA repair and DNA damage signaling pathways for lung, ovary, prostate, breast, and colorectal cancer. Cancer Epidemiol Biomark Prev. 2016;25(1):193–200.

  8. Gavande NS, VanderVere-Carozza PS, Hinshaw HD, Jalal SI, Sears CR, Pawelczak KS, Turchi JJ. DNA repair targeted therapy: the past or future of cancer treatment? Pharmacol Ther. 2016;160:65–83.

  9. Roos WP, Thomas AD, Kaina B. DNA damage and the balance between survival and death in cancer biology. Nat Rev Cancer. 2016;16(1):20–33.

  10. Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature. 2014;505(7484):495–501.

  11. Chae YK, Davis AA, Raparia K, Agte S, Pan A, Mohindra N, Villaflor V, Giles F. Association of tumor mutational burden with DNA repair mutations and response to anti-PD-1/PD-L1 therapy in non-small-cell lung cancer. Clin Lung Cancer. 2019;20(2):88-96.e86.

  12. Mouw KW, Goldberg MS, Konstantinopoulos PA, D’Andrea AD. DNA damage and repair biomarkers of immunotherapy response. Cancer Discov. 2017;7(7):675–93.

  13. McLaughlin M, Patin EC, Pedersen M, Wilkins A, Dillon MT, Melcher AA, Harrington KJ. Inflammatory microenvironment remodelling by tumour cells after radiotherapy. Nat Rev Cancer. 2020;20(4):203–17.

  14. Song Y, Huang J, Liang D, Hu Y, Mao B, Li Q, Sun H, Yang Y, Zhang J, Zhang H, et al. DNA damage repair gene mutations are indicative of a favorable prognosis in colorectal cancer treated with immune checkpoint inhibitors. Front Oncol. 2020;10:549777.

  15. Wang Z, Zhao J, Wang G, Zhang F, Zhang Z, Zhang F, Zhang Y, Dong H, Zhao X, Duan J, et al. Comutations in DNA damage response pathways serve as potential biomarkers for immune checkpoint blockade. Cancer Res. 2018;78(22):6486–96.

  16. Wang G, Zhou H, Tian L, Yan T, Han X, Chen P, Li H, Wang W, Xiao Z, Hou L, et al. A prognostic DNA damage repair genes signature and its impact on immune cell infiltration in glioma. Front Oncol. 2021;11:682932.

  17. Chi H, Jiang P, Xu K, Zhao Y, Song B, Peng G, He B, Liu X, Xia Z, Tian G. A novel anoikis-related gene signature predicts prognosis in patients with head and neck squamous cell carcinoma and reveals immune infiltration. Front Genet. 2022;13:984273.

  18. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26(12):1572–3.

  19. Chi H, Xie X, Yan Y, Peng G, Strohmer DF, Lai G, Zhao S, Xia Z, Tian G. Natural killer cell-related prognosis signature characterizes immune landscape and predicts prognosis of HNSCC. Front Immunol. 2022;13:1018685.

  20. Peng G, Chi H, Gao X, Zhang J, Song G, Xie X, Su K, Song B, Yang J, Gu T, et al. Identification and validation of neurotrophic factor-related genes signature in HNSCC to predict survival and immune landscapes. Front Genet. 2022;13:1010044.

  21. Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 2019;28(11):1947–51.

  22. Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49(D1):D545–51.

  23. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30.

  24. Dong W, Wu X, Ma S, Wang Y, Nalin AP, Zhu Z, Zhang J, Benson DM, He K, Caligiuri MA, et al. The mechanism of anti-PD-L1 antibody efficacy against PD-L1-negative tumors identifies NK cells expressing PD-L1 as a cytolytic effector. Cancer Discov. 2019;9(10):1422–37.

  25. Jeggo PA, Lobrich M. How cancer cells hijack DNA double-strand break repair pathways to gain genomic instability. Biochem J. 2015;471(1):1–11.

  26. Gillman R, Lopes Floro K, Wankell M, Hebbard L. The role of DNA damage and repair in liver cancer. Biochim Biophys Acta Rev Cancer. 2021;1875(1):188493.

  27. Shao C, Wang Y, Pan M, Guo K, Molnar TF, Kocher F, Seeber A, Barr MP, Navarro A, Han J, et al. The DNA damage repair-related gene PKMYT1 is a potential biomarker in various malignancies. Transl Lung Cancer Res. 2021;10(12):4600–16.

  28. Zhou H, Wu L, Yu L, Yang Y, Kong L, Liu S, Chen W, Li R. Identify a DNA damage repair gene signature for predicting prognosis and immunotherapy response in cervical squamous cell carcinoma. J Oncol. 2022;2022:8736575.

  29. Xiao W, Tang P, Sui Z, Han Y, Zhao G, Wu X, Yang Y, Zhu N, Gong L, Yu Z, et al. Establishment of a risk model by integrating hypoxia genes in predicting prognosis of esophageal squamous cell carcinoma. Cancer Med. 2022;12(2):2117–33.

  30. Zhu J, Zhao Y, Wu G, Zhang X, Chen Q, Yang B, Guo X, Ji S, Gu K. Ferroptosis-related lncRNA signature correlates with the prognosis, tumor microenvironment, and therapeutic sensitivity of esophageal squamous cell carcinoma. Oxid Med Cell Longev. 2022;2022:7465880.

  31. Wang LE, Gorlova OY, Ying J, Qiao Y, Weng SF, Lee AT, Gregersen PK, Spitz MR, Amos CI, Wei Q. Genome-wide association study reveals novel genetic determinants of DNA repair capacity in lung cancer. Cancer Res. 2013;73(1):256–64.

  32. Li M, Chen R, Ji B, Fan C, Wang G, Yue C, Jin G. Role of ERCC5 polymorphisms in non-small cell lung cancer risk and responsiveness/toxicity to cisplatin-based chemotherapy in the Chinese population. Oncol Rep. 2021;45(3):1295–305.

  33. Zheng X, Chen K, Liu X, Jiang G, Liu H. High expression of ERCC5 predicts a poor prognosis in hepatocellular carcinoma. Int J Clin Exp Pathol. 2018;11(7):3664–70.

  34. Shao M, Jin B, Niu Y, Ye J, Lu D, Han B. Association of POLK polymorphisms with platinum-based chemotherapy response and severe toxicity in non-small cell lung cancer patients. Cell Biochem Biophys. 2014;70(2):1227–37.

  35. Zhang M, Wang S, Yi A, Qiao Y. microRNA-665 is down-regulated in gastric cancer and inhibits proliferation, invasion, and EMT by targeting PPP2R2A. Cell Biochem Funct. 2020;38(4):409–18.

  36. Zhang S, Zhang Y, Yang C, Zhang W, Ju Z, Wang X, Jiang Q, Sun Y, Huang J, Zhong J, et al. TNP1 functional SNPs in bta-miR-532 and bta-miR-204 target sites are associated with semen quality traits in Chinese Holstein bulls. Biol Reprod. 2015;92(6):139.

  37. Tanaka H, Kuwano Y, Nishikawa T, Rokutan K, Nishida K. ZNF350 promoter methylation accelerates colon cancer cell migration. Oncotarget. 2018;9(95):36750–69.

  38. Li Q, Zhang P, Hu H, Huang H, Pan D, Mao G, Hu B. The DDR-related gene signature with cell cycle checkpoint function predicts prognosis, immune activity, and chemoradiotherapy response in lung adenocarcinoma. Respir Res. 2022;23(1):190.

  39. BouNasser Eddine F, Ramia E, Tosi G, Forlani G, Accolla RS. Tumor immunology meets… immunology: modified cancer cells as professional APC for priming naive tumor-specific CD4+ T cells. Oncoimmunology. 2017;6(11):e1356149.

  40. Ibrahim M, Scozzi D, Toth KA, Ponti D, Kreisel D, Menna C, De Falco E, D’Andrilli A, Rendina EA, Calogero A, et al. Naive CD4(+) T cells carrying a TLR2 agonist overcome TGF-beta-mediated tumor immune evasion. J Immunol. 2018;200(2):847–56.

  41. Yang P, Ma J, Yang X, Li W. Peripheral CD4+ naive/memory ratio is an independent predictor of survival in non-small cell lung cancer. Oncotarget. 2017;8(48):83650–9.

  42. Hara M, Matsuzaki Y, Shimizu T, Tomita M, Ayabe T, Enomoto Y, Onitsuka T. Preoperative peripheral naive/memory ratio and prognosis of nonsmall-cell lung cancer patients. Ann Thorac Cardiovasc Surg. 2007;13(6):384–90.

  43. Rodewald HR, Feyerabend TB. Widespread immunological functions of mast cells: fact or fiction? Immunity. 2012;37(1):13–24.

  44. Plotkin JD, Elias MG, Fereydouni M, Daniels-Wells TR, Dellinger AL, Penichet ML, Kepley CL. Human mast cells from adipose tissue target and induce apoptosis of breast cancer cells. Front Immunol. 2019;10:138.

  45. Yan WL, Shen KY, Tien CY, Chen YA, Liu SJ. Recent progress in GM-CSF-based cancer immunotherapy. Immunotherapy. 2017;9(4):347–60.

  46. Fereydouni M, Ahani E, Desai P, Motaghed M, Dellinger A, Metcalfe DD, Yin Y, Lee SH, Kafri T, Bhatt AP, et al. Human tumor targeted cytotoxic mast cells for cancer immunotherapy. Front Oncol. 2022;12:871390.

  47. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7.


Not applicable.


This work was supported by the National Natural Science Foundation of China (No. 81872456) and the Natural Science Foundation of Hebei Province (Nos. H2020206583 and H2022206459).

Author information


SCZ designed the study. DG and XYZ performed the analysis. DG, XYD, and WBS wrote the manuscript. DG and WNY performed the validation in the independent cohort. DG, JD, XYZ, and SCZ contributed to preparing the figures and tables. DG, WBS, and SCZ revised the manuscript. All authors reviewed the manuscript and approved the final version.

Corresponding author

Correspondence to Shuchai Zhu.

Ethics declarations

Ethics approval and consent to participate

This research was approved by the Ethics Committee of Fourth Hospital of Hebei Medical University.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Supplementary Information

Additional file 1. Figure S1: Establishment of prognostic model in ESCC. (A) LASSO coefficients of the five prognostic DDRGs. (B) Identifying LASSO deviance profiles using cross-validation. (C) The forest plot shows prognostic DDRGs using univariate Cox regression analysis. (D) The five prognostic DDRGs and 50 related genes. ESCC, esophageal squamous cell carcinoma; DDRGs, DNA damage repair genes.

Additional file 2. Figure S2: GSEA results of specific enrichment set. (A, B) The 10 significantly enriched GO terms in high- and low-risk ESCC patients. (C, D) KEGG pathways of different risk groups. GSEA, gene set enrichment analysis; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; ESCC, esophageal squamous cell carcinoma.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Guo, D., Zhang, X., Du, X. et al. A novel DNA damage repair gene-related prognostic model for evaluating the prognosis and tumor microenvironment infiltration of esophageal squamous cell carcinoma. BMC Med Genomics 16, 27 (2023).


  • Esophageal squamous cell carcinoma
  • DNA damage repair
  • Risk score
  • Prognosis
  • Immune