Characterization and clinical evaluation of microsatellite instability and loss of heterozygosity within tumor-related genes in colorectal cancer

Background: Microsatellite instability (MSI) is a biomarker for better outcomes in colorectal cancer (CRC). However, this conclusion is controversial. In addition, MSs can be a useful marker for loss of heterozygosity (LOH) of genes, but this finding has not been well studied. Here, we aimed to clarify the predictive value of MSI/LOH within tumor-related genes in CRC.
Methods: We detected MSI/LOH of MSs in tumor-related genes and the Bethesda (B5) panel by STR scanning and cloning/sequencing. We further analyzed the relationship between MSI/LOH status and clinical features or outcomes by Pearson’s Chi-square test, Fisher’s exact test and the Kaplan–Meier method.
Results: The findings indicated that the MSI rates of B5 loci were all higher than those of loci in tumor-related genes. Interestingly, MSI/LOH of 2 loci in the B5 panel and 12 loci in tumor-related genes were associated with poorer outcomes, while MSI/LOH of the B5 panel failed to predict outcomes in CRC. MSI of BAT25, MSI/LOH of BAT26 and MSI of the B5 panel showed closer relationships with mucinous carcinoma. In addition, LOH-H of the B5 panel was associated with increased lymphatic metastasis.
Conclusions: In summary, MSI/LOH of certain loci or the whole panel of B5 is related to clinical features, and several loci within tumor-related genes showed prognostic value in the outcomes of CRC.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12920-021-01051-5.


Background
Colorectal cancer (CRC) is one of the most common cancers in the world [1]. In China, CRC is one of the five leading causes of cancer-related death and one of the two most common cancers in both men and women according to data from the National Central Cancer Registry of China (NCCR) [2]. CRC often exhibits significant heterogeneity in both prognosis and chemotherapeutic response, despite similar histological features and tumor stage [3].
The carcinogenesis of colorectal cancer involves various potential pathways, including chromosomal instability (CIN) and microsatellite instability (MSI). CIN is detected in up to 80% of CRCs and may be accompanied by loss of heterozygosity (LOH) and chromosomal rearrangement [4]. MSI is known as a hypermutable phenotype resulting from the loss or dysfunction of the mismatch repair (MMR) system, which detects and repairs mismatches that occur during DNA replication [5]. It has been reported that approximately 15% of CRCs carry MSI in Western countries [6], and approximately 14.3% of CRCs in China were identified as MSI-positive [7]. Thus, it is well accepted that MSI status is associated with CRC. The Bethesda panel (B5 panel) has been recommended by the American National Cancer Institute for testing MSI [8,9]. According to MSI determination using the B5 panel, CRCs with MSI exhibit distinctive features, including a tendency to arise in the proximal colon, lymphocytic infiltration, and a poorly differentiated, mucinous or signet ring appearance, and they have a better prognosis than tumors without MSI due to their differential susceptibility to chemotherapeutics [6,10]. Many studies have indicated that patients with high levels of MSI (MSI-H) exhibit a better antitumor immune response and improved prognosis compared to those with low levels of MSI (MSI-L) or those who are microsatellite stable (MSS) [11]. Tumors with combined MSI and elevated microsatellite alterations at selected tetranucleotide repeats (EMAST) might be more suitable for treatment with immunotherapy in colorectal cancer [12]. However, this conclusion is controversial [13]. On the other hand, LOH analysis identifies allelic imbalances, which reflect gains and losses of chromosomal regions. It is known that the severity of LOH differs among tumors. Some tumors have LOH at many loci in various chromosomes, whereas others have less frequent LOH [14].
A few studies have indicated that different LOH mutation frequencies of loci might be related to the biological behavior of CRC [14,15]. However, this conclusion is still under debate. Herein, we analyzed data from 440 CRC patients in China using the B5 panel. As expected, the B5 loci were quite sensitive for detecting MSI, but the MSI and LOH status of the B5 panel had no prognostic value or predictive significance for CRC. Notably, MSI/LOH mutations of individual B5 loci, of the B5 panel and of loci in tumor-related genes correlated with the clinical features of CRC and could be used for determining treatments of individual patients. Thus, the development of novel robust biomarkers for the CRC population may be beneficial for the prognosis and prediction of chemotherapeutic responses.
Additional new panels for MSI tests have recently been developed. Ronald J Hause et al. demonstrated that MSI was scattered across the human genome in 18 cancer types, including CRC, revealing the prognostic significance of MSI [16]. However, the number of MS loci was too large to be used in practice. It has been reported that several critical alterations, such as TP53 and APC inactivation, KRAS and BRAF mutations, MYC amplification, and changes in other tumor-related genes, occur in CRC. These molecular events lead to dysregulation of cell growth, proliferation, survival, apoptosis, and invasion, which are involved in tumorigenesis and tumor progression [17]. The prevalence and clinical significance of KRAS, BRAF, NRAS, and PIK3CA mutations have been documented in the Chinese CRC population [18]. However, the data are quite limited, and MSI status within these tumor-related genes has not yet been fully explored. Therefore, we hypothesized that MSI/LOH within these genes would be appropriate markers for clinical pathological staging, prognosis and predicting response to chemotherapy in CRC patients. In the present study, we investigated the MSI/LOH profile in tumor-related genes and illuminated the relationship between MSI/LOH status and clinicopathological characteristics in Chinese CRC patients.

Cohort selection and DNA extraction
This study was performed on 440 pairs of CRC and adjacent normal tissues collected from the local Institutional Review Board of Beijing Friendship Hospital. Among the 440 CRCs, 256 were assigned to the training set (2006 to 2014), and all 440 CRCs were defined as the validation set (2005 to 2014). Patients with colorectal cancers were stage I-IV according to the TNM system classification of the American Joint Committee on Cancer. Informed consent was obtained from all individuals, and the principal inclusion criteria were as follows: histologically confirmed papillary/tubular adenocarcinoma, signet ring carcinoma and mucinous carcinoma of the colon or rectum. Patients were followed until their last contact or death. Vital status and cause of death were obtained from medical records, tumor registry correspondence, or death confirmation.
Clinicopathological data were obtained from the medical record archive. The clinical and histopathological information of 256 patients is shown in Additional file 1: Table S1. In brief, the mean age was 67.39 years (range 49-86 years); 57.48% were male and 42.52% were female, and the sex information of two patients was missing. Moreover, 34.51% (n = 88) and 22.35% (n = 57) of patients exhibited a history of smoking and drinking, respectively. Overall, 131 patients (51.17%) received adjuvant treatment; stage II and III tumors represented 57.73% and 42.27% of the cases, respectively. Regarding tumor location, 54.69% (n = 140) and 45.31% (n = 116) of tumors were located within the colon and rectum, respectively. The 5-year overall survival (OS) rate and 5-year progression-free survival (PFS) rate were 81.64% (n = 209) and 76.17% (n = 195), respectively. Among the patients studied, forty-seven died during data collection.
Genomic DNA was extracted from 880 samples (440 pairs) using a standard phenol-chloroform method [19]. DNA quality was analyzed by a microvolume spectrophotometer (Thermo Scientific NanoDrop 2000, Waltham, MA, USA) and agarose gel electrophoresis.

Microsatellite instability and loss of heterozygosity
Microsatellite status in CRC was determined by PCR amplification using primer pairs for 61 microsatellite loci. The 5′-end of the forward primer for each locus was tagged with a FAM, HEX, or TAMRA fluorescent marker. PCR amplification was performed using the optimized annealing temperature for each pair of primers. PCR products were evaluated on 2% agarose gels prior to STR scanning.
PCR products of the microsatellites were visualized through capillary electrophoresis on an ABI-3730XL DNA Analyzer system (PE Biosystems, Carlsbad, CA, USA). The peak height for each specimen was determined using GeneMarker version 1.75. MSI was also assessed at the 5 Bethesda loci: BAT25, BAT26, D2S123, D5S346, and D17S250. Using capillary array electrophoresis, MSI may be demonstrated by two main features: de novo alleles that appear as new peaks (i.e., peaks that did not exist in the normal tissue genotype) and pre-existing alleles that have shifted by a few base pairs [20,21]. Samples that did not exhibit MSI were defined as MSS. In addition to MSI, we analyzed LOH, another type of alteration distinct from MSI that involves a partial (> 35%) to complete loss of signal from one allele of a heterozygote [22,23]. Samples that did not exhibit LOH were defined as non-LOH. Exemplary images of MSI and LOH for BAT-25/TP53-1 loci are shown in Additional file 2: Fig. S1.
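The calling rules described above can be sketched in code. This is only an illustration under stated assumptions, not the pipeline used in the study: each sample is reduced to a mapping from allele size (bp) to peak height, any tumor allele absent from the matched normal tissue is treated as MSI, and LOH is called when one allele of a heterozygous normal genotype loses more than 35% of its signal relative to the other allele. The function names, the noise floor `min_height` and the exact ratio formula are hypothetical, and stutter peaks are ignored.

```python
def informative_alleles(peaks, min_height=50.0):
    """Keep allele sizes (bp) whose peak height clears a noise floor.

    min_height is a hypothetical value, not taken from the paper.
    """
    return {size: h for size, h in peaks.items() if h >= min_height}


def call_msi(normal, tumor, min_height=50.0):
    """MSI: the tumor shows at least one allele peak absent from the normal."""
    normal_alleles = set(informative_alleles(normal, min_height))
    tumor_alleles = set(informative_alleles(tumor, min_height))
    return bool(tumor_alleles - normal_alleles)


def relative_loss(n_x, n_y, t_x, t_y):
    """Fraction of allele x's signal lost in the tumor, normalised to allele y."""
    if t_y == 0:
        return 0.0  # allele y has no signal, so x cannot be judged against it
    return 1.0 - (t_x / t_y) / (n_x / n_y)


def call_loh(normal, tumor, loss_threshold=0.35, min_height=50.0):
    """LOH: one allele of a heterozygous normal loses > 35% of its signal."""
    alleles = informative_alleles(normal, min_height)
    if len(alleles) != 2:
        return False  # only heterozygous normals are informative for LOH
    (a, n_a), (b, n_b) = alleles.items()
    t_a, t_b = tumor.get(a, 0.0), tumor.get(b, 0.0)
    loss = max(relative_loss(n_a, n_b, t_a, t_b),
               relative_loss(n_b, n_a, t_b, t_a))
    return loss > loss_threshold
```

For example, with a normal genotype of `{120: 1000.0, 124: 900.0}`, a tumor profile of `{120: 1000.0, 124: 400.0}` is called LOH under these assumptions, because the 124 bp allele has lost more than half of its relative signal.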

Statistical analysis
Statistical analysis was performed using IBM SPSS® Statistics 16.0 package software (SPSS Inc.). Pearson's Chi-square or Fisher's exact test was performed to analyze the association between MSI/LOH and tumor pathological types, tumor stages, lymphatic metastasis, infiltration depth, tumor differentiation degree, and tumor recurrence; to compare the MSI mutation profile of tumors grouped by the B5 MSI status; to compare MSI/LOH occurrence in different gene types, locations, and repeat motifs; and to compare the incidence of MSI between tumor-related genes. The Kaplan-Meier method was used to estimate OS and PFS outcomes in 256 CRC patients, stage II patients, stage III patients and chemotherapy patients. A p value < 0.05 was considered statistically significant. The * symbol indicates p < 0.05, ** indicates p < 0.01, and *** indicates p < 0.001.
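As an illustration of the association tests named above, the following minimal sketch computes a two-sided Fisher's exact p value for a 2 × 2 table and maps it to the asterisk notation defined in this section. The analyses in the study were run in SPSS; this pure-Python version, its function names and the example counts are assumptions for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


def significance_stars(p):
    """Map a p value to the *, **, *** notation used in this section."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical counts (not data from the study):
# rows = mucinous / adenocarcinoma, columns = MSI / no MSI at one locus.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4), significance_stars(p_value))
```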
Colorectal carcinomas with high-frequency microsatellite instability (MSI-H) accounted for 15% of all colorectal cancers, including 12% of sporadic cases and 3% of cancers associated with Lynch syndrome. Using the B5 panel, we classified tumors as MSI-H, MSI-L or MSS. Colorectal cancers with MSI-H accounted for 10.94% of all cases, which was comparable to data in the literature. Moreover, MSI-H tumors defined by the B5 panel were more prone to mutations in MS loci of tumor-related genes (Additional file 3: Fig. S2).
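The three-way classification can be written as a small function. The cut-offs are not spelled out in this section, so the sketch below assumes the standard Bethesda criteria (instability at two or more of the five loci is MSI-H, at exactly one locus is MSI-L, at none is MSS), which mirrors the LOH-H definition given later in the text (at least two B5 loci showing LOH). The names `B5_LOCI` and `classify_b5` are hypothetical.

```python
B5_LOCI = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")


def classify_b5(unstable_loci):
    """Classify a tumor from the set of B5 loci that showed MSI.

    Assumes the standard Bethesda cut-offs; loci outside the B5 panel
    are ignored.
    """
    n_unstable = len(set(unstable_loci) & set(B5_LOCI))
    if n_unstable >= 2:
        return "MSI-H"
    if n_unstable == 1:
        return "MSI-L"
    return "MSS"


print(classify_b5(["BAT25", "BAT26"]))  # MSI-H
print(classify_b5(["D17S250"]))         # MSI-L
print(classify_b5([]))                  # MSS
```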

The prognostic value and prediction of the response to chemotherapy of MSI/LOH in tumor-related genes
MSI can provide rich information for prognosis and evaluation of the chemotherapy response in cancer patients [24,25]. The overall survival (OS) of patients with MSI-H also tends to be longer than in patients with MSS/MSI-L (63.5 months versus 60.0 months, p = 0.013) [20]. In the present study, we explored the relationship of MSI/LOH of 32 sensitive loci and the outcomes of CRCs only in the training group (n = 256) due to a lack of survival information of the second batch of samples.
According to MSI status of the B5 panel, outcomes were not significantly different between MSI-H and MSI-L + MSS CRC patients for all stages combined (Fig. 2A, B), stage II (Fig. 2C, D), stage III (Fig. 2E, F) or adjuvant chemotherapy (Fig. 2G, H). Similarly, there was no significant difference between LOH-H (at least two of the B5 loci showed LOH) and LOH-L + non-LOH patients for all stages combined (Fig. 3A, B), stage II (Fig. 3C, D), stage III (Fig. 3E, F) or adjuvant chemotherapy (Fig. 3G, H) patients. However, we found that the MSI/LOH status of BAT25 and D17S250 in the B5 panel and 12 loci in tumor-related genes were sensitive markers for outcome prediction in CRC patients (Table 1 and  (Table 1 and Fig. 4D).
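The survival curves compared here are Kaplan-Meier estimates. The sketch below shows how such a curve is computed from follow-up times and event indicators; it is a generic textbook estimator, not the SPSS procedure used in the study, and the function names and the toy follow-up data are assumptions. The test used to compare groups is not stated in this section, so none is implemented.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    times:  follow-up in months; events: 1 = death/progression, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from the curve."""
    s = 1.0
    for step_time, step_surv in curve:
        if step_time > t:
            break
        s = step_surv
    return s


# Toy data (hypothetical, not from the study): months and event flags.
curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
print(curve)
print(survival_at(curve, 15))
```

Evaluating `survival_at(curve, 60)` on each subgroup's curve would give the 5-year estimate for that subgroup under these assumptions.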
We also examined the association of MSI/LOH in tumor-related genes with the response to adjuvant chemotherapy. In the adjuvant chemotherapy group (n = 132), patients with MSI in D17S250 (p = 0.01) and MCC-10 (p = 0.001) and LOH in BAT-25 (p = 0.048) presented a poorer outcome in 5-year OS (Fig. 4G). Meanwhile,  (Table 1 and Fig. 4H).
We further performed the Cox regression survival analysis in the entire group of patients (n = 256) (

Association of the MSI/LOH profile with CRC clinical features
Clinical features, such as TNM (tumor-node-metastasis) stage and pathological type, are usually important prognostic factors for patients with colorectal cancer [26]. Analysis of the association of the MSI/LOH profile with CRC clinical features was performed in the training cohort (n = 256) and verified in the validation cohort (n = 440). Here, we showed that MSI in BAT25 (p = 0.005), MSI/LOH in BAT26 (p = 0.004) and MSI-H in the B5 panel (p = 0.012) were significantly more frequent in patients with mucinous carcinoma than in those with adenocarcinoma (Additional file 1: Tables S5-S6). These results illustrated that, compared to loci in tumor-related genes, the MSI/LOH of certain B5 loci or of the whole B5 panel had a closer relationship to the pathological type of CRC. Next, we explored the MSI/LOH profile and its association with other clinicopathological features. Although the MSI/LOH of several loci was remarkably related to TNM stage, lymphatic metastasis, infiltration depth, differentiation degree and recurrence in the training group, none of these associations was confirmed in the validation group (Additional file 1: Tables S7-S14). With respect to the B5 panel, LOH-H patients exhibited increased lymphatic metastasis compared to LOH-L + non-LOH CRCs in both training (p = 0.05) and validation (p = 0.04) sets (Additional file 1: Tables S9-S10).
To investigate the mutation patterns of tumor-related genes in human CRCs, we divided mutations into two patterns: MSI and LOH. Among 1126 mutation events, the rates of MSI and LOH were 19.27% (n = 217) and 80.73% (n = 909), respectively (Additional file 4: Fig. S3A). We found that LOH was the most common mutation type in tumor-related genes (Additional file 4: Fig. S3B). Of the 61 MS loci, we found mutations in 54 MS loci, and most (40 loci) of them exhibited both MSI and LOH patterns (Additional file 4: Fig. S3C). There were 11 loci exhibiting the LOH pattern alone and 3 loci only showing the MSI pattern. Statistical analysis indicated that the MSI frequency was similar among the four types of genes. MSI in TS genes (1.50%, 100/26 × 256) was similar to that in DNAR genes (1.50%, 23/6 × 256), MMR genes (1.30%, 30/9 × 256), and oncogenes (1.25%, 64/20 × 256). However, the proportion of MSI in TS genes was much lower than in oncogenes (p < 0.01) (Additional file 4: Fig. S3D). When focusing on the locations of MS, we found that introns and noncoding regions harbored both mutation types, while the 3′UTR only had LOH mutations, and exons only had MSI mutations. The proportion of LOH patterns in introns represented 80.83% (856/1059) of all mutation events. The proportions of MSI and LOH in noncoding regions were similar (Additional file 4: Fig. S3E).
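The per-class MSI frequencies quoted above can be reproduced with a short calculation. The sketch assumes that each frequency is the number of MSI events divided by (number of loci in the class × number of samples) and that the sample count is the 256 training-cohort tumors for every class, which is the denominator that reproduces all four reported percentages. The variable and function names are hypothetical.

```python
TRAINING_SAMPLES = 256  # assumed denominator for every gene class

# class -> (MSI events, number of MS loci), as quoted in the text
GENE_CLASSES = {
    "TS": (100, 26),
    "DNAR": (23, 6),
    "MMR": (30, 9),
    "oncogene": (64, 20),
}


def msi_rate(events, loci, samples=TRAINING_SAMPLES):
    """MSI events per locus per sample, as a percentage."""
    return 100.0 * events / (loci * samples)


for name, (events, loci) in GENE_CLASSES.items():
    print(f"{name}: {msi_rate(events, loci):.2f}%")

# Consistency checks against the totals reported in the same paragraph.
total_events = sum(e for e, _ in GENE_CLASSES.values())  # 217 MSI events
total_loci = sum(n for _, n in GENE_CLASSES.values())    # 61 MS loci
print(total_events, total_loci, f"{100.0 * total_events / 1126:.2f}%")
```

The class totals add up to the 217 MSI events and 61 loci reported in the text, and 217 of 1126 mutation events corresponds to the stated 19.27%.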
We further analyzed mutation patterns based on the number of repeat units, especially in introns, the types of repeat units, and the length of repeat units. There was no correlation among these subgroups (Additional file 5: Fig. S4), indicating that mutation patterns were not affected by repeat units.

Mutational profile of MS in human CRCs
Given that the B5 panel has been frequently applied in clinical practice and that the MMR system is of pivotal importance for the occurrence of MSI, we analyzed whether the MSI of tumor-related genes we studied was relevant to the status of B5 or MMR. CRC samples were divided into B5-MSI and B5-MSS or MMR-deficient (MMR-d) and MMR-proficient (MMR-p) groups according to their MSI status of B5 or MMR.
The data showed that the MSI frequency of 16 tumor-related genes (84.2%, 16/19) we detected was significantly higher in the B5-MSI group than in the B5-MSS group (Additional file 1: Table S15). This result indicates that the B5 panel is a high-efficiency criterion for assessing the overall MSI status of the genome.
Similarly, except for four MMR genes, MSH2, MLH1, MSH6 and PMS2, the MSI frequency of the majority of tumor-related genes (80%, 12/15) we detected was remarkably higher in MMR-d tumors than in MMR-p tumors (Additional file 1: Table S16). This finding is in accordance with the statement that the MMR system plays a vital role in the occurrence of MSI.

The MSI/LOH spectrum in CRC patients
An increased number of mutations was detected in CRCs [27], suggesting that the mutation spectrum in CRCs was very complicated. In our study, 54.17% (26/48) of CRCs harbored MSI events within one gene, while 6.25% (3/48) harbored MSI events in two genes simultaneously. Of 48 MSI patients, the number of MSI events detected in  Tables S19-20). These results suggest a complicated spectrum of mutations in CRC patients.
We further found that both the number of genes and the number of loci with MSI were higher in non-adenocarcinoma patients than in adenocarcinoma patients (p = 0.002 and p = 0.002, respectively) (Additional file 6: Fig. S5A-B). Moreover, both the MSI gene number and the MSI locus number were higher in colon cancer patients than in rectal cancer patients (p = 0.006 and p = 0.007, respectively) (Additional file 6: Fig. S5C-D). However, no significant differences in the gene number or the locus number were found between sexes (Additional file 6: Fig. S5E-F) or among degrees of differentiation (Additional file 6: Fig. S5G-H). These findings suggest that the MSI frequency of tumor-related genes in colorectal cancer is associated with pathological type and tumor location.

Discussion
MSI is an important feature observed in many tumor types, especially in sporadic CRC patients, with prognostic and therapeutic value and has been used in the clinic [11,28]. It is associated with pathological characteristics and cancer outcomes and is used to predict response to adjuvant chemotherapy [29]. In addition, the evolution of genetic instability in colon cancer may involve chromosomal instability (CIN), which may be accompanied by a loss of heterozygosity (LOH). CIN-high CRCs showed significantly poorer outcomes compared to CIN-low CRC. Therefore, MSI and LOH status have been considered to be valuable and independent prognostic markers in CRC patients [30]. Although risk scores based on clinical and pathological parameters have been developed to predict outcomes, the existing prognostic markers are unlikely to be sufficient for clinical decisions and are not interpreted well across institutions [31]. In our study, as evaluated with the B5 panel, MSI and MSS patients showed similar outcomes in 5-year OS and 5-year PFS (Fig. 2G-L), and there were no differences between LOH and non-LOH patients in 5-year OS or 5-year PFS. This result suggests that, occasionally, the MSI/LOH detected by the B5 panel may not be a sufficient biomarker for predicting outcomes in Chinese CRC patients. Therefore, it is urgent to screen more practicable markers for colorectal cancer.
Recently, Jun Yu et al. identified seven significantly mutated genes in Asian CRC, a mutation signature that predicted survival outcomes [27]. We hypothesize that MSI/LOH in tumor-related genes may serve as complementary markers to predict the outcome of CRC. Importantly, several loci in the B5 panel and tumor-related genes we detected showed remarkable prognostic value for all CRC patients, as well as for stage II and stage III CRC patients individually.
Although prediction of the chemotherapy response by MSI remains controversial [12,13], some studies have shown that MSI CRCs are particularly responsive to immunotherapy, such as anti-PD-1 blockade [32]. In the present study, we found that patients with LOH in BAT25 or MSI in MCC-10 did not benefit from adjuvant chemotherapy (Table 1). Therefore, the MSI/LOH status of these 2 loci may be useful, convenient, and applicable for predicting the response to chemotherapy in CRC patients.
Notably, MS loci that exhibited prognostic value were located in the MCC, MSH2, Pinch5, Mgmt, MLH1, APC, BRAF and P21 genes, which were reported to be involved in CRC progression [33,34]. This study may also provide a foundation for further investigation of the mechanisms underlying the functional involvement of these MSI loci in the development of CRC.
A strong correlation has been suggested between CRC clinicopathological features and MSI status. For example, the prevalence of MSI in CRC differs among disease stages: it is approximately 15% in stages II and III and is more common in stage II [12]. MSI events may help determine the degree of tumor malignancy. Moreover, MSI tumors share similar histomorphology, regardless of their respective pathogenesis, and frequently have a mucinous phenotype [35]. Consistently, our data also revealed a higher frequency of MSI in mucinous carcinoma compared to adenocarcinoma in certain loci or the whole panel of B5. In the present study, the MSI-H status of the B5 panel was related to mucinous carcinoma (p = 0.012). Notably, MSI of BAT25 and MSI/LOH of BAT26, both of which belong to the B5 panel, also showed a sensitive correlation with mucinous carcinoma in both the training and validation sets of CRC. These results indicate that patients with MSI in certain loci or the whole panel of B5 tend to develop mucinous carcinoma rather than adenocarcinoma. In addition, the result that LOH-H of the B5 panel was related to increased lymphatic metastasis indicated that, in addition to MSI-H, LOH-H status is also a potential marker for CRC features.
For the MSI/LOH profile, the results showed that the MSI mutation percentages of B5 were very high. These findings indicated that the B5 panel was meaningful for the study of colorectal cancer. The LOH frequency of the top 3 most frequent MS loci (TP53, APC-6 and Nup88-3) in the tumor-related genes was similar to that of the dinucleotide loci (D5S346 and D17S250) of B5. Therefore, MS loci in tumor-related genes may play an important role in the study of CRC.
In the present study, we generated MSI/LOH profiles in 19 tumor-related genes that were prone to alteration and might be involved in CRC tumorigenesis and progression [36-40]. Selected as one hot locus in the BRAF gene, BRAF-9 was the most frequent locus that was prone to mutation in CRC patients (5.08%, 13/256), and TP53-1 was the most frequently mutated gene in CRC patients (26.95%, 69/256). In our previous study, we examined MS (TP53ALU) status in intron 1 and mutations in all exons of the TP53 gene. Additionally, we studied the association between TP53-exon mutation and TP53ALU alterations. The prevalence of TP53-exon mutations was significantly higher in TP53ALU-LOH tumors than in TP53ALU-non-LOH tumors (p = 0.003) (Additional file 7: Fig. S6), suggesting that TP53 exons are more likely to be mutated when the MS in the TP53 intron shows LOH. However, no correlation was found between the TP53-exon mutation and TP53ALU-MSI status (Additional file 7: Fig. S6) (unpublished results). This finding indicates that the LOH of MS in the TP53 intron seems to be a sensitive marker for the mutation status of TP53 exons, which play a crucial role in CRC tumorigenesis.
In addition, our study showed that MS in TS genes was more prone to mutation than MS located in MMR genes or oncogenes, suggesting inactivation of tumor suppressor genes in CRC, as demonstrated in previous reports [41]. We also found a lower frequency of MSI events in certain genes, such as MYC, MDM2, BBC3, and KRAS. Notably, there was a lower occurrence of MSI in KRAS genes, which are often mutated in CRC. A larger cohort of CRC patients may validate this phenomenon.
MSs are abundant in both noncoding and coding regions in mammalian genomes [42]. MS mutations occurring in coding regions, introns, or untranslated regions may positively or negatively influence gene expression or protein function by interrupting gene transcription or splicing [43]. We observed that the MSI/LOH frequency in introns was higher than in other locations. The prognostic and predictive panels of MSI/LOH were primarily located in introns, suggesting that MSs in introns may be prone to alteration and relevant to the clinicopathological features of CRC. Moreover, we also found one MSI event in exon 2 of MYC with (CAG)5 repeats, in which new alleles emerged (174/174 to 165/174). Although only one MSI event was found in the exon of the MYC gene in one sample, this MSI may play a pivotal role in CRC, as reported by Jason B et al. [44]. In addition, we identified two loci with LOH in the 3′UTR of the MDM2 gene. Mutations within the 3′UTR might contribute to alterations in the recognition sites of microRNAs or RNA-binding proteins, affecting gene expression. Importantly, 88.46% (69/78) of mutation events at TP53-1 (AAAAT)8 were LOH, which may be a key event in the pathogenesis of CRC involved in the "second hit" (mutation and subsequent LOH) process [28].
The MSI status of the B5 panel and the expression of MMR genes are frequently used criteria to determine the MSI status of CRC. In the present study, the MSI frequency of tumor-related genes, except BBC3 and MYC, was higher in B5-MSI tumors than in B5-MSS tumors, suggesting that the B5 panel is a powerful tool for defining the MSI status of CRC. Similarly, the MSI frequency of 80% of the tumor-related genes we detected was significantly higher in MMR-d tumors than in MMR-p tumors. These results are in agreement with a previous report that higher mutation loads were frequently found in tumors with mismatch repair deficiency [5].
There are several limitations in our study. As a retrospective study, its ability to support firm conclusions was unavoidably limited. Due to the restrictions of medical records and short follow-up time, the treatment and survival information collected from the patients recruited in this study was incomplete. Moreover, to identify potential risk factors for the prognosis of CRC patients, multivariate Cox regression analyses were conducted. These analyses showed that tumor recurrence was a significant risk factor for the prognosis of CRC patients (RR 9.379, 95% CI 4.522-19.453, p < 0.001). In the training group, several loci were related to recurrence. However, the validation group did not confirm these loci. In addition, the 61 MS loci we selected from 19 genes were predetermined based on the PCR amplification efficacy, and not all MS loci in these tumor-related genes were included; thus, other important loci in these genes are potentially missing from our findings.

Conclusions
Herein, we described the MSI/LOH profile of 19 tumor-related genes and performed analysis to identify their clinical correlations and significance. Most importantly, we found several prognostic loci in the B5 panel and tumor-related genes that predicted the response to chemotherapy in Chinese CRC patients. Two MSI/LOH loci were associated with the pathological type of CRC. Our study offers a landscape of MS in the 19 tumor-related genes in Chinese CRC patients and provides significant implications for clinical application.