Assessing the activity of nonsense-mediated mRNA decay in lung cancer
BMC Medical Genomics volume 10, Article number: 55 (2017)
Inhibition of nonsense-mediated mRNA decay (NMD) in tumor cells can suppress tumor growth by expressing new antigens whose mRNAs are otherwise degraded by NMD. NMD inhibition is therefore a promising approach for developing cancer therapies. The success of this approach, however, depends on the basal NMD activity in cancer cells: if NMD is already strongly inhibited in tumors, the approach would not work. It is therefore crucial to assess NMD activity in cancers to forecast the efficacy of NMD-inhibition-based therapy.
Here we develop three metrics using RNA-seq data to measure NMD activity, and apply them to a dataset consisting of 72 lung cancer (adenocarcinoma) patients.
We show that these metrics have good correlations, and that the NMD activities in adenocarcinoma samples vary among patients: some cancerous samples show significantly stronger NMD activities than the normal tissues while some others show the opposite pattern. The variation of NMD activities among these samples may be partly explained by the varying expression of NMD effectors.
In sum, NMD activity varies among lung cancer samples, which forecasts varying efficacy of NMD-inhibition-based therapy. The metrics developed here can also be applied to other cancer types to assess NMD activity.
Lung cancer is the leading cause of cancer-related deaths: each year more than a million patients die of it, and millions of new cases are diagnosed. Such high incidence and mortality call for new, effective treatments.
A promising new approach to treating cancers is to express new antigens in tumor cells by inhibiting the nonsense-mediated mRNA decay (NMD) pathway [2, 3]. The mRNAs of these new antigens (which arise from tumor mutations) are normally degraded by NMD and are thus invisible to the immune system. NMD is a cellular surveillance pathway that degrades mRNAs harboring premature termination codons (PTCs). When NMD is inhibited, these antigens can be expressed and trigger immune rejection of tumor cells. This method successfully suppressed tumor growth in mice implanted with the colon carcinoma cell line CT26. However, its efficacy in the clinic and in other tumor types has not yet been evaluated. In particular, the success of this method depends on the NMD activity of tumor cells. In other words, if tumor cells already have very low NMD activity, inhibiting NMD would not express a substantial amount of new antigens and the method would not work. It is therefore necessary to evaluate NMD activity in different tumor types before NMD-inhibition-based therapies can be applied.
It has been reported that tumor cells may be subject to NMD inhibition. On the one hand, the tumor microenvironment may inhibit NMD. For example, tumor cells often undergo cellular stresses, such as amino acid deprivation, hypoxia, and reactive oxygen species (ROS) generation. These stresses can cause phosphorylation of the translation initiation factor eIF2α (the α subunit of eukaryotic initiation factor 2), which in turn suppresses NMD [5,6,7]. On the other hand, mutations in NMD effectors may inactivate NMD. For example, the NMD effector UPF1 is frequently mutated in pancreatic adenosquamous carcinoma. If NMD is already strongly inhibited in cancers, further inhibition of NMD would not express many new antigens and, in turn, would not elicit a strong immune reaction.
In this study, we develop three metrics to measure NMD activity and use them to assess NMD activity in samples of lung adenocarcinoma, the most common histological type of lung cancer.
We downloaded RNA-sequencing reads of lung adenocarcinoma patients from the NCBI Gene Expression Omnibus (GEO) database (accession number GSE40419). Only the data of the 72 individuals with both tumor and adjacent normal tissues (i.e., 144 samples) were extracted and used in the study. The patients' ages ranged from 38 to 82 years.
Processing of RNA-seq data
Raw FASTQ-formatted sequence files were mapped onto the human reference genome (hg19) using TopHat v2.0.8b, with annotated transcripts from Ensembl 71 as a guide for mapping (option -G). After mapping, gene expression was estimated using Cufflinks v2.1.1 and expressed as FPKM (fragments per kilobase of transcript per million mapped reads). Genes with extremely low expression (fewer than ten reads in half or more of the 144 samples) were excluded. We then normalized the data using the 75th percentile of each sample.
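The low-expression filter and the 75th-percentile normalization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the dictionary layout (gene to per-sample values), the function names, and the use of the "inclusive" quantile definition are all assumptions, and the text does not say whether normalization was applied to read counts or to FPKM values.

```python
from statistics import quantiles


def filter_low_expressed(counts, min_reads=10):
    """Drop genes with fewer than `min_reads` reads in half or more of the samples.

    `counts` maps gene -> list of per-sample read counts (hypothetical layout).
    """
    kept = {}
    for gene, values in counts.items():
        n_low = sum(1 for v in values if v < min_reads)
        if 2 * n_low < len(values):  # low in strictly fewer than half of the samples
            kept[gene] = values
    return kept


def upper_quartile_normalize(expr):
    """Divide every value in a sample by that sample's 75th percentile.

    Assumes at least two genes and a non-zero 75th percentile per sample.
    """
    genes = list(expr)
    n_samples = len(expr[genes[0]])
    normalized = {g: [] for g in genes}
    for j in range(n_samples):
        column = [expr[g][j] for g in genes]
        q75 = quantiles(column, n=4, method="inclusive")[2]
        for g in genes:
            normalized[g].append(expr[g][j] / q75)
    return normalized
```

Scaling by an upper quantile rather than by the total makes the normalization less sensitive to a handful of very highly expressed genes.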
Afterwards, we applied SAMtools v1.1 to identify candidate variants present in both the tumor and normal samples of each individual by supplying both mapped-read files. To reduce the chance of mistaking sequencing errors for single-nucleotide variations (SNVs), we retained only SNVs meeting the following criteria: 1) ≥ 5 reads covering the site in both tissues, and 2) both the reference and variant alleles supported by mapped reads. SnpEff was then used to evaluate the predicted effect of each variant based on the NCBI RefSeq annotation. The output indicated whether a variant can introduce a PTC and trigger NMD.
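A minimal sketch of the two SNV filtering criteria is given below. The record layout and field names are hypothetical, coverage is approximated as the sum of reference and variant reads, and requiring both alleles to be seen in each tissue is one reading of criterion 2 (the text does not state whether support is needed per tissue or overall).

```python
def passes_snv_filter(site, min_depth=5):
    """Apply the two criteria to one candidate site.

    `site` is a hypothetical record such as
    {"tumor": {"ref_reads": 7, "alt_reads": 4},
     "normal": {"ref_reads": 6, "alt_reads": 3}}.
    """
    for tissue in ("tumor", "normal"):
        ref = site[tissue]["ref_reads"]
        alt = site[tissue]["alt_reads"]
        # Criterion 1: at least `min_depth` reads cover the site in this tissue.
        if ref + alt < min_depth:
            return False
        # Criterion 2 (assumed per tissue): both alleles are supported by reads.
        if ref == 0 or alt == 0:
            return False
    return True
```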
Identifying NMD sensitive and insensitive genes
We compiled NMD-affected genes from four studies [15,16,17,18] in order to reliably define NMD target and non-target genes. Genes that were not included or not expressed in any of the four studies were excluded to avoid background biases. Specifically, we required that selected genes: i) had probe information in the two array-based studies [15, 18]; ii) met the standards of Tani et al.; and iii) had at least one transcript isoform with ≥ 1 FPKM upon UPF1 knockdown in reference . The filtering resulted in 8319 genes.
Genes were then classified as NMD targets if they met either of the following criteria: i) ≥ 2-fold upregulation upon UPF1 knockdown according to references [15, 16, 18]; ii) at least one transcript isoform upregulated ≥ 3-fold upon UPF1 knockdown and expressed at ≥ 5 FPKM according to reference . In total, we obtained 817, 82, 37, and 13 target genes supported by 1, 2, 3, and 4 studies, respectively. The remaining genes that showed no or only marginal upregulation (i.e., < 1.5-fold upregulation in [15, 18], and < 2-fold upregulation in ) and were not stabilized according to  were classified as NMD non-target genes.
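Tallying how many studies support each candidate target can be sketched as below. The inputs are assumed to be per-study collections of genes that already passed that study's fold-change criterion; the function names and the `min_studies` parameter are illustrative and not taken from the paper.

```python
from collections import Counter


def count_support(study_hits):
    """Count, for each gene, how many studies flag it as upregulated on NMD inhibition.

    `study_hits` is a list with one collection of gene names per study.
    """
    support = Counter()
    for hits in study_hits:
        support.update(set(hits))  # set() so a study counts once per gene
    return support


def genes_by_support(support):
    """Group genes by their number of supporting studies."""
    groups = {}
    for gene, n in support.items():
        groups.setdefault(n, []).append(gene)
    return {n: sorted(genes) for n, genes in groups.items()}


def select_targets(support, min_studies=3):
    """Keep genes supported by at least `min_studies` studies."""
    return sorted(g for g, n in support.items() if n >= min_studies)
```

A stricter `min_studies` trades the size of the target set for confidence that each gene responds to NMD inhibition.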
Identifying NMD-specific exon skipping events
In theory, any alternative splicing event that introduces a PTC may trigger NMD. For simplicity, we considered only exon-skipping events. We also required that the exons upstream and downstream of a focal exon not be subject to alternative splicing, to ensure that the NMD isoform is generated by alternative splicing of the focal exon only. In this way, we obtained 776 exon-skipping events that may trigger NMD according to the 50/55-bp rule.
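The 50/55-bp rule states that a termination codon located more than about 50 to 55 nt upstream of the last exon-exon junction is expected to trigger NMD. A minimal sketch of that check is shown here; the coordinate convention, the function name, and the default threshold are assumptions made for illustration, and the text does not specify how the authors implemented the rule.

```python
def triggers_nmd(ptc_end, exon_lengths, threshold=50):
    """Check the 50/55-bp rule for one spliced transcript.

    ptc_end      -- 1-based transcript coordinate of the last nucleotide of the
                    termination codon (hypothetical convention)
    exon_lengths -- exon lengths in transcript order
    threshold    -- minimum distance in nt (50 or 55, depending on the variant
                    of the rule used)
    """
    if len(exon_lengths) < 2:
        return False  # no exon-exon junction, so the rule cannot apply
    # Coordinate of the last nucleotide before the final exon-exon junction.
    last_junction = sum(exon_lengths[:-1])
    distance = last_junction - ptc_end
    return distance > threshold
```

A stop codon in the last exon gives a negative distance and is therefore never flagged.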
When calculating the expression levels of NMD-inducing and NMD-free splicing isoforms, we used only mapped reads meeting the following criteria to ensure accuracy: 1) mapping quality = 50; 2) coverage of at least 6 nt on each of the joined exons; and 3) no mismatches or indels within the 12 nt around the exon-exon junction. The expression levels of the different isoforms were then calculated by counting supporting reads. For metric design, only splicing events with more than ten supporting reads in both the tumor and normal samples of a patient were used.
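The three read-level criteria and the event-level read threshold can be sketched as follows. The read record and its field names are hypothetical, and the 12-nt clean window is interpreted here as 6 nt on each side of the junction, which is an assumption.

```python
def is_reliable_junction_read(read, required_mapq=50, min_overhang=6, clean_window=12):
    """Return True if a junction-spanning read meets the three criteria.

    `read` is a hypothetical record with keys:
      mapq           -- mapping quality reported by the aligner
      left_overhang  -- nt aligned to the upstream exon
      right_overhang -- nt aligned to the downstream exon
      error_offsets  -- signed distances (nt) of mismatches/indels from the
                        junction (-1 = last base of the upstream exon)
    """
    if read["mapq"] != required_mapq:
        return False
    if read["left_overhang"] < min_overhang or read["right_overhang"] < min_overhang:
        return False
    half = clean_window // 2  # assumed: window is split evenly across the junction
    return all(abs(offset) > half for offset in read["error_offsets"])


def event_is_usable(tumor_reads, normal_reads, min_reads=10):
    """Keep events with more than `min_reads` supporting reads in both tissues."""
    return tumor_reads > min_reads and normal_reads > min_reads
```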
We use the following three metrics to measure NMD activity in a biological sample: (1) the mRNA expression level of NMD target genes; (2) the usage (i.e., percentage) of the NMD-inducing splicing isoform in NMD target genes that have both NMD-inducing and NMD-free isoforms; and (3) the abundance ratio of mRNAs derived from the NMD-inducing allele to those derived from the NMD-free allele of the same gene. We name these three metrics R_mRNA, R_isoform, and R_allele, respectively. In principle, if NMD activity is strong in a sample, these metrics will have small values, because the NMD-inducing forms are degraded more effectively. To calculate these metrics, we collected RNA-seq data from a large-scale study that produced data for 72 patients, each with both tumor and adjacent normal tissue samples sequenced (Additional file 1: Table S1).
We first describe the procedure for calculating the metric R_mRNA, i.e., the mRNA abundance of NMD target genes. The first step was to identify NMD target genes. For that, we collected NMD target genes from four studies [15,16,17,18] that measured gene expression changes after inhibiting NMD. We classify all genes whose expression was upregulated upon NMD inhibition as NMD targets. Note that the targets identified by this approach may include both direct and indirect NMD targets; because our purpose is to identify genes that can serve as indicators of NMD activity, any gene whose expression is upregulated after NMD inhibition is informative. In total, 951 genes are classified as NMD targets by at least one study, and 50 genes are supported by at least three studies (see Methods section). For accuracy, we use these 50 genes as our NMD target gene set (Additional file 2). To eliminate variation in gene expression across samples due to systematic bias, we normalize the expression of NMD targets by dividing it by the median expression value of the 4074 non-target genes (Additional file 2; also see Methods). R_mRNA is then calculated for each gene using Eq. (1):

$$R_{\mathrm{mRNA}} = \frac{mE_{\mathrm{NMD}}}{\mathrm{median\_mE}_{\mathrm{nonNMD}}} \qquad (1)$$
where mE_NMD is the mRNA expression of an NMD target gene and median_mE_nonNMD is the median expression of all 4074 non-target genes in the same sample. To infer NMD activity in a tumor relative to normal tissue, we calculate R_mRNA(tumor)/R_mRNA(normal), i.e., the ratio of R_mRNA between a tumor and the corresponding normal tissue. As shown in Fig. 1, the ratio R_mRNA(tumor)/R_mRNA(normal) varies both among genes and among patients, and in a few patients it deviates significantly from unity (Fig. 1; Additional file 1: Table S2). This result suggests that most tumor samples have NMD activity comparable to their normal baselines, whereas some tumor samples have undergone dramatic changes in NMD activity.
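The calculation of R_mRNA (target expression divided by the median expression of the non-target genes in the same sample) and of the tumor-to-normal ratio can be sketched as follows. The dictionary inputs and function names are illustrative assumptions; how significance of the deviation from unity was tested is not described in this section and is not reproduced here.

```python
from statistics import median


def r_mrna(target_expr, nontarget_expr):
    """R_mRNA per target gene: expression / median expression of non-target genes.

    Both arguments map gene -> expression in ONE sample (hypothetical layout).
    """
    baseline = median(nontarget_expr.values())
    return {gene: value / baseline for gene, value in target_expr.items()}


def tumor_normal_ratio(r_tumor, r_normal):
    """Per-gene R(tumor)/R(normal); genes with a zero normal value are skipped."""
    return {
        gene: r_tumor[gene] / r_normal[gene]
        for gene in r_tumor
        if gene in r_normal and r_normal[gene] > 0
    }
```

Under this definition, a ratio above 1 for a target gene means relatively more target mRNA in the tumor, which would be read as weaker NMD activity in the tumor than in the matched normal tissue.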
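Next, we calculated R_isoform, the usage of NMD-inducing alternative splicing (AS) isoforms in a given gene, as follows:

$$R_{\mathrm{isoform}} = \frac{sE_{\mathrm{NMD}}}{sE_{\mathrm{NMD}} + sE_{\mathrm{nonNMD}}} \qquad (2)$$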
where sE_NMD and sE_nonNMD are the abundances of the NMD-inducing and NMD-free splicing isoforms, respectively. Compared with R_mRNA, R_isoform is expected to be more sensitive for two reasons: (1) it can detect changes in NMD activity that affect only the relative abundance of the NMD-inducing splicing isoform and not the total mRNA abundance of a gene; (2) it uses the abundance of NMD-free isoforms of the same gene as the normalizing factor, which is preferable to using the expression of other genes (as in R_mRNA), because expression dynamics across samples may vary among genes. For calculating R_isoform, we identified 776 AS events from 734 genes. Of these, 528 trigger NMD when the alternative exon is included and 248 when it is excluded (see Methods for details). Among them, 14 AS events occur in all the samples (Additional file 3). To calculate R_isoform in each sample, we use either all informative AS events occurring in the sample or only the 14 shared AS events. As with R_mRNA, we use the ratio of R_isoform(tumor) to R_isoform(normal) to account for baseline differences among patients. As shown in Fig. 2 and Additional file 1: Table S3, when using all informative AS events in each sample, the ratio R_isoform(tumor)/R_isoform(normal) deviates significantly from unity in dozens of patients. The number of significant deviations is much larger than with R_mRNA, consistent with the expected higher sensitivity of R_isoform. When using the 14 shared AS events, we observe essentially the same pattern (Additional file 1: Figure S1, Spearman's Rho = 0.6127131, P = 1.063e-08). Although R_isoform is more sensitive than R_mRNA, the tumor-to-normal ratios from the two metrics are positively correlated (Additional file 1: Figure S2A, Rho = 0.3690621, P = 0.001422). The relationship remains when using the 14 shared events only (Additional file 1: Figure S2B, Rho = 0.2581838, P = 0.02882).
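A sketch of the R_isoform calculation follows, under the assumption that "usage" means the NMD-inducing fraction of the junction-supporting reads for an AS event. Summarizing a patient by the median tumor-to-normal ratio across events is also an assumption made for illustration, as are the input layout and function names.

```python
from statistics import median


def r_isoform(nmd_reads, non_nmd_reads):
    """Usage of the NMD-inducing isoform: NMD reads / (NMD + NMD-free reads)."""
    total = nmd_reads + non_nmd_reads
    if total == 0:
        raise ValueError("event has no supporting reads")
    return nmd_reads / total


def median_isoform_ratio(tumor_events, normal_events):
    """Median of R_isoform(tumor)/R_isoform(normal) over events seen in both tissues.

    Each argument maps an event id -> (nmd_reads, non_nmd_reads) for one patient.
    """
    ratios = []
    for event, (t_nmd, t_free) in tumor_events.items():
        if event not in normal_events:
            continue
        n_nmd, n_free = normal_events[event]
        r_normal = r_isoform(n_nmd, n_free)
        if r_normal > 0:  # skip events with no NMD isoform reads in normal tissue
            ratios.append(r_isoform(t_nmd, t_free) / r_normal)
    return median(ratios)
```

Because both isoforms come from the same gene, gene-level expression differences between tumor and normal tissue cancel out of the usage value.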
Finally, we calculate R_allele as follows:

$$R_{\mathrm{allele}} = \frac{aE_{\mathrm{NMD}}}{aE_{\mathrm{nonNMD}}} \qquad (3)$$
where aE_NMD and aE_nonNMD are the abundances of mRNAs derived from the NMD-inducing and NMD-free alleles of the same gene, respectively. For this analysis, all detected heterozygous nonsense mutations are included (Additional file 4). Again, the ratio of R_allele(tumor) to R_allele(normal) is calculated for each patient to infer the change in NMD activity. This approach has been used in multiple studies to check NMD efficiency [20,21,22]. However, in our analysis we do not detect a significant difference between tumor and normal samples for any patient, probably because of the limited number of sites (Fig. 3; Additional file 1: Table S4). Nevertheless, a weak positive correlation is observed across patients between the median ratios of R_allele(tumor)/R_allele(normal) and R_mRNA(tumor)/R_mRNA(normal), although it does not reach significance at the 0.05 level (Additional file 1: Figure S3A, Rho = 0.2134399, P = 0.07183). Similarly, a slightly stronger correlation is observed between R_allele(tumor)/R_allele(normal) and R_isoform(tumor)/R_isoform(normal) (Additional file 1: Figure S3B, Rho = 0.3099928, P = 0.00805). These results suggest that R_allele can capture changes in NMD activity in tumors, but that more sites are needed to make the metric effective.
Next, we examine how these metrics correlate with the expression of nine NMD effectors: UPF1, UPF2, UPF3A, UPF3B, SMG1, SMG5, SMG6, SMG7, and PNRC2 [23, 24]. In theory, we expect a negative correlation between the metrics of NMD activity and the expression of these factors, because a smaller metric value means stronger NMD activity conferred by higher expression of NMD effectors. We found that R_mRNA is negatively correlated with the expression of SMG1, SMG7, UPF2, and PNRC2, though not always significantly (Additional file 1: Table S5). Surprisingly, R_mRNA is positively correlated with the expression of UPF1, UPF3A, and SMG6 in some cases. These results suggest that NMD effectors may contribute to NMD activity in a discordant way, with some effectors being more important than others. However, it is unclear why some effectors, such as UPF1, are expressed at a higher level where NMD activity appears weaker. We also examined the same correlations using R_isoform; none of them is significant, and the signs of the correlations are not always consistent with expectation (Additional file 1: Table S5). Together, these results suggest that changes in NMD activity may not track the expression changes of all NMD effectors, and thus a better understanding of how NMD effectors act within a regulatory network is needed before NMD activity can be inferred from their expression.
A key step in applying NMD-inhibition-based therapies is to measure NMD activity accurately. In this study, we approach this goal from three angles. (1) We use the expression of many NMD target genes to infer NMD activity. Using only a few genes may be less sensitive in detecting changes in NMD activity, because NMD-targeting efficiency varies among genes and conditions; using many genes provides greater statistical power. (2) We develop three metrics that corroborate each other. Indeed, the three metrics are moderately correlated. In particular, both R_mRNA and R_isoform show consistent and significant results in a few patients, such as LC_S39, LC_C34 and LC_S8. (3) We normalize the metric value in each tumor by that in the corresponding normal tissue, which eliminates biases introduced by factors that affect expression in both normal and tumor samples (but see the next paragraph). For example, if an NMD-inducing allele is expressed at a lower level than the NMD-free allele in both normal and tumor samples because of associated polymorphic cis-regulatory elements, normalization would remove this effect and avoid overestimating NMD activity.
Despite our efforts, the method can be further improved in the future. First, the NMD target gene set can be refined to improve sensitivity. We compiled the gene set based on mRNA expression changes upon NMD inhibition, so the set may include both direct and indirect NMD targets. Indirect targets, however, may not change their expression consistently when NMD activity changes, because their expression may be regulated mainly by pathways other than NMD. Using only direct NMD targets may improve the accuracy of the method. Second, other NMD-inducing features may also be used to measure NMD activity. In our design, we use either the total mRNA expression of NMD target genes or the mRNA isoforms derived from alternative splicing or nonsense-mutation alleles to monitor NMD activity. In theory, the presence of a uORF and the length of the 3′ UTR could also be used to select NMD target mRNAs, whose (relative) expression could then be used to monitor NMD activity. However, these features may not be as robust as the 50/55-bp rule (used for selecting NMD-targeted isoforms in our study); for example, only translated uORFs can trigger NMD. Incorporating these features therefore needs further testing. Third, we assume that NMD activity varies little among normal samples and use it to normalize that in tumors. This assumption may not hold strictly, and normalization may over- or underestimate the true NMD activity in a tumor sample. It is therefore worthwhile to compare the performance of the metrics with and without normalization.
A good approach to measuring NMD activity is valuable for clinical applications, as NMD influences a variety of physiological and pathological processes. For example, the high mutation rate of UPF1 in pancreatic adenosquamous carcinoma and its downregulation in hepatocellular carcinoma indicate that the NMD pathway could be frequently suppressed. An estimate of NMD activity can therefore guide NMD-based therapies. In lung adenocarcinoma, we find that NMD activity can either decrease or increase compared with adjacent normal tissues, suggesting that NMD-inhibition-based therapies may work better in some patients than in others. Furthermore, we do not observe any correlation between NMD activity and tumor regression (Additional file 1: Table S6, P > 0.14), suggesting that the NMD-inhibition-based approach may be applicable to tumors of any stage. An estimate of NMD activity is also informative for diseases other than tumors. For instance, "PTC read-through" drugs have been used in several pilot clinical trials to restore the translation of PTC-containing transcripts in diseases such as cystic fibrosis and Duchenne muscular dystrophy. An estimate of NMD activity in patients can therefore support personalized medicine.
We developed three metrics for inferring NMD activity from RNA-seq data. Among them, R_isoform performs best, owing to the moderate number of target genes used and the use of NMD-free splicing isoforms as a natural control. Our results suggest that NMD activity varies among patients and that the metrics may be used to assess NMD activity in other types of diseases.
- eIF2α: α subunit of eukaryotic initiation factor 2
- EJC: Exon junction complex
- FPKM: Fragments per kilobase of transcript per million mapped reads
- GEO: Gene Expression Omnibus
- NMD: Nonsense-mediated mRNA decay
- PTC: Premature termination codon
- RNA-seq: Whole transcriptome high-throughput sequencing
- ROS: Reactive oxygen species
1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108.
2. Pastor F, Kolonias D, Giangrande PH, Gilboa E. Induction of tumour immunity by targeted inhibition of nonsense-mediated mRNA decay. Nature. 2010;465(7295):227–30.
3. Gilboa E. Expression of new antigens on tumor cells by inhibiting nonsense-mediated mRNA decay. Immunol Res. 2013;57(1–3):44–51.
4. Brogna S, Wen J. Nonsense-mediated mRNA decay (NMD) mechanisms. Nat Struct Mol Biol. 2009;16(2):107–13.
5. Wang D, Zavadil J, Martin L, Parisi F, Friedman E, Levy D, Harding H, Ron D, Gardner LB. Inhibition of nonsense-mediated RNA decay by the tumor microenvironment promotes tumorigenesis. Mol Cell Biol. 2011;31(17):3670–80.
6. Gardner LB. Hypoxic inhibition of nonsense-mediated RNA decay regulates gene expression and the integrated stress response. Mol Cell Biol. 2008;28(11):3729–41.
7. Gardner LB. Nonsense-mediated RNA decay regulation by cellular stress: implications for tumorigenesis. Mol Cancer Res. 2010;8(3):295–308.
8. Liu C, Karam R, Zhou Y, Su F, Ji Y, Li G, Xu G, Lu L, Wang C, Song M, et al. The UPF1 RNA surveillance gene is commonly mutated in pancreatic adenosquamous carcinoma. Nat Med. 2014;20(6):596–8.
9. Seo JS, Ju YS, Lee WC, Shin JY, Lee JK, Bleazard T, Lee J, Jung YJ, Kim JO, Shin JY, et al. The transcriptional landscape and mutational profile of lung adenocarcinoma. Genome Res. 2012;22(11):2109–19.
10. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36.
11. Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, et al. Ensembl 2015. Nucleic Acids Res. 2015;43(Database issue):D662–9.
12. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.
13. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
14. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 2012;6(2):80–92.
15. Yepiskoposyan H, Aeschimann F, Nilsson D, Okoniewski M, Muhlemann O. Autoregulation of the nonsense-mediated mRNA decay pathway in human cells. RNA. 2011;17(12):2108–18.
16. Tani H, Imamachi N, Salam KA, Mizutani R, Ijiri K, Irie T, Yada T, Suzuki Y, Akimitsu N. Identification of hundreds of novel UPF1 target transcripts by direct determination of whole transcriptome stability. RNA Biol. 2012;9(11):1370–9.
17. Schmidt SA, Foley PL, Jeong DH, Rymarquis LA, Doyle F, Tenenbaum SA, Belasco JG, Green PJ. Identification of SMG6 cleavage sites and a preferred RNA cleavage motif by global analysis of endogenous NMD targets in human cells. Nucleic Acids Res. 2015;43(1):309–23.
18. Cho H, Han S, Choe J, Park SG, Choi SS, Kim YK. SMG5-PNRC2 is functionally dominant compared with SMG5-SMG7 in mammalian nonsense-mediated mRNA decay. Nucleic Acids Res. 2013;41(2):1319–28.
19. Lewis BP, Green RE, Brenner SE. Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci U S A. 2003;100(1):189–92.
20. Zetoune AB, Fontaniere S, Magnin D, Anczukow O, Buisson M, Zhang CX, Mazoyer S. Comparison of nonsense-mediated mRNA decay efficiency in various murine tissues. BMC Genet. 2008;9:83.
21. MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science. 2012;335(6070):823–8.
22. Linde L, Boelz S, Nissim-Rafinia M, Oren YS, Wilschanski M, Yaacov Y, Virgilis D, Neu-Yilik G, Kulozik AE, Kerem E, et al. Nonsense-mediated mRNA decay affects nonsense transcript levels and governs response of cystic fibrosis patients to gentamicin. J Clin Invest. 2007;117(3):683–92.
23. Chang Y-F, Imam JS, Wilkinson MF. The nonsense-mediated decay RNA surveillance pathway. Annu Rev Biochem. 2007;76:51–74.
24. Kervestin S, Jacobson A. NMD: a multifaceted response to premature translational termination. Nat Rev Mol Cell Biol. 2012;13(11):700–12.
25. Lindeboom RG, Supek F, Lehner B. The rules and impact of nonsense-mediated mRNA decay in human cancers. Nat Genet. 2016;48:1112–18.
26. Hurt JA, Robertson AD, Burge CB. Global analyses of UPF1 binding and function reveal expanded scope of nonsense-mediated mRNA decay. Genome Res. 2013;23(10):1636–50.
27. Chang L, Li C, Guo T, Wang H, Ma W, Yuan Y, Liu Q, Ye Q, Liu Z. The human RNA surveillance factor UPF1 regulates tumorigenesis by targeting Smad7 in hepatocellular carcinoma. J Exp Clin Cancer Res. 2016;35:8.
28. Keeling KM, Bedwell DM. Suppression of nonsense mutations as a therapeutic approach to treat genetic diseases. Wiley Interdiscip Rev RNA. 2011;2(6):837–52.
We thank Nobuyoshi Akimitsu and Hidenori Tani for providing the expression data of 9229 mRNAs in UPF1 knockdown, and Pei Hao and Jia Sheng for providing computing resources. We are grateful to Octavio M. Palacios-Gimenez and three reviewers for their constructive comments.
This work is supported by the National Natural Science Foundation of China (No. 31471224 and 81030015) and the National Basic Research Program of China (No. 2011CB510100). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Availability of data and materials
The dataset analyzed in this study is available in the NCBI GEO database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE40419.
Ethics approval and consent to participate
Consent for publication
ZZ became an employee of Zymo Research Corp. during the review of the paper; his new employment had no effect on the preparation or publication of the paper.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: This document contains all supplementary Tables and supplementary Figures with legends. (PDF 473 kb)
Additional file 2: This document contains information on NMD target and non-target genes among patients. (TXT 4665 kb)
Additional file 3: This document contains information on the alternative exons that can distinguish NMD target and non-target isoforms among patients. (TXT 845 kb)
Additional file 4: This document contains information on the PTC-inducing SNVs among patients. (TXT 25 kb)