Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations
BMC Medical Genomics volume 13, Article number: 41 (2020)
Abstract
Background
Recent advances in kernel-based Deep Learning models have introduced a new era in medical research. Originally designed for pattern recognition and image processing, Deep Learning models are now applied to survival prognosis of cancer patients. Specifically, Deep Learning versions of the Cox proportional hazards models are trained with transcriptomic data to predict survival outcomes in cancer patients.
Methods
In this study, a broad analysis was performed on TCGA cancers using a variety of Deep Learning-based models, including Cox-nnet, DeepSurv, and a method proposed by our group named AECOX (AutoEncoder with Cox regression network). Concordance index and p-value of the log-rank test are used to evaluate the model performances.
Results
All models show competitive results across 12 cancer types. The last hidden layers of the Deep Learning approaches are lower dimensional representations of the input data that can be used for feature reduction and visualization. Furthermore, the prognosis performances reveal a negative correlation between model accuracy, overall survival time statistics, and tumor mutation burden (TMB), suggesting an association among overall survival time, TMB, and prognosis prediction accuracy.
Conclusions
Deep Learning-based algorithms outperform traditional machine learning-based models. The cancer prognosis results measured by concordance index are indistinguishable across models but highly variable across cancers. These findings shed some light on the relationships between patient characteristics and survival learnability at a pan-cancer level.
Background
With the high prevalence of neural networks and Deep Learning-based algorithms in computational biology, it is clear that the advantages of optimization in a highly nonlinear space are welcome improvements in biomedicine [1,2,3,4,5,6,7]. In bioinformatics, significant effort has been committed to harnessing transcriptomic data for multiple analyses [7,8,9,10,11,12,13], especially cancer survival prognosis [14, 15]. Faraggi and Simon [16] were the first to use clinical information to predict prostate cancer survival through an artificial neural network model. Mobadersany et al. [17] integrated histological features, Convolutional Neural Networks (CNN), and genomics data to predict cancer prognosis via Cox regression. Although various applications to survival analysis already existed, such as [14, 15], the use of Deep Learning Cox models was pioneered by Ching et al. [18], whose Cox regression with neural networks (Cox-nnet) for predicting survival from transcriptomic data has since become prevalent. Similarly, Katzman et al. [19] used DeepSurv with multi-layer neural networks for survival prognosis and developed a personalized treatment recommendation system.
As an effective dimensionality reduction technique, the Autoencoder (AE) framework can produce efficient lower dimensional representations using unsupervised or supervised learning [20,21,22,23,24]. Chaudhary et al. [25] applied an AE for dimensionality reduction and then used the low-dimensional representation of the data to perform prognosis prediction with a traditional method. In this paper, besides two recently developed Deep Learning-based methods, namely Cox-nnet and DeepSurv, we also evaluated an Autoencoder-based approach (called AECOX) for cancer prognosis prediction with simultaneous learning of a lower dimensional representation of the inputs. This approach is similar to Cox-nnet [18] and DeepSurv [19], as it implements neural networks with Cox regression, though the network architectures differ. In AECOX (Fig. 1c), the code from the AE is linked to a Cox regression layer for the prognosis. The losses from both the AE network and the Cox regression layer are used to train all network weights through backpropagation. AECOX has a symmetric Autoencoder structure and can accept any number of hidden layers. We refer readers to Additional file 1 for more detailed settings of AECOX.
To evaluate the prediction performance, we adopt two metrics, namely the concordance index and the p-value of the log-rank test. These metrics are used to compare two state-of-the-art Deep Learning-based prognosis models (i.e., Cox-nnet and DeepSurv) with AECOX in a pan-cancer study covering 12 TCGA (The Cancer Genome Atlas) cancers. In addition, we apply the Partitioning Around Medoids (PAM) clustering algorithm [26] to the last hidden layer of each model to evaluate how well the models discriminate subgroups in the lower dimensional space. The p-value of the log-rank test based on K groups of Kaplan-Meier survival curves is the metric used for this evaluation [27].
Having compared the prognosis prediction performance across 12 cancer types, we asked whether the performance is related to tumor mutation burden and overall survival time. Tumor mutation burden (TMB) is a measurement of mutations in a tumor [28, 29] and is an important genomic marker that is closely associated with immunotherapy and survival prognosis [30,31,32,33,34]. While incorporating TMB as an input feature does not increase the prediction performance, we found that TMB is negatively correlated with overall survival time statistics, and that both are correlated with the concordance index for all three models across cancer types, suggesting an association between TMB, overall survival time, and disease prognosis accuracy.
Overall, we observed comparable results across three different Deep Learning-based cancer survival prognosis models in terms of concordance index. We also investigated the lower dimensional representation produced by the Deep Learning algorithms. By inspecting the relationship between TMB, overall survival statistics, and concordance index across 12 cancer types, we confirmed an association among them, suggesting a future study direction of patient stratification and integrative analysis.
Method
Integrating Cox proportional hazards model with neural networks
The neural network architectures of all three Deep Learning-based approaches are provided in Fig. 1. Cox-nnet (Fig. 1a) is the most succinct model with only one hidden layer, while DeepSurv (Fig. 1b) uses multiple hidden layers of consistent dimensions and treats the number of hidden layers as a hyperparameter. Similarly, AECOX also treats the number of hidden layers as a hyperparameter, but the hidden layers lie symmetrically in the encoder and decoder (Fig. 1c). All three models employ the same Cox proportional hazards model. However, Cox-nnet and DeepSurv pass the output of the last hidden layer to the Cox model, while AECOX uses the low-dimensional code as the input. The output hazard ratio is then compared to the ground truth; the evaluation metrics are described in the Evaluation metrics section. We introduced AECOX to explore the feasibility of simultaneously generating a low-dimensional representation of the data while developing an effective model for prognosis.
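The Cox proportional hazards model, also known as the Cox model, was developed to model the age-specific failure rate or hazard function [35] at time t for patient i with covariate vector X_{i} by
\[ h\left(t|{X}_i\right)={h}_0(t)\exp \left(\sum \limits_{k=1}^K{\beta}_k{X}_{ik}\right), \]
where h_{0}(t) is the baseline hazard. The partial likelihood L_{i} for patient i, which is defined to be the probability of occurrence of a death event at time Y_{i} for patient i, is found to be
\[ {L}_i\left(\beta \right)=\frac{\theta_i}{\sum_{j:{Y}_j\ge {Y}_i}{\theta}_j}, \]
where \( {\theta}_i=\exp \left(\sum \limits_{k=1}^K{\beta}_k{X}_{ik}\right) \) and β = (β_{1}, β_{2}, …, β_{K}) are the K parameters to be estimated. The summation in the denominator is carried out over all patients j (including patient i) for which a death event did not occur before time Y_{i}. The partial likelihood for all patients is then defined as
\[ L\left(\beta \right)=\prod \limits_{i:{C}_i=1}{L}_i\left(\beta \right), \]
where C_{i} = 1 indicates the occurrence of a death event. The log partial likelihood of the Cox model is then obtained as
\[ \log L\left(\beta \right)=\sum \limits_{i:{C}_i=1}\left(\sum \limits_{k=1}^K{\beta}_k{X}_{ik}-\log \sum \limits_{j:{Y}_j\ge {Y}_i}{\theta}_j\right). \]
Values of the parameters β = (β_{1}, β_{2}, …, β_{K}) are then obtained through maximum likelihood estimation (MLE), that is
\[ \hat{\beta}=\underset{\beta }{\arg \max}\log L\left(\beta \right). \]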
Alternatively, since the Cox model utilizes a regression model that can be implemented as a neural network with weights β = (β_{1}, β_{2}, …, β_{K}), values of these weights can be obtained through backpropagation. This approach is embedded in all the aforementioned models and is denoted by the blue line with caption “CoxRegression Neural Network” in Fig. 1.
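As an illustration of the quantity that backpropagation would minimize, the following minimal sketch evaluates the negative log partial likelihood for a handful of patients. It is not the implementation used by any of the three models: the function name and the toy inputs are hypothetical, the risk score stands for the linear predictor (the logarithm of θ_{i}), and tied event times receive no special correction.

```python
import math

def neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox log partial likelihood.

    risk_scores[i] is the linear predictor sum_k beta_k * X_ik (log of theta_i),
    times[i] is the observed time Y_i, and events[i] is C_i (1 = death, 0 = censored).
    """
    total = 0.0
    for score_i, y_i, c_i in zip(risk_scores, times, events):
        if c_i != 1:
            continue  # censored patients contribute only through risk sets
        # Risk set: patients j (including i) whose observed time is at least Y_i.
        log_risk_sum = math.log(sum(
            math.exp(score_j)
            for score_j, y_j in zip(risk_scores, times)
            if y_j >= y_i
        ))
        total += score_i - log_risk_sum
    return -total

# Hypothetical toy data: three patients, all with observed death events.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
loss_good = neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)  # high risk dies first
loss_bad = neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)   # ordering reversed
```

In this toy case the loss is lower when patients with higher predicted risk die earlier, which is the behavior a gradient-based optimizer exploits.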
These models offer several advanced features: (1) a highly nonlinear function is learned, (2) neural networks and Cox proportional hazards regression are integrated, enabling all weights of the models to be learned through backpropagation, (3) the number of hidden layers and the hidden layer dimensions are treated as hyperparameters that can be fine-tuned, and (4) dimensionality reduction in conjunction with supervised learning is achieved.
To demonstrate the advantages of Deep Learning-based prognosis models, we also compared three traditional machine learning-based models for prognosis: the Cox proportional hazards model with the R package “glmnet” [36], Random Survival Forest (RSF) [37], and Support Vector Machine (SVM) [25]. In particular, we implemented the SVM model of Chaudhary et al. [25] using the top 100 mRNA-seq features selected by ANOVA (Analysis of variance) [38].
Regularization, loss functions and hyperparameters
Although the aforementioned Deep Learning-based approaches share the same Cox regression network and use the hazard ratio as the output (Table 1), certain differences exist among the models. All three models used L2 norm regularization in the final learning after hyperparameter tuning, as it gave the optimal validation accuracy. While all models attempted Dropout and L2 norm regularization (Ridge Regularization [39]) to penalize the network weights, AECOX also included L1 norm regularization (Least Absolute Shrinkage and Selection Operator, LASSO in short [40]) and elastic net [41].
The loss functions of the models share a common base formula (Table 1), but each approach uses additional penalization. Specifically, Cox-nnet and DeepSurv use the same objective (loss) function,
whereas AECOX also takes into account the Autoencoder’s input-output difference.
Here Θ denotes the neural networks’ weights to be learned, including the hidden layer weights and the Cox regression neural network weights; X_{input} and X_{output} are the input and output covariate vectors of the Autoencoder, respectively; and MSE(∙) is the mean squared error function. The hyperparameter λ_{1} balances the loss between the Autoencoder’s input-output difference, which is a measure of dimensionality reduction, and the Cox hazard, which is a measure of regression-based supervised learning. The combination of λ_{2} and λ_{3} permits the use of Elastic Net regularization. Forcing λ_{2} = 0 results in L2 regularization, whereas forcing λ_{3} = 0 results in L1 regularization. To optimize the objective functions, Cox-nnet, DeepSurv, and AECOX use Nesterov accelerated gradient descent [42], stochastic gradient descent (SGD) [43], and the adaptive moment estimation (Adam) optimizer [44], respectively. AECOX adopted the Adam optimizer as it is computationally efficient and requires little hyperparameter tuning.
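The description above is consistent with objectives of the following general shape. This is a sketch inferred from the surrounding text, not a reproduction of the article's equations: the symbol \( \hat{h}_i \) for the network's output log hazard ratio, the symbol λ for the Cox-nnet/DeepSurv penalty weight, and the exact way λ_{1} enters the AECOX loss are assumptions, and Table 1 of the article remains the definitive source.

```latex
% Shared base term: negative Cox log partial likelihood of the network output
\[ \mathcal{L}_{\mathrm{Cox}}(\Theta) = -\sum_{i:\,C_i=1}\Bigl(\hat{h}_i - \log\sum_{j:\,Y_j\ge Y_i}\exp\bigl(\hat{h}_j\bigr)\Bigr) \]

% Cox-nnet and DeepSurv (assumed form): base term plus an L2 penalty
\[ \mathcal{L}(\Theta) = \mathcal{L}_{\mathrm{Cox}}(\Theta) + \lambda\,\lVert\Theta\rVert_2^2 \]

% AECOX (assumed form): reconstruction error plus elastic-net penalty
\[ \mathcal{L}(\Theta) = \mathcal{L}_{\mathrm{Cox}}(\Theta)
   + \lambda_1\,\mathrm{MSE}\bigl(X_{input}, X_{output}\bigr)
   + \lambda_2\,\lVert\Theta\rVert_1 + \lambda_3\,\lVert\Theta\rVert_2^2 \]
```

Under this reading, setting λ_{2} = 0 leaves only the L2 penalty and setting λ_{3} = 0 leaves only the L1 penalty, which matches the behavior stated in the text.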
As shown in Table 1, Cox-nnet has one hyperparameter to be fine-tuned, and thus a linear search technique was adopted, whereas DeepSurv and AECOX have multiple hyperparameters in a high-dimensional space. It is unrealistic to perform a linear search in each dimension of the hyperparameter space, as the computational complexity would be O(n^{p}) for p hyperparameters. Instead, DeepSurv and AECOX utilize the Sobol solver [45] in the Optunity Python package [46]. Given a search budget q (e.g., q = 100), the Sobol solver samples q points assuming the hyperparameters are uniformly distributed in p-dimensional space. This reduces the computational complexity to O(nq), regardless of how large the value of p is.
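The contrast between an exhaustive grid and a fixed budget of quasi-random points can be sketched as follows. Note the substitution: this example uses a Halton sequence, a different low-discrepancy sequence, as a stand-in for the Sobol solver in Optunity, because it fits in a few lines of standard-library Python. The search space, the function names, and the reading of n as the number of candidate values per hyperparameter are all assumptions made for illustration.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base, in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value

def halton_points(q, bounds, bases=(2, 3, 5, 7, 11, 13)):
    """Return q quasi-random points inside the box described by bounds.

    bounds is a list of (low, high) pairs, one per hyperparameter.
    """
    if len(bounds) > len(bases):
        raise ValueError("add more prime bases for this many hyperparameters")
    points = []
    for index in range(1, q + 1):
        points.append([
            low + (high - low) * van_der_corput(index, base)
            for (low, high), base in zip(bounds, bases)
        ])
    return points

# Hypothetical search space: log10 learning rate, dropout rate, log10 L2 weight.
bounds = [(-5.0, -1.0), (0.0, 0.5), (-6.0, -2.0)]
candidates = halton_points(100, bounds)

n, p = 10, len(bounds)
grid_evaluations = n ** p              # exhaustive grid: grows exponentially with p
sampled_evaluations = len(candidates)  # fixed budget q, independent of p
```

Each candidate would then be scored on the validation set, and the best-scoring point kept.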
Data preprocessing and statistics
Genes with lowest 20% absolute expression values and lowest 10% variance across samples were removed. This denoising step was performed via the TSUNAMI package (https://apps.medgen.iupui.edu/rsc/tsunami/) [15], ensuring model robustness and reducing irrelevant noise.
The expression data were then rescaled with a natural logarithm operation, where X_{original} denotes the original non-negative RNA sequencing expression values (Illumina HiSeq RNA-seq v2 RSEM normalized) and X_{input} the input covariate vector for the models.
Subsequently, each gene expression at row r in the input data was normalized.
This step ensured that each row of the gene expression contributed to the model on an equal scale.
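A minimal sketch of these two steps is given below. The exact formulas are assumptions, since they are not reproduced in the text here: the logarithm is taken as log(x + 1) so that zero counts stay finite, and the row normalization is taken as a z-score of each gene across samples. The function names and the toy matrix are hypothetical.

```python
import math
import statistics

def log_transform(matrix):
    """Apply log(x + 1) to every non-negative expression value (assumed form)."""
    return [[math.log(value + 1.0) for value in row] for row in matrix]

def normalize_rows(matrix):
    """Z-score each gene (row) across samples (assumed normalization)."""
    normalized = []
    for row in matrix:
        mean = statistics.fmean(row)
        std = statistics.pstdev(row)
        if std < 1e-12:
            # A constant gene carries no signal; map it to zeros.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(value - mean) / std for value in row])
    return normalized

# Toy matrix: 2 genes (rows) x 4 samples (columns) of non-negative values.
expression = [[0.0, 10.0, 100.0, 1000.0],
              [5.0, 5.0, 5.0, 5.0]]
x_input = normalize_rows(log_transform(expression))
```

After this step every non-constant gene has zero mean and unit standard deviation, so no single gene dominates the input because of its raw magnitude.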
Table 2 provides a summary of the median and range of age and survival months for the TCGA data. Each dataset was split into training, validation, and testing sets in proportions of 60, 20, and 20%, respectively. Confounding effects [47] were minimized by randomly shuffling the data 1000 times and choosing the 5 training/validation/testing splits with the lowest corresponding differences. The difference that was minimized is the sum of the standard deviations, across the training/validation/testing sets, of (1) the male/female ratio, (2) the standard deviation of overall survival time, (3) the mean of overall survival time, (4) the ratio of the deceased group to the whole population, and (5) the ratio of tumor stages to the whole population. Thus, survival prognosis was estimated 5 times for each cancer type.
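The split selection described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the patient records and field names are hypothetical, the fraction of male patients replaces the male/female ratio to avoid division by zero, and the tumor stage term is omitted for brevity.

```python
import random
import statistics

def split_indices(order):
    """Cut a shuffled index list into 60% / 20% / 20% parts."""
    n = len(order)
    a, b = n * 6 // 10, n * 8 // 10
    return order[:a], order[a:b], order[b:]

def imbalance(patients, parts):
    """Sum, over summary statistics, of the std. dev. across the three subsets."""
    def spread(stat):
        return statistics.pstdev([stat([patients[i] for i in part]) for part in parts])
    return (
        spread(lambda g: sum(p["male"] for p in g) / len(g))         # fraction male
        + spread(lambda g: statistics.pstdev(p["time"] for p in g))  # survival-time std
        + spread(lambda g: statistics.fmean(p["time"] for p in g))   # survival-time mean
        + spread(lambda g: sum(p["event"] for p in g) / len(g))      # deceased fraction
    )

def best_splits(patients, n_shuffles=1000, keep=5, seed=0):
    """Return the `keep` most balanced (score, parts) pairs."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_shuffles):
        order = list(range(len(patients)))
        rng.shuffle(order)
        parts = split_indices(order)
        scored.append((imbalance(patients, parts), parts))
    scored.sort(key=lambda item: item[0])
    return scored[:keep]

# Hypothetical synthetic cohort of 50 patients.
rng = random.Random(1)
patients = [
    {"male": rng.randint(0, 1), "time": rng.uniform(1, 120), "event": rng.randint(0, 1)}
    for _ in range(50)
]
selected = best_splits(patients, n_shuffles=200)
```

Each of the five selected splits would then be used for one independent training, validation, and testing run.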
In this study, TCGA mutation annotation files (MAFs), containing subsets of the patients for prognosis tasks, were used to calculate TMB summary statistics, including mean, median, max, and 20, 10, 5% tail cut values. These characteristics were used for examining correlation between TMB and concordance index.
Evaluation metrics
We evaluated model performance with the concordance index and the p-value of the log-rank test. The concordance index has been widely used for evaluating survival prognosis models [48,49,50]. Its value ranges from 0 to 1 and it describes how well a model differentiates groups (censored and uncensored groups, or living and deceased groups) [50,51,52,53]. A concordance index of 0.5 indicates that a model is ineffective, that is, its predictions are no better than random with respect to the ground truth. Values above 0.5 indicate improved prediction, with performance increasing as the concordance index approaches 1. Values below 0.5 indicate that a model predicts the opposite of the ground truth. Higher concordance index values therefore indicate a better capability of a model to perform cancer survival prognosis.
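For concreteness, the sketch below computes the concordance index with a common pairwise definition (Harrell's C). The text does not state which variant or library was used, so this definition, the function name, and the toy data are assumptions.

```python
def concordance_index(risk_scores, times, events):
    """Fraction of comparable patient pairs ordered correctly by predicted risk.

    A pair (i, j) is comparable when patient i has an observed event and
    times[i] < times[j]; it is concordant when risk_scores[i] > risk_scores[j].
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: four patients, the third one censored.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
c_perfect = concordance_index([4.0, 3.0, 2.0, 1.0], times, events)   # risk decreases with time
c_reversed = concordance_index([1.0, 2.0, 3.0, 4.0], times, events)  # opposite ordering
c_constant = concordance_index([0.0, 0.0, 0.0, 0.0], times, events)  # uninformative model
```

The three calls illustrate the three regimes described above: a perfectly ordered prediction, a prediction opposite to the ground truth, and an uninformative one.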
P-values were derived by dichotomizing the hazard ratios at the median value and performing log-rank tests [54,55,56] between the resulting high-risk and low-risk groups. A lower p-value represents an enhanced ability to distinguish the two patient groups.
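The dichotomize-then-test procedure can be sketched with a standard two-group log-rank test written in plain Python. This is an illustrative implementation, not the one used in the study; the function names and the toy data are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math
import statistics

def logrank_p_value(times, events, in_group_one):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Numbers at risk just before time t in each group.
        n1 = sum(1 for ti, g in zip(times, in_group_one) if ti >= t and g)
        n2 = sum(1 for ti, g in zip(times, in_group_one) if ti >= t and not g)
        # Deaths at time t in each group.
        d1 = sum(1 for ti, e, g in zip(times, events, in_group_one)
                 if ti == t and e == 1 and g)
        d2 = sum(1 for ti, e, g in zip(times, events, in_group_one)
                 if ti == t and e == 1 and not g)
        n, d = n1 + n2, d1 + d2
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank statistic is undefined for these groups")
    chi_square = (observed - expected) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi_square, math.erfc(math.sqrt(chi_square / 2.0))

def dichotomize_by_median(risk_scores):
    """True marks the high-risk group (predicted risk above the median)."""
    median = statistics.median(risk_scores)
    return [score > median for score in risk_scores]

# Hypothetical toy data: eight patients, all deceased, risk decreasing with time.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1] * 8
risk_scores = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
groups = dichotomize_by_median(risk_scores)
chi_square, p_value = logrank_p_value(times, events, groups)
```

In this toy case the high-risk half dies strictly before the low-risk half, so the test separates the two groups clearly.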
To evaluate the performances across cancer types and across model types, two-way ANOVA [38] was adopted. Pairwise paired t-tests [57, 58] and the linear mixed-effects models test from the R package “nlme” [59, 60] were also used. The linear mixed-effects models test compares pairs of models while accounting for random effects. The mixed-effects model assumed the data (performances) to be dependent within each cancer type and independent across cancer types.
Results
The performance comparison was conducted at the pan-cancer level using 12 cancers from The Cancer Genome Atlas (TCGA). These 12 cancers were chosen due to their relatively large sample sizes and sufficient information about patient outcomes. The specific cancers analyzed in this paper were (1) Urothelial Bladder Carcinoma (BLCA); (2) Breast Invasive Carcinoma (BRCA); (3) Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC); (4) Head-Neck Squamous Cell Carcinoma (HNSC); (5) Kidney Renal Clear Cell Carcinoma (KIRC); (6) Kidney Renal Papillary Cell Carcinoma (KIRP); (7) Liver Hepatocellular Carcinoma (LIHC); (8) Lung Adenocarcinoma (LUAD); (9) Lung Squamous Cell Carcinoma (LUSC); (10) Ovarian Cancer (OV); (11) Pancreatic Adenocarcinoma (PAAD); and (12) Stomach Adenocarcinoma (STAD). In this paper, we used the expression data of Illumina HiSeq RNA-seq v2 RSEM normalized genes from TCGA.
Performance comparison
Figure 2a and b present concordance indices and p-values of log-rank tests among the different models and different cancer datasets, wherein the cancers on the x-axis are sorted by the concordance index values averaged over all models and experiments. Models for cancers such as KIRP, BRCA, and LIHC yield median concordance indices of at least 0.7, whereas cancers such as STAD and LUSC yield median concordance indices of approximately 0.5. This led to our further investigation of tumor mutation burden (TMB) and overall survival time. We also compared three traditional machine learning models (Fig. 2c, d). The results in Fig. 2 are presented in two parts so that the comparison between Deep Learning-based models and traditional machine learning-based models can be visualized directly in Fig. 2c and d.
Since five experiments were carried out for each cancer type and each model type, we compared the performances (via concordance index and p-value of the log-rank test) for all 12 TCGA cancer types using pairwise paired t-tests among all models (Table 3a) and the linear mixed-effects models test (Table 3b). We considered a model to be better than another if a higher concordance index or a lower p-value of the log-rank test was observed. Thus, a positive t-statistic in Table 3a or a positive coefficient in Table 3b was used to conclude that the model (distribution 1) was better than the other (distribution 2) with respect to the concordance index. For the p-value of the log-rank test, a negative t-statistic or coefficient was used to reach the same inference.
As can be observed from Table 3b, all models have similar performance, since most of the linear mixed-effects models test results are not significant. Both Table 3a and Table 3b indicate that, among the Deep Learning-based approaches, Cox-nnet provided the overall best survival prognosis results at the pan-cancer level with respect to the concordance index and the p-value of the log-rank test. This advantage of Cox-nnet is due to a simpler neural network architecture and a reduced search space for hyperparameters. Additional file 1: Tables S10-S11 present the same quantitative comparison of performances for Deep Learning-based and traditional machine learning models. All three Deep Learning models outperformed the traditional machine learning models, suggesting the advantages of Deep Learning approaches for prognosis prediction.
Lower dimensional representation
The final hidden layer (or the code in AECOX), highlighted in orange in Fig. 1, produces a lower dimensional representation of the input, which is one of the intrinsic properties of Deep Learning-based algorithms [18, 21, 61, 62]. By applying the Partitioning Around Medoids (PAM) clustering algorithm [26] to the output of the last hidden layer after the network is trained, we can inspect the original covariate vector in a lower dimensional space. The most suitable number of clusters (ranging from 2 to 10) was determined by maximizing the averaged silhouette score [63, 64]. As shown in Table 4, Cox-nnet had overall better p-values of the log-rank test measured between different clusters, indicating a better capacity for dimensionality reduction for 9 cancers (KIRP, KIRC, LIHC, BRCA, CESC, LUAD, HNSC, OV, LUSC).
Relationship between prognosis prediction performances and tumor mutation burden
From the performances within a cancer type across models in Fig. 2 and the results in Table 3, all models achieve respectable performances as measured by concordance index. We also found that performance (concordance index) was more significantly associated with cancer type than with algorithm (two-way ANOVA: cancer type p-value < 2E-16, model type p-value = 9.57E-02). This observation suggests that intrinsic characteristics of different cancer types have a large influence on the performance of prognosis models. One such characteristic is the tumor mutation burden (TMB), which is known to vary widely between different types of cancers.
TMB is increasingly used as a marker for predicting the efficacy of immunotherapy [33] and has also been shown to be a predictor of prognosis [34]. Since the ability to train a cancer survival prognosis model varies significantly across cancer types, we explored whether TMB is associated with these changes. By inspecting the mutation information associated with different cancer types, we observed that the performance of survival prognosis models was associated with TMB characteristics. Specifically, all TMB characteristics were negatively correlated with concordance index, especially the mean TMB (Mean TMB: Pearson ρ = −0.45 (Fig. 3b); Median TMB: Pearson ρ = −0.30; Maximum TMB: Pearson ρ = −0.40; 20% tail TMB: Pearson ρ = −0.32; 10% tail TMB: Pearson ρ = −0.32; 5% tail TMB: Pearson ρ = −0.30).
One interesting question is whether incorporating TMB into the model would enhance the model performance. To investigate this, we took the subset of patients who have both RNA-seq data and TMB data and performed survival prognosis with the Cox-nnet model (the method with the best performance) with and without the TMB feature. As shown in Fig. 4, although there is a slight improvement in concordance index (average value = 0.003419) after the TMB feature is incorporated, the correlation between the improvement in concordance index (mean) and the mean TMB values is 0.0688 across 12 TCGA cancers, suggesting that introducing the TMB feature into an mRNA-seq-based learning model does not substantially improve the performance of Cox-nnet.
Next, we found that the correlation between the mean of overall survival times and the mean of TMB values is −0.6853 (Pearson) and −0.7133 (Spearman) across 12 cancers, and the correlation between the variance of overall survival times and the variance of TMB values is −0.6159 (Pearson) and −0.2448 (Spearman), suggesting a strong correlation between higher TMB and shorter overall survival time statistics. The correlation between the mean of overall survival times and the mean of concordance index is 0.4271 (Pearson) and 0.4126 (Spearman).
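Because both Pearson and Spearman coefficients are reported, a short sketch of how the two differ may help. The functions below are generic textbook definitions written for illustration; the function names and the numbers in the example are hypothetical and are not the study's data.

```python
import math

def pearson(x, y):
    """Pearson correlation: linear association between the raw values."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Average ranks (1-based), so that tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical per-cancer summaries: a monotone but nonlinear decreasing relation.
mean_tmb = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_survival = [100.0, 50.0, 33.0, 25.0, 20.0]
r_pearson = pearson(mean_tmb, mean_survival)
r_spearman = spearman(mean_tmb, mean_survival)
```

Spearman depends only on the ordering of the values, so it reaches −1 for any strictly decreasing relation, whereas Pearson is weaker when the relation is not linear. This is one reason the two coefficients can differ noticeably on a small set of cancer types.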
Discussion
Overall, our study demonstrated that the Deep Learning architecture can be effectively applied to cancer prognosis prediction with the Cox proportional hazards model incorporated. We found that Deep Learning-based models outperformed traditional machine learning models. Among the three Deep Learning-based models tested, Cox-nnet, which has the most succinct neural network structure, resulted in better prognosis performances as measured by concordance index and p-value of the log-rank test. We showed that integrating an autoencoder with the Cox regression network does not significantly improve the prognosis performances. These results highlight an important issue in Deep Learning approaches: simpler models often perform similarly to or better than more complex models on biological data.
From the fine-tuned hyperparameters (Additional file 1: Tables S5-S9) obtained during hyperparameter tuning (with optimal validation accuracy), we found that Deep Learning-based algorithms and traditional machine learning-based algorithms, especially those with multiple hyperparameters, tend to converge to different local minima with different hyperparameter values. For example, the optimal parameter pairs of AECOX are not consistent across the five folds even when the experiments are from the same cancer (e.g., TCGA BRCA). This result may be due to the curse of dimensionality [65]: with a limited number of training samples and a large number of parameters (e.g., the hidden layer weights in Deep Learning-based models), the optimization is not guaranteed to converge to the same local minimum. These observations lead us to rethink the robustness of the training procedure, especially since higher performances are observed for Cox-nnet, which requires the least hyperparameter tuning effort.
We also noticed a negative correlation between TMB values and prognosis prediction performances. The relationship between TMB and prognosis has been examined in the existing cancer biology literature for individual cancer types. For example, Owada-Ozaki et al. [66] examined the relationship between individual TMB and prognosis and concluded that high TMB is a poor prognostic factor in non-small cell lung cancer (NSCLC). A similar pattern occurs between TMB and prognosis specifically for lung adenocarcinomas (a subtype of NSCLC) (Naidoo et al. [67]). Our pan-cancer analyses agree with these findings yet lead to a different conclusion. We observed that TMB is correlated with prognosis performance (concordance index); however, integrating TMB into the Cox-nnet model does not improve the performance at the pan-cancer level. By further examining the relationships behind these features and results, we found that TMB is highly correlated with overall survival times (both in mean and variance) across cancer types. Specifically, a lower TMB value is associated with a longer mean overall survival time, from which we conclude that TMB is a marker for tumor malignancy. These findings lead us to speculate that TMB either affects or is affected by overall survival time, but may not directly contribute to prognosis prediction when gene expression data are used. However, shorter overall survival times, which are strongly correlated with TMB, lead to worse prognosis performance, suggesting a direct relationship between overall survival statistics and prognosis performance. These findings will guide us in designing future experiments to further explain the detailed relationships, especially the dependency among TMB, survival times, and prognosis performances at the pan-cancer level.
Conclusion
Bringing artificial intelligence into clinical and cancer studies [6, 68,69,70] can reveal numerous insights behind the data. In this paper, we focused on three different Deep Learning-based cancer prognosis models. The survival predictions were conducted across 12 TCGA cancer types with sufficient numbers of patients and survival information. We found that Deep Learning-based algorithms outperform traditional machine learning-based models. We also found, by two-way ANOVA, that the cancer prognosis results measured by concordance index are indistinguishable across models but highly variable across cancers. The highest concordance index was observed for renal papillary cell carcinoma (KIRP), while the lowest was observed for lung squamous cell carcinoma (LUSC). We then examined the relationships between TMB statistics, overall survival statistics, and concordance indices across 12 cancers. We found that although TMB is negatively correlated and overall survival time is positively correlated with concordance indices across the cancer types, integrating TMB does not significantly improve the prognosis prediction performance for individual cancers, whereas TMB has a strong correlation with overall survival times. These findings will guide us to explore the relationships between patient characteristics and survival learnability at a pan-cancer level in future work.
Availability of data and materials
The results shown here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. All mRNA-seq data used as inputs to the models were taken from the illuminahiseq_rnaseqv2RSEM_genes_normalized transcriptomic data of Broad GDAC Firehose (https://gdac.broadinstitute.org/).
Abbreviations
ANOVA: Analysis of variance
MLE: Maximum likelihood estimation
PAM: Partitioning around medoids
TCGA: The Cancer Genome Atlas
TMB: Tumor mutation burden
TSUNAMI: Translational Bioinformatics Tool Suite For Network Analysis And Mining
References
 1.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
 2.
Min S, Lee B, Yoon S. Deep learning in bioinformatics. Brief Bioinform. 2017;18(5):851–69.
 3.
Leung MK, Xiong HY, Lee LJ, Frey BJ. Deep learning of the tissue-regulated splicing code. Bioinformatics. 2014;30(12):i121–9.
 4.
Chen Y, Li Y, Narayan R, Subramanian A, Xie X. Gene expression inference with deep learning. Bioinformatics. 2016;32(12):1832–9.
 5.
Alipanahi B, Delong A, Weirauch MT, Frey BJ. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol. 2015;33(8):831–8.
 6.
Huang Z, Zhan XH, Xiang SN, Johnson TS, Helm B, Yu CY, Zhang J, Salama P, Rizkalla M, Han Z, et al. SALMON: survival analysis learning with multi-Omics neural networks on breast Cancer. Front Genet. 2019;10.
 7.
Johnson TS, Li SH, Franz E, Huang Z, Li SYD, Campbell MJ, Huang K, Zhang Y. PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers. Gigascience. 2019;8(5),giz046:113.
 8.
Yu CY, Xiang S, Huang Z, Johnson TS, Zhan X, Han Z, Abu Zaid MI, Huang K. Gene Coexpression Network and Copy Number Variation Analyses Identify Transcription Factors Involved in Multiple Myeloma Progression. Front Genet. 2019;10:468.
 9.
Feng C, Huang H, Huang S, Zhai YZ, Dong J, Chen L, Huang Z, Zhou X, Li B, Wang LL, et al. Identification of potential key genes associated with severe pneumonia using mRNA-seq. Exp Ther Med. 2018;16(2):758–66.
 10.
Huang S, Feng C, Chen L, Huang Z, Zhou X, Li B, Wang LL, Chen W, Lv FQ, Li TS. Molecular mechanisms of mild and severe pneumonia: insights from RNA sequencing. Med Sci Monit. 2017;23:1662–73.
 11.
Xiang S, Huang Z, Wang T, Han Z, Yu CY, Ni D, Huang K, Zhang J. Condition-specific gene coexpression network mining identifies key pathways and regulators in the brain tissue of Alzheimer's disease patients. BMC Med Genet. 2018;11(Suppl 6):115.
 12.
Zhan XH, Cheng J, Huang Z, Han Z, Helm B, Liu XW, Zhang J, Wang TF, Ni D, Huang K. Correlation analysis of histopathology and Proteogenomics data for breast Cancer. Mol Cell Proteomics. 2019;18:S37–51.
 13.
Helm BR, Zhan X, Pandya PH, Murray ME, Pollok KE, Renbarger JL, Ferguson MJ, Han Z, Ni D, Zhang J, et al. Gene Co-Expression Networks Restructured Gene Fusion in Rhabdomyosarcoma Cancers. Genes-Basel. 2019;10(9):665.
 14.
Huang S, Yang H, Li Y, Feng C, Gao L, Gf C, Hh G, Huang Z, Yh L, Yu L. Prognostic significance of mixed-lineage leukemia (MLL) gene detected by real-time fluorescence quantitative PCR assay in acute myeloid leukemia. Med Sci Monit. 2016;22:3009.
 15.
Shao W, Wang T, Huang Z, Cheng J, Han Z, Zhang D, Huang K. Diagnosis-Guided Multi-modal Feature Selection for Prognosis Prediction of Lung Squamous Cell Carcinoma. In: International Conference on Medical Image Computing and Computer-Assisted Intervention: 13-17 October 2019. Shenzhen: Springer; 2019. p. 113–21.
 16.
Faraggi D, Simon R. A neural network model for survival data. Stat Med. 1995;14(1):73–82.
 17.
Mobadersany P, Yousefi S, Amgad M, Gutman DA, Barnholtz-Sloan JS, Vega JEV, Brat DJ, Cooper LAD. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A. 2018;115(13):E2970–9.
 18.
Ching T, Zhu X, Garmire LX. Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput Biol. 2018;14(4):e1006076.
 19.
Katzman JL, Shaham U, Cloninger A, Bates J, Jiang TT, Kluger Y. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol. 2018;18:24.
 20.
Liou CY, Cheng WC, Liou JW, Liou DR. Autoencoder for words. Neurocomputing. 2014;139:84–96.
 21.
Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science. 2006;313(5786):504–7.
 22.
Van der Maaten L, Postma E, Van den Herik J. Dimensionality reduction: a comparative review. J Mach Learn Res. 2009;10:66–71.
 23.
Sakurada M, Yairi T. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In: Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis: 2014: ACM; 2014. p. 4.
 24.
Wang W, Huang Y, Wang YZ, Wang L. Generalized Autoencoder: A Neural Network Framework for Dimensionality Reduction. 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); 2014. p. 496.
 25.
Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res. 2018;24(6):1248–59.
 26.
Kaufman L, Rousseeuw PJ. Partitioning around medoids (program pam). Finding groups in data: an introduction to cluster analysis; 1990. p. 68–125.
 27.
Efron B. Logistic regression, survival analysis, and the Kaplan-Meier curve. J Am Stat Assoc. 1988;83(402):414–25.
 28.
Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Borresen-Dale AL, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415–21.
 29.
Yuan J, Hegde PS, Clynes R, Foukas PG, Harari A, Kleen TO, Kvistborg P, Maccalli C, Maecker HT, Page DB, et al. Novel technologies and emerging biomarkers for personalized cancer immunotherapy. J Immunother Cancer. 2016;4:3.
 30.
Birkbak NJ, Kochupurakkal B, Izarzugaza JM, Eklund AC, Li Y, Liu J, Szallasi Z, Matulonis UA, Richardson AL, Iglehart JD. Tumor mutation burden forecasts outcome in ovarian cancer with BRCA1 or BRCA2 mutations. PLoS One. 2013;8(11):e80023.
 31.
Chalmers ZR, Connelly CF, Fabrizio D, Gay L, Ali SM, Ennis R, Schrock A, Campbell B, Shlien A, Chmielecki J, et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med. 2017;9(1):34.
 32.
Spigel DR, Schrock AB, Fabrizio D, Frampton GM, Sun J, He J, Gowen K, Johnson ML, Bauer TM, Kalemkerian GP. Total mutation burden (TMB) in lung cancer (LC) and relationship with response to PD-1/PD-L1 targeted therapies. In: American Society of Clinical Oncology; 2016.
 33.
Goodman AM, Kato S, Bazhenova L, Patel SP, Frampton GM, Miller V, Stephens PJ, Daniels GA, Kurzrock R. Tumor mutational burden as an independent predictor of response to immunotherapy in diverse cancers. Mol Cancer Ther. 2017;16(11):2598–608.
 34.
Simpson D, Ferguson R, Martinez CN, Kazlow E, Moran U, Heguy A, Hanniford D, Hernando E, Osman I, Kirchhoff T. Mutation burden as a potential prognostic marker of melanoma progression and survival. In: American Society of Clinical Oncology; 2017.
 35.
Cox D. Regression models and life tables. J Royal Stat Soc Series B. 1972;34:187–202.
 36.
Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox's proportional hazards model via coordinate descent. J Stat Softw. 2011;39(5):1–13.
 37.
Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Stat. 2008;2(3):841–60.
 38.
Anderson MJ. A new method for nonparametric multivariate analysis of variance. Austral Ecology. 2001;26(1):32–46.
 39.
Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. 2000;42(1):80–6.
 40.
Tibshirani R. Regression shrinkage and selection via the Lasso. J Royal Stat Soc Series B (Methodological). 1996;58(1):267–88.
 41.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J Royal Stat Soc Series B (Statistical Methodology). 2005;67:301–20.
 42.
Nitanda A. Stochastic proximal gradient descent with acceleration techniques. In: Advances in Neural Information Processing Systems, vol. 2014; 2014. p. 1574–82.
 43.
Bottou L. Large-Scale Machine Learning with Stochastic Gradient Descent. Compstat'2010: 19th International Conference on Computational Statistics; 2010. p. 177–86.
 44.
Kingma DP, Ba JL. Adam: A method for stochastic optimization. In: Proc 3rd Int Conf Learn Representations; 2014. p. 2014.
 45.
Sobol IM. Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys. 1976;16(5):236–42.
 46.
Claesen M, Simm J, Popovic D, Moreau Y, De Moor B. Easy hyperparameter search using Optunity. arXiv preprint; 2014.
 47.
Pourhoseingholi MA, Baghestani AR, Vahedi M. How to control confounding effects by statistical analysis. Gastroenterol Hepatol Bed Bench. 2012;5(2):79.
 48.
Brentnall AR, Cuzick J. Use of the concordance index for predictors of censored survival data. Stat Methods Med Res. 2018;27(8):2359–73.
 49.
Mayr A, Schmid M. Boosting the Concordance Index for Survival Data - A Unified Framework To Derive and Evaluate Biomarker Combinations. PLoS One. 2014;9(1):e84483.
 50.
Gerds TA, Kattan MW, Schumacher M, Yu C. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat Med. 2013;32(13):2173–84.
 51.
Mann HB, Whitney DR. On a test of whether one of two random variables is stochastically larger than the other. Ann Mathematical Stat. 1947;18(1):50–60.
 52.
Wilcoxon F. Individual comparisons by ranking methods. Biom Bull. 1945;1(6):80–3.
 53.
Steck H, Krishnapuram B, Dehing-Oberije C, Lambin P, Raykar VC. On ranking in survival analysis: bounds on the concordance index. In: Advances in neural information processing systems, vol. 2008; 2008. p. 1209–16.
 54.
Mantel N. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep. 1966;50(3):163–70.
 55.
Peto R, Peto J. Asymptotically efficient rank invariant test procedures. J Royal Stat Soc Series A. 1972;135(2):185–207.
 56.
Harrington D. Linear rank tests in survival analysis. Encyclopedia Biostatist. 2005;4:113.
 57.
Hsu H, Lachenbruch PA. Paired t test. Wiley StatsRef: Statistics Reference Online; 2014.
 58.
David HA, Gunnink JL. The paired t test under artificial pairing. Am Stat. 1997;51(1):9–12.
 59.
Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team. Linear and nonlinear mixed effects models. 2007;3(57):1–89.
 60.
Reese RA, Welsh KB, Galecki AT. Linear mixed models: a practical guide using statistical software. J Royal Stat Soc Series A (Statistics in Society). 2008;171:318.
 61.
Fodor IK. A survey of dimension reduction techniques, vol. 9. Center for Applied Scientific Computing, Lawrence Livermore National Laboratory; 2002. p. 1–18.
 62.
Tan SF, Mavrovouniotis ML. Reducing data dimensionality through optimizing neural-network inputs. AIChE J. 1995;41(6):1471–80.
 63.
Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65.
 64.
Kodinariya TM, Makwana PR. Review on determining number of Cluster in K-Means Clustering. Int J. 2013;1(6):90–5.
 65.
Poggio T, Mhaskar H, Rosasco L, Miranda B, Liao Q. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review. Int J Autom Comput. 2017;14(5):503–19.
 66.
Owada-Ozaki Y, Muto S, Takagi H, Inoue T, Watanabe Y, Fukuhara M, Yamaura T, Okabe N, Matsumura Y, Hasegawa T, et al. Prognostic impact of tumor mutation burden in patients with completely resected non-small cell lung cancer: brief report. J Thorac Oncol. 2018;13(8):1217–21.
 67.
Naidoo J, Wang X, Woo KM, Iyriboz T, Halpenny D, Cunningham J, Chaft JE, Segal NH, Callahan MK, Lesokhin AM, et al. Pneumonitis in Patients Treated With Anti-Programmed Death-1/Programmed Death Ligand 1 Therapy. J Clin Oncol. 2017;35(7):709.
 68.
Huang Z, Han Z, Parwani A, Huang K, Li ZB. Predicting response to neoadjuvant chemotherapy in HER2-positive breast cancer using machine learning models with combined tissue imaging and clinical features. Laboratory investigation. 2019;99.
 69.
Huang Z, Tgavalekos K, Zhao C. 221: AI-driven forecasting of mean pulmonary artery pressure for the management of cardiac patients. Crit Care Med. 2020;48(1):93.
 70.
Wang T, Johnson TS, Shao W, Lu Z, Helm BR, Zhang J, Huang K. BERMUDA: a novel deep transfer learning method for single-cell RNA sequencing batch correction reveals hidden high-resolution cellular subtypes. Genome Biol. 2019;20(1):115.
Acknowledgements
Not applicable.
About this supplement
This article has been published as part of BMC Medical Genomics Volume 13 Supplement 5, 2020: The International Conference on Intelligent Biology and Medicine (ICIBM) 2019: Computational methods and application in medical genomics (part 1). The full contents of the supplement are available online at https://bmcmedgenomics.biomedcentral.com/articles/supplements/volume-13-supplement-5.
Funding
This work is partially supported by IUSM startup fund (SC, CZ), Indiana University Precision Health Initiative (KH, ZJ, ZHuang, ZHan), NLM F31 Fellowship (TJ), and Shenzhen Peacock Plan KQTD2016053112051497 (JC, SX, XZ). Publication costs are funded by the Indiana University Precision Health Initiative.
Author information
Contributions
ZHuang conceived and designed the algorithm and analysis, conducted the experiments, and wrote the paper. TJ, Zhan, CY, JC, SX, XZ collected the data. TJ, BH, KH edited the paper. SC, CZ, JZ, PS, MR, ZHan, KH provided research guidance. PS, KH supervised the project. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1:
Figure S1. An example framework of the AECOX model with four hidden layers.
Table S1. The network design of AECOX.
Table S2. The hyperparameters of AECOX to be searched.
Table S3. Performance on the testing set of the TCGA Kidney Renal Clear Cell Carcinoma (KIRC) dataset. Bold text indicates the optimal result among all models.
Table S4. Individual model correlations (Pearson ρ) of mean TMB (Fig. 2).
Figure S2. Relationship between concordance index and median TMB. Pearson ρ = −0.30 (p-value = 7.75E-02).
Figure S3. Relationship between concordance index and max TMB. Pearson ρ = −0.40 (p-value = 1.68E-02).
Figure S4. Relationship between concordance index and 20% tail TMB. Pearson ρ = −0.32 (p-value = 5.51E-02).
Figure S5. Relationship between concordance index and 10% tail TMB. Pearson ρ = −0.32 (p-value = 5.93E-02).
Figure S6. Relationship between concordance index and 5% tail TMB. Pearson ρ = −0.30 (p-value = 7.45E-02).
Table S5. Fine-tuned hyperparameters of Cox-nnet (L2 penalty weight λ) across 12 cancer types and 5 experiments (folds).
Table S6. Fine-tuned hyperparameters of DeepSurv across 12 cancer types and 5 experiments (folds).
Table S7. Fine-tuned hyperparameters of AECOX across 12 cancer types and 5 experiments (folds). Note that we fixed λ_{2} = 0 to only impose L2 sparsity.
Table S8. Fine-tuned hyperparameters of Random Survival Forest (RSF) (number of trees) across 12 cancer types and 5 experiments (folds).
Table S9. Fine-tuned hyperparameters of SVM (α, the weight penalizing the squared hinge loss in the objective function) across 12 cancer types and 5 experiments (folds).
Table S10. Model-wise performance comparison at the pan-cancer level (12 TCGA (The Cancer Genome Atlas) cancer types) by pairwise paired t-test, according to two metrics: concordance index and p-value of the log-rank test. Note that for the concordance index, a larger t-statistic/coefficient indicates better performance at the pan-cancer level, whereas the opposite holds for the p-value of the log-rank test.
Table S11. Model-wise performance comparison at the pan-cancer level (12 TCGA (The Cancer Genome Atlas) cancer types) by linear mixed-effects models test, according to two metrics: concordance index and p-value of the log-rank test. Note that for the concordance index, a larger t-statistic/coefficient indicates better performance at the pan-cancer level, whereas the opposite holds for the p-value of the log-rank test.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Huang, Z., Johnson, T.S., Han, Z. et al. Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations. BMC Med Genomics 13, 41 (2020). https://doi.org/10.1186/s12920-020-0686-1
DOI: https://doi.org/10.1186/s12920-020-0686-1
Keywords
 Deep learning
 Cancer prognosis
 Survival analysis
 Tumor mutation burden
 Cox regression