From: Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations
Properties | Cox-nnet | DeepSurv | AECOX |
---|---|---|---|
Deep Learning Architecture | Single-layer neural networks | Multi-layer neural networks | Multi-layer Autoencoder neural networks |
Deep Learning Programming Framework | Theano | Theano, Lasagne | PyTorch |
Hyper-parameters | L2 regularization weight λ. | Learning rate; Number of hidden layers; Hidden layer sizes; Learning rate decay; Momentum; L2 regularization weight λ; Dropout rate. | Learning rate; Autoencoder input-output error weight λ1; L1 regularization weight λ2; L2 regularization weight λ3; Dropout rate; Number of hidden layers; Regularization method. |
Hyper-parameter Search Method | Line search | Sobol solver | Sobol solver |
Number of Hyper-parameter Search Iterations | 12 | 100 | 100 |
Maximum epochs | 4000 | 500 | 300 |
Number of Hidden Layers | 1 | 1, 2, 3, or 4 | 0, 2, 4, 6, or 8 |
Last Hidden Layer Size | Integer value in range [131, 135] | Integer value in range [30, 50] | 16 |
Regularization Methods | L1, L2, Dropout | L2, Dropout | Dropout, L1, L2, Elastic Net |
Basic Objective (Loss) Function | \( \hat{\Theta}={\mathrm{argmin}}_{\Theta}\left\{-{\sum}_{i:{C}_i=1}\left(\sum \limits_{k=1}^K{\beta}_k{X}_{ik}-\log \left({\sum}_{j:{Y}_j\ge {Y}_i}{\theta}_j\right)\right)\right\} \), where \( {\theta}_j=\exp \left(\sum \limits_{k=1}^K{\beta}_k{X}_{jk}\right) \) is the hazard ratio of sample \( j \) (shared by all three models) | | |
Optimization Methods | Nesterov accelerated gradient descent | Stochastic gradient descent (SGD) | Adaptive Moment Estimation (Adam) |
Network Architectures | (Input Layer) – (Hidden Layer) (tanh) – (Hazard Ratio) | (Input Layer) – (Hidden Layer) (ReLU/SELU) – … – (Hidden Layer) (ReLU/SELU) – (Hazard Ratio) | (Input Layer) – (Hidden Layers) (ReLU/Dropout) – (Code) – (Hidden Layers) (ReLU/Dropout) – (Output Layer); (Code) (tanh) – (Hazard Ratio) |
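The shared objective in the table is the negative Cox log partial likelihood: for each uncensored sample \( i \) (\( C_i = 1 \)), its log hazard ratio is compared against the log-sum of hazard ratios over its risk set \( \{j : Y_j \ge Y_i\} \). A minimal pure-Python sketch of this loss (function name and plain-list interface are illustrative, not from the paper):

```python
import math

def cox_neg_log_partial_likelihood(scores, times, events):
    """Negative Cox log partial likelihood.

    scores: linear predictors (log hazard ratios), one per sample
    times:  observed survival or censoring times Y_i
    events: censoring indicators C_i (1 = event observed, 0 = censored)
    """
    nll = 0.0
    for i, (t_i, c_i) in enumerate(zip(times, events)):
        if c_i != 1:
            continue  # censored samples enter only through risk sets
        # Risk set: every sample still at risk at time Y_i (Y_j >= Y_i).
        risk = sum(math.exp(scores[j])
                   for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= scores[i] - math.log(risk)
    return nll
```

With two samples of equal score where both events are observed, the earlier event has a risk set of size two and the later one of size one, so the loss is \( \log 2 \).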
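The AECOX row describes a two-headed network: an autoencoder reconstructs the input expression vector, while a linear Cox head on the tanh-activated code produces the log hazard ratio. A toy forward-pass sketch under those assumptions (class and helper names are hypothetical; real implementations would use PyTorch with dropout and trained weights):

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    """Affine map: W is an out-by-in weight matrix, b a bias vector."""
    return [sum(w_row[k] * x[k] for k in range(len(x))) + b_i
            for w_row, b_i in zip(W, b)]

class AECoxSketch:
    """Toy AECOX-style network: encoder -> code; decoder reconstructs the
    input; a linear head on tanh(code) yields the log hazard ratio."""
    def __init__(self, n_in, n_code, seed=0):
        rng = random.Random(seed)
        init = lambda rows, cols: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                                   for _ in range(rows)]
        self.W_enc, self.b_enc = init(n_code, n_in), [0.0] * n_code
        self.W_dec, self.b_dec = init(n_in, n_code), [0.0] * n_in
        self.w_cox = [rng.uniform(-0.1, 0.1) for _ in range(n_code)]

    def forward(self, x):
        code = relu(linear(self.W_enc, self.b_enc, x))
        recon = linear(self.W_dec, self.b_dec, code)  # autoencoder output
        theta = sum(w * math.tanh(c) for w, c in zip(self.w_cox, code))
        return recon, theta  # reconstruction and log hazard ratio
```

Training would combine the reconstruction error (weighted by λ1 in the table) with the Cox loss, which is why the architecture carries both an output layer and a hazard-ratio head.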