Fig. 4 | BMC Medical Genomics

Fig. 4

From: Quality control recommendations for RNASeq using FFPE samples based on pre-sequencing lab metrics and post-sequencing bioinformatics metrics

A decision tree model to predict QC pass/fail status from pre-sequencing lab metrics. QC pass and fail refer to the sample status defined by the bioinformatics metrics; QC-failed samples were those excluded from the final dataset. a Parameter tuning by repeated cross-validation, using a grid search over 10 candidate values of the complexity parameter. The complexity parameter with the highest cross-validation accuracy was used to build the final model. b Decision tree diagram, with branches indicating the specific cutoffs on pre-sequencing metrics that were predictive of QC pass/fail status. Samples with RNA Qubit higher than 25 ng/µl and pre-capture library Qubit higher than 1.7 ng/µl showed the best RNA-seq data quality. Each box (node) contains three values. The upper value (PASS/FAIL) is the QC status predicted from the pre-sequencing lab metrics at that branch of the decision tree. The middle number is the proportion of QC-pass samples, as defined by the bioinformatics metrics. The bottom number is the percentage of all samples that fall within that box. The lower panel is a heatmap of the three metrics used to define QC status: number of gene-mapped reads, number of detected genes with TPM higher than 4, and sample-wise median correlation. The annotation bar above the heatmap indicates the three leaf nodes predicted by the decision tree. c Relative contribution (influence) of each pre-sequencing lab metric in building the final model.
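
As a rough illustration of how the two cutoffs quoted for panel b could be applied to new samples, the sketch below flags samples whose values exceed both thresholds. It is not the authors' fitted model: the caption does not state the order of the splits or what the other two leaf nodes predict, so the code only identifies the group described as having the best data quality. The `LabMetrics` container, the function name, and the example sample values are hypothetical.

```python
from dataclasses import dataclass

# Cutoffs quoted in the caption for panel b (both in ng/ul).
RNA_QUBIT_CUTOFF = 25.0
LIBRARY_QUBIT_CUTOFF = 1.7


@dataclass(frozen=True)
class LabMetrics:
    """Pre-sequencing measurements for one sample (hypothetical container)."""
    rna_qubit: float         # RNA Qubit reading, ng/ul
    precapture_qubit: float  # pre-capture library Qubit reading, ng/ul


def in_best_quality_group(sample: LabMetrics) -> bool:
    """Return True when both readings are strictly above the quoted cutoffs.

    "Higher than" in the caption is read as a strict inequality; how the
    fitted tree treats values exactly at a cutoff is an assumption.
    """
    return (sample.rna_qubit > RNA_QUBIT_CUTOFF
            and sample.precapture_qubit > LIBRARY_QUBIT_CUTOFF)


# Made-up readings, for illustration only.
samples = {
    "S1": LabMetrics(rna_qubit=40.0, precapture_qubit=3.2),
    "S2": LabMetrics(rna_qubit=40.0, precapture_qubit=1.2),
    "S3": LabMetrics(rna_qubit=12.5, precapture_qubit=2.4),
}
flags = {name: in_best_quality_group(m) for name, m in samples.items()}
print(flags)
```

Samples that are not flagged here should not be read as predicted failures; assigning them to one of the remaining leaf nodes would require the tree diagram itself.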
