Fig. 3 | BMC Medical Genomics


From: Predicting gene knockout effects from expression data

Fig. 3

Machine learning models capture various expression-essentiality relationships. A Histogram of the number of genes with the specified Pearson correlation (X-axis). Blue bars represent a linear model, red bars XGBoost, green bars deep learning, and purple bars a Gaussian Process. B Comparison of the Pearson correlations of model predictions in the work presented here (Y-axis) vs. those obtained with the model of Itzhacky et al. [11] (X-axis) on a curated list of disease-associated genes [20]. C–F Comparison of measured RPP25L essentiality score (X-axis) vs. predicted score (Y-axis). In all cases, data were split into a train/validation set (blue) and a held-out test set (red). Shown are: C Linear model based on the expression of RPP25 (test-set prediction r = 0.76, p < 2.5e−38). D Multiple linear regression model based on the expression of RPP25 and 5 additional covariate genes (r = 0.78, p < 2.8e−41). E Linear model using 32 covariate genes (r = 0.76, p < 1.7e−38). F Same 32 covariate genes, using a deep learning regression model combined with gradient boosting regression trees (test correlation r = 0.81, p < 8.5e−48).
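The evaluation described for panel C (fit a linear model on one gene's expression, then report the Pearson correlation between measured and predicted essentiality on a held-out test set) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, and the function names (`fit_line`, `pearson_r`, `evaluate_single_gene_model`), the 20% test fraction, and the random seeds are all assumptions made for illustration.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def evaluate_single_gene_model(expression, essentiality, test_fraction=0.2, seed=0):
    """Fit on the training split only; score by Pearson r on the held-out split."""
    rng = random.Random(seed)
    indices = list(range(len(expression)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    slope, intercept = fit_line(
        [expression[i] for i in train_idx],
        [essentiality[i] for i in train_idx],
    )
    predicted = [slope * expression[i] + intercept for i in test_idx]
    measured = [essentiality[i] for i in test_idx]
    return slope, intercept, pearson_r(measured, predicted)


# Synthetic stand-in data (hypothetical; NOT the RPP25 / RPP25L measurements).
data_rng = random.Random(42)
expression = [data_rng.gauss(0, 1) for _ in range(500)]
essentiality = [0.8 * e + data_rng.gauss(0, 0.5) for e in expression]

slope, intercept, r_test = evaluate_single_gene_model(expression, essentiality)
print(f"slope={slope:.2f} intercept={intercept:.2f} test r={r_test:.2f}")
```

Scoring only on samples that were never used for fitting mirrors the train/validation vs. held-out test distinction shown in blue and red in panels C–F; the multi-gene models in panels D–F would replace `fit_line` with a multivariate regressor.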
