
Table 5 Highest AUC obtained by the genotype and the haplotype-based approaches

From: A comparison of genomic profiles of complex diseases under different models

| Disease | Genotype-based (holdout), p-value filtering: AUC | Learning machine | p-value threshold | Genotype-based (holdout), top 100 SNPs: AUC | Learning machine | Haplotype-based (holdout): AUC | Learning machine | Haplotype length | Threshold p-value |
|---|---|---|---|---|---|---|---|---|---|
| BD | 0.6222 | LR wGRS | 15e-2 | 0.553 | RR | 0.6873 | AdaBoostM1-add. | 3 | 1e-4 |
| CAD | 0.611 | 20RF | 1e-5 | 0.582 | sSVM | 0.5761 | 20RF-rec. | 3 | 1e-7 |
| HT | 0.5776 | AdaBoostM1 | 15e-2 | 0.559 | lasso | 0.5573 | NBC-all | 5 | 1e-5 |
| IBD | 0.6136 | AdaBoostM1 | 1e-5 | 0.587 | RR | 0.6213 | AdaBoostM1-rec. | 2 | 1e-5 |
| RA | 0.8152 | AdaBoostM1 | 1e-5 | 0.736 | GLM AIC | 0.8024 | AdaBoostM1-add. | 2 | 1e-5 |
| T1D | 0.8615 | AdaBoostM1 | 1e-5 | 0.793 | GLM AIC | 0.8682 | AdaBoostM1-add. | 3 | 1e-6 |
| T2D | 0.6134 | AdaBoostM1 | 1e-3 | 0.576 | LR wGRS | 0.6372 | AdaBoostM1-add. | 2 | 1e-4 |
  1. The highest AUC obtained by the genotype-based approach (column 2) and by the haplotype-based approach (column 7), using the same multisampling method for both: holdout. The learning machines listed for the haplotype-based approach include the genetic model used: additive (add.), recessive (rec.), dominant (dom.), or all, when each model returns the same result. Column 5 shows the highest AUC for the genotype-based approach when the number of input variables is reduced to the top 100 SNPs, which makes it feasible to use the time-consuming generalized linear models.
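
As a reading aid for Table 5, the following minimal sketch computes, for each disease, the haplotype-based AUC (column 7) minus the genotype-based AUC under p-value filtering (column 2). The values are transcribed from the table; the names `AUC`, `haplotype_gain`, and `haplotype_better` are illustrative and do not come from the article. The differences are plain subtractions and say nothing about statistical significance, which the table does not report.

```python
# AUC values transcribed from Table 5:
# disease -> (genotype-based AUC, column 2; haplotype-based AUC, column 7).
AUC = {
    "BD": (0.6222, 0.6873),
    "CAD": (0.611, 0.5761),
    "HT": (0.5776, 0.5573),
    "IBD": (0.6136, 0.6213),
    "RA": (0.8152, 0.8024),
    "T1D": (0.8615, 0.8682),
    "T2D": (0.6134, 0.6372),
}


def haplotype_gain(auc_table):
    """Return {disease: haplotype AUC minus genotype AUC}, rounded to 4 decimals."""
    return {
        disease: round(haplotype - genotype, 4)
        for disease, (genotype, haplotype) in auc_table.items()
    }


gains = haplotype_gain(AUC)

# Diseases for which the haplotype-based approach reaches the higher AUC.
haplotype_better = sorted(d for d, gain in gains.items() if gain > 0)

if __name__ == "__main__":
    for disease, gain in gains.items():
        print(f"{disease}: {gain:+.4f}")
    print("Haplotype-based AUC is higher for:", ", ".join(haplotype_better))
```

On these values, the haplotype-based approach gives the higher AUC for BD, IBD, T1D, and T2D, and the genotype-based approach for CAD, HT, and RA.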