Table 5 Highest AUC obtained by the genotype and the haplotype-based approaches

From: A comparison of genomic profiles of complex diseases under different models

 

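Columns 2-6: genotype-based, holdout approach (columns 2-4 with p-value filtering, columns 5-6 with the top 100 SNPs). Columns 7-10: haplotype-based, holdout approach.

| Disease | Genotype, p-value filtering: AUC | Genotype, p-value filtering: learning machine | Genotype, p-value filtering: p-value threshold | Genotype, top 100 SNPs: AUC | Genotype, top 100 SNPs: learning machine | Haplotype: AUC | Haplotype: learning machine | Haplotype: length | Haplotype: threshold p-value |
|---|---|---|---|---|---|---|---|---|---|
| BD | 0.6222 | LR wGRS | 15e-2 | 0.553 | RR | 0.6873 | AdaBoostM1-add. | 3 | 1e-4 |
| CAD | 0.611 | 20RF | 1e-5 | 0.582 | sSVM | 0.5761 | 20RF-rec. | 3 | 1e-7 |
| HT | 0.5776 | AdaBoostM1 | 15e-2 | 0.559 | lasso | 0.5573 | NBC-all | 5 | 1e-5 |
| IBD | 0.6136 | AdaBoostM1 | 1e-5 | 0.587 | RR | 0.6213 | AdaBoostM1-rec. | 2 | 1e-5 |
| RA | 0.8152 | AdaBoostM1 | 1e-5 | 0.736 | GLM AIC | 0.8024 | AdaBoostM1-add. | 2 | 1e-5 |
| T1D | 0.8615 | AdaBoostM1 | 1e-5 | 0.793 | GLM AIC | 0.8682 | AdaBoostM1-add. | 3 | 1e-6 |
| T2D | 0.6134 | AdaBoostM1 | 1e-3 | 0.576 | LR wGRS | 0.6372 | AdaBoostM1-add. | 2 | 1e-4 |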

  1. The table reports the highest AUC obtained by the genotype-based approach (column 2) and by the haplotype-based approach (column 7), using the same multisampling method, holdout, for both. For the haplotype-based approach, the learning machine name includes the genetic model used: additive (add.), recessive (rec.), or dominant (dom.); "all." indicates that every model returns the same result. Column 5 shows the highest AUC for the genotype-based approach when the input variables are reduced to the top 100 SNPs so that the time-consuming generalized linear models can be used.
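As a reading aid for the AUC columns, the sketch below shows one way a holdout AUC can be computed. It is a minimal illustration and not the authors' pipeline: it assumes binary case/control labels (1 and 0) and one numeric risk score per sample. The helper names `holdout_split` and `auc`, the 30% test fraction, and the seed are hypothetical choices made for this example.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices once and split them into (train, test) lists.

    test_fraction and seed are illustrative assumptions, not values from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

In use, a model would be fitted on the `train` indices only and `auc` would be evaluated on the scores of the held-out `test` samples. An AUC of 0.5 corresponds to random ranking of cases and controls, and 1.0 to perfect separation.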