Table 1. List of GIAB datasets used for NeoMutate benchmarking. The data were downloaded in BAM format and converted back to FASTQ so that every step of the workflow could be tested. (WES: whole-exome sequencing; PE: paired-end)

From: NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer

| Sample ID | Lab | Project | Library type | Read length | Insert size | File format | Number of reads before trimming | Number of reads after trimming | Reads kept (%) |
|---|---|---|---|---|---|---|---|---|---|
| NA12878 | Broad Institute | CEU Trio Analysis (daughter) | WES, PE | 76 bp | 155 bp | BAM | 118,969,048 | 89,151,231 | 74.94 |
| NA12891 | Broad Institute | CEU Trio Analysis (father) | WES, PE | 76 bp | 155 bp | BAM | 116,639,621 | 88,079,244 | 75.51 |
| NA24631 | Oslo University Hospital | Asian (Han Chinese) Trio (son) | WES, PE | 125 bp | 202 bp | BAM | 61,001,625 | 60,852,682 | 99.76 |
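
The percentage of kept reads in Table 1 appears to be the number of reads after trimming divided by the number before trimming, times 100. The following is a minimal sketch that recomputes that column from the read counts listed above. The names `read_counts` and `percent_kept` are illustrative and do not come from the NeoMutate workflow.

```python
# Read counts copied from Table 1: (reads before trimming, reads after trimming).
read_counts = {
    "NA12878": (118_969_048, 89_151_231),
    "NA12891": (116_639_621, 88_079_244),
    "NA24631": (61_001_625, 60_852_682),
}


def percent_kept(before: int, after: int) -> float:
    """Return the percentage of reads retained after trimming, rounded to 2 decimals."""
    return round(100 * after / before, 2)


# Hypothetical helper output: one percentage per sample ID.
kept = {sample: percent_kept(before, after)
        for sample, (before, after) in read_counts.items()}

for sample, pct in kept.items():
    print(f"{sample}: {pct:.2f}% of reads kept")
```

Under this assumption the recomputed values agree with the table to two decimal places: 74.94, 75.51, and 99.76.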