Table 1. Genome in a Bottle (GIAB) datasets used for NeoMutate benchmarking. The data were downloaded in BAM format and converted back to FASTQ so that every step of the workflow could be tested. (WES: whole-exome sequencing; PE: paired-end)

From: NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer

| Sample ID | Lab | Project | Library type | Read length (bp) | Insert size (bp) | File format | Reads before trimming | Reads after trimming | Kept reads (%) |
|---|---|---|---|---|---|---|---|---|---|
| NA12878 | Broad Institute | CEU Trio Analysis (daughter) | WES, PE | 76 | 155 | BAM | 118,969,048 | 89,151,231 | 74.94 |
| NA12891 | Broad Institute | CEU Trio Analysis (father) | WES, PE | 76 | 155 | BAM | 116,639,621 | 88,079,244 | 75.51 |
| NA24631 | Oslo University Hospital | Asian (Han Chinese) Trio (son) | WES, PE | 125 | 202 | BAM | 61,001,625 | 60,852,682 | 99.76 |
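
The percentage of kept reads in the last column appears to be the number of reads after trimming divided by the number before trimming, times 100. The short Python sketch below recomputes it from the counts in the table as a consistency check. The helper name `percent_kept` and the `READ_COUNTS` dictionary are illustrative assumptions and are not part of NeoMutate.

```python
# Illustrative check of the "% kept reads" column; not part of the NeoMutate workflow.

def percent_kept(before: int, after: int) -> float:
    """Return the percentage of reads that survived trimming, rounded to 2 decimals."""
    if before <= 0:
        raise ValueError("read count before trimming must be positive")
    if after < 0 or after > before:
        raise ValueError("read count after trimming must lie between 0 and 'before'")
    return round(100.0 * after / before, 2)


# Read counts copied from Table 1: sample ID -> (before trimming, after trimming).
READ_COUNTS = {
    "NA12878": (118_969_048, 89_151_231),
    "NA12891": (116_639_621, 88_079_244),
    "NA24631": (61_001_625, 60_852_682),
}

if __name__ == "__main__":
    for sample, (before, after) in READ_COUNTS.items():
        print(f"{sample}: {percent_kept(before, after):.2f}% of reads kept")
```

Run as a script, this gives 74.94, 75.51 and 99.76 for the three samples, matching the values reported in the table.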