From: NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer
Feature name | Source | Group | Description |
---|---|---|---|
indel_or_snp | BAM | 3 | Is the given variant a SNP, insertion or deletion? |
ts_or_tv | BAM | 3 | Transition or transversion |
depth_TUM | BAM | 1 | Coverage in tumor sample for the given variant position |
alt_counts_TUM | BAM | 1 | Alternative read counts (number of reads supporting the variant) |
alt_avg_MQ_TUM | BAM | 2 | Average mapping quality of reads containing the variant. Mapping quality quantifies the probability that a read is misplaced. |
alt_avg_BQ_TUM | BAM | 2 | Average base quality of the reads containing the variant. Base quality reflects the accuracy of a base call made by the sequencing machine. |
alt_plus_TUM | BAM | 1 | Number of reads on the plus/forward strand supporting the variant |
alt_minus_TUM | BAM | 1 | Number of reads on the minus/reverse strand supporting the variant |
ref_plus_TUM | BAM | 1 | Number of reads on the plus/forward strand supporting the reference allele |
ref_minus_TUM | BAM | 1 | Number of reads on the minus/reverse strand supporting the reference allele |
VAF | BAM | 1 | Variant allele frequency |
depth_WT | BAM | 1 | Coverage in normal sample for the given variant position |
alt_counts_WT | BAM | 1 | Number of reads supporting the variant in normal sample (germline risk) |
ref_counts_WT | BAM | 1 | Number of reads supporting the reference in normal sample |
num_of_indels_closeby | BAM | 3 | Are there indels close by? (false positive risk factor) |
GC_content | BAM | 3 | Number of GC bases relative to the total number of bases located within ±20 bp of the given variant position |
shannon_entropy | BAM | 3 | A mathematical measure of the degree of randomness in a set of data. The smaller the entropy value, the less complex the sequence is. |
detection_status | VCF | 4 | Classification status (“somatic” or “non somatic”) for the given variant caller |
“Tool”_F | VCF | 4 | Quality tag in FILTER column (“PASS” or “non PASS”) |
“Tool”_alt_counts | VCF | 1, 4 | Number of reads supporting the variant reported by the specific tool |
“Tool”_ref_counts | VCF | 1, 4 | Number of reads supporting the reference reported by the specific tool |
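
Several of the features above are simple functions of read counts or of the reference sequence around the variant. The following is a minimal Python sketch of how `ts_or_tv`, `VAF`, `GC_content` and `shannon_entropy` could be computed. It is not the NeoMutate implementation: the function names, the definition of VAF as alternative read counts divided by depth, the base-2 logarithm, and the use of the same ±20 bp window for the entropy are all assumptions made for illustration.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_or_tv(ref: str, alt: str) -> str:
    """Classify a single-base substitution as transition ('ts') or transversion ('tv')."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != 1 or len(alt) != 1 or ref == alt:
        raise ValueError("expected a single-base substitution")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "ts" if same_class else "tv"


def vaf(alt_counts: int, depth: int) -> float:
    """Variant allele frequency, assumed here to be alt_counts / depth."""
    return alt_counts / depth if depth > 0 else 0.0


def gc_content(window: str) -> float:
    """Fraction of G and C bases in a sequence window (e.g. +/- 20 bp around the variant)."""
    window = window.upper()
    if not window:
        return 0.0
    return sum(base in "GC" for base in window) / len(window)


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits, an assumed unit) of the base composition of a window."""
    window = window.upper()
    if not window:
        return 0.0
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under these assumptions, a window containing the four bases in equal proportion has the maximum entropy of 2 bits, while a homopolymer run has an entropy of zero, which matches the table's remark that smaller entropy values indicate less complex sequence.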