Improving lung cancer risk stratification leveraging whole transcriptome RNA sequencing and machine learning across multiple cohorts

Table 3 Lung cancer genomic sequencing classifier validation performance (down and up classification). ^*The cancer prevalence calculation includes local benign subjects: Prevalence = \( \frac{\# Malignant}{\# Malignant+\# Benign+\# Local\ Benign} \). Local benign subjects were labeled benign locally but did not have an adjudicated label. NPV, PPV and % Reclassified are all functions of prevalence (estimated including local benign subjects), sensitivity and specificity (both estimated excluding local benign subjects).

AUC	Pre-test Cancer Risk	^*Cancer prevalence	Cancer risk re-stratification	Specificity	Sensitivity	Post-test NPV/PPV	^§% Re-stratified
73.4% [68.3–78.4]	Low	5%	Low to Very Low	57.4% [44.8–69.3]	100% [39.8–100]	100% NPV [91.0–100]	54.5%
	Intermediate	28.2%	Intermediate to Low	37.3% [27.9–47.4]	90.6% [79.3–96.9]	91.0% NPV [80.8–96.0]	29.4%
	Intermediate	28.2%	Intermediate to High	94.1% [87.6–97.8]	28.3% [16.8–42.3]	65.4% PPV [43.8–82.1]	12.2%
	High	73.6%	High to Very High	91.2% [76.3–98.1]	34.0% [25.0–43.8]	91.5% PPV [77.9–97.0]	27.3%

^§% Reclassified (Low to Very Low, Intermediate to Low) = \( (1-Prevalence) \times Specificity + Prevalence \times (1-Sensitivity) \)
^§% Reclassified (Intermediate to High, High to Very High) = \( Prevalence \times Sensitivity + (1-Prevalence) \times (1-Specificity) \)
^*There are 8, 33 and 4 local benign subjects in the low-, intermediate- and high-risk groups, respectively.
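
The footnote formulas can be checked against the table rows with a short calculation. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, NPV and PPV are computed with the standard Bayes' rule expressions (which the table does not spell out), and the inputs are the rounded point estimates from the table, so recomputed values may differ from the published ones in the last reported digit.

```python
# Illustrative sketch: recompute NPV/PPV and % reclassified from the
# rounded prevalence, sensitivity and specificity reported in Table 3.
# All function names are hypothetical, not taken from the paper.

def npv(prev, sens, spec):
    """Negative predictive value via Bayes' rule (standard definition)."""
    true_neg = (1 - prev) * spec
    false_neg = prev * (1 - sens)
    return true_neg / (true_neg + false_neg)

def ppv(prev, sens, spec):
    """Positive predictive value via Bayes' rule (standard definition)."""
    true_pos = prev * sens
    false_pos = (1 - prev) * (1 - spec)
    return true_pos / (true_pos + false_pos)

def pct_reclassified_down(prev, sens, spec):
    """Footnote formula for Low to Very Low and Intermediate to Low."""
    return (1 - prev) * spec + prev * (1 - sens)

def pct_reclassified_up(prev, sens, spec):
    """Footnote formula for Intermediate to High and High to Very High."""
    return prev * sens + (1 - prev) * (1 - spec)

# (label, direction, prevalence, sensitivity, specificity) from the table
ROWS = [
    ("Low to Very Low", "down", 0.050, 1.000, 0.574),
    ("Intermediate to Low", "down", 0.282, 0.906, 0.373),
    ("Intermediate to High", "up", 0.282, 0.283, 0.941),
    ("High to Very High", "up", 0.736, 0.340, 0.912),
]

for label, direction, prev, sens, spec in ROWS:
    if direction == "down":
        value, name = npv(prev, sens, spec), "NPV"
        moved = pct_reclassified_down(prev, sens, spec)
    else:
        value, name = ppv(prev, sens, spec), "PPV"
        moved = pct_reclassified_up(prev, sens, spec)
    print(f"{label}: {name} = {value:.1%}, reclassified = {moved:.1%}")
```

Note that the fraction reclassified is simply the denominator of the corresponding predictive value: the share of subjects with a negative result in the down-classification rows, and with a positive result in the up-classification rows.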

ISSN: 1755-8794