HLA and B2M mutations in TCGA
B2M mutation calls were obtained directly from the MAF files provided by TCGA. Because the HLA locus is highly polymorphic, mutation calls against the reference genome are unreliable. Instead, we ran Polysolver [6] to simultaneously call patient-specific HLA types and detect somatic mutations affecting a patient’s HLA alleles. Of the 10,428 TCGA patients with the necessary whole exome sequencing data, only 579 had an HLA mutation and 125 had a B2M mutation. Most of these mutations were nonsynonymous (Fig. 2a). To determine whether nonsynonymous mutations occurred at amino acid residues with the potential to interfere with formation of the MHC-I molecule, experimental 3D structures for the B2M-HLA complex were obtained from the Protein Data Bank [11] and used to annotate each residue as located in the protein core, on the surface, or at the physical interface between the B2M and HLA proteins (Methods).
Mutations on HLA proteins, particularly HLA-A, showed a biased distribution with several recurrent hotspots (Fig. 2b). Mutations were most concentrated in the α3 domain that mediates interaction with the T cell receptor (TCR) (206 mutations, 40.63% of total; OR = 2.04, p < 2.58e-09), and included multiple recurrent hotspots. Fifty-one mutations (10.06%) were observed in the transmembrane domain including additional hotspots. Although mutations were observed throughout the α1 and α2 domains that form the peptide binding groove, they tended to be less recurrent (88 mutations, 17.36% for both α1 and α2). This may reflect the much larger heterogeneity of this region across HLA alleles.
Recurrent hotspot mutations often targeted interface and core regions on HLA-A, while they targeted core and surface regions on HLA-B, and surface regions on HLA-C (Fig. 2b). Since there are many alleles for each HLA protein, we used the consensus of residue annotations across different alleles to annotate each HLA protein (Additional file 1: Figure S1). Even though the annotations for most frequently mutated residues were in agreement between different HLA alleles, there were some exceptions, including residue 231 on HLA-A. Although residue 231 (R231) on HLA-A was annotated as surface based on the consensus across HLA-A alleles, the residue is located very close to the interface region (Fig. 2b) and in fact was predicted as an interface residue on 2 of the 6 HLA-A allele structures analyzed. Additionally, although residue 209 (R209) on HLA-A and HLA-B proteins was annotated as ambiguous due to its intermediate value of relative solvent accessible surface area (RSA) for most HLA-A/B structures analyzed, the average RSA across structures is close to the threshold for core annotation (7.17), and R209 was indeed annotated as core in some of them. Overall, the distribution of HLA mutations for the three proteins was consistent with the previous report by Shukla et al. [6], though the current analysis incorporates an overall larger number of samples. Mutations in B2M were largely loss of function (Fig. 2a) and more broadly distributed (Fig. 2c), as expected for a tumor suppressor gene, though several positions were also recurrently mutated.
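The residue annotation and the consensus across allele structures described above can be sketched as follows. This is a minimal illustration and not the procedure from Methods: the RSA cutoffs, the majority-vote rule, and the per-structure values are all hypothetical and were chosen only to mirror a residue that is called interface in 2 of 6 structures.

```python
from collections import Counter

# Hypothetical cutoffs (percent RSA); the values actually used are given in Methods.
CORE_MAX_RSA = 5.0
SURFACE_MIN_RSA = 15.0

def annotate_residue(rsa, at_interface):
    """Label one residue in one structure."""
    if at_interface:
        return "interface"
    if rsa <= CORE_MAX_RSA:
        return "core"
    if rsa >= SURFACE_MIN_RSA:
        return "surface"
    return "ambiguous"  # intermediate RSA, neither clearly core nor surface

def consensus_annotation(labels):
    """Most common label across structures; a tie for first place is 'ambiguous'."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "ambiguous"
    return counts[0][0]

# Made-up (RSA, interface flag) pairs for one residue in six allele structures
per_structure = [(22.0, False), (18.5, False), (30.1, True),
                 (25.0, False), (19.9, True), (21.3, False)]
labels = [annotate_residue(rsa, iface) for rsa, iface in per_structure]
print(labels, "->", consensus_annotation(labels))
```

With a majority vote, a residue that is at the interface in only a minority of structures is still labeled surface, which is why exceptions such as R231 are worth inspecting per structure.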
Expected effects of B2M versus HLA mutations on MHC-I composition
Since B2M is an essential component of all MHC-I molecules, loss of B2M should equally impact MHC-I molecules derived from different HLA alleles. The B2M interface with HLA alleles is shared across the different alleles (Additional file 2: Figure S2), so mutations at this interface are also likely to affect all variants of an individual’s MHC-I molecule, although complexes involving B2M and binding partners that use alternative interfaces should not be affected. In contrast, loss of function or interface mutations affecting a specific HLA allele would only affect the MHC-I molecules derived from that allele. Thus, we speculate that B2M mutations are likely to reduce the total amount of MHC-I molecules presenting antigens on the tumor cell surface, while HLA mutations would impact which mutations could be presented as neoantigens.
Mutations in MHC-I proteins are associated with increased mutation burden
We hypothesized that both B2M and HLA mutations would affect MHC-I presentation of mutations. Mice that completely lack B2M express little if any cell surface MHC-I and lack cytotoxic CD8+ T cells [12, 13]. In human lung cancers, an association was found between higher somatic mutation burden and HLA loss of heterozygosity [9]. If somatic mutations to HLA and B2M similarly impair antigen presentation, we would expect to see an increased mutation burden in mutated patients compared with unmutated patients.
We first analyzed 9055 TCGA patients across 31 solid tumor types that had both exome and RNA sequencing data (Fig. 3a), removing patients that had synonymous B2M or HLA mutations. We then performed a cancer-specific analysis of 3514 patients across 8 solid tumor types with at least 5 somatic B2M and HLA mutations (Fig. 3b, Additional file 3: Figure S3A). To determine whether somatic mutations to B2M and HLA were associated with an overall higher mutation burden, we compared the total number of expressed nonsynonymous mutations in patients with and without nonsynonymous somatic B2M or HLA mutations. Overall, we observed that patients with a B2M mutation and patients with an HLA mutation both had significantly higher tumor mutation burdens (Mann-Whitney test, B2M p < 1.1e-20 and HLA p < 1.1e-30) than patients without (Fig. 3a). Pan-cancer, B2M mutated patients also had significantly higher mutation burdens than HLA mutated patients (Mann-Whitney test, p < 0.0028). There were approximately equal numbers of early stage (I & II) and late stage (III & IV) tumors in these three groups (Additional file 4: Figure S4). We repeated the pan-cancer mutational burden analysis with Cancer Cell Line Encyclopedia (CCLE) data for 25 B2M-mutated cell lines, 114 HLA-mutated cell lines, and 1381 non-mutated cell lines, and observed the same trend: cell lines with B2M and HLA mutations had significantly higher overall mutational burden than cell lines without (Additional file 5: Figure S5). When we analyzed tumors by tissue type, we observed that certain cancers (stomach adenocarcinoma, endometrial cancer, colorectal cancer, lung adenocarcinoma, and cervical cancer) also had significantly higher mutational burden in mutated patients (Fig. 3b). Stomach, uterine and colorectal cancers have documented high rates of microsatellite instability (MSI); we therefore evaluated whether B2M and HLA mutations were biased to occur in high MSI tumors. Using MSI annotations available for 10,415 patients from Kautto et al. [14], we found a significant bias for B2M and HLA mutations to occur in patients with MSI (Fisher’s exact test; B2M OR = 14.66, p < 8.7e-24; HLA OR = 6.28, p < 2.0e-36). To rule out the possibility that MSI was solely driving our results, we reanalyzed the mutational burden between B2M and HLA mutated and unmutated patients, this time retaining only 8668 microsatellite stable (MSS) patients. Interestingly, we found similar trends in elevated mutational burden associated with B2M and HLA mutation (Fig. 3c, d, Additional file 3: Figure S3B), and consequently focused on MSS patients only in the subsequent analyses. Thus, even in MSS tumors, B2M and HLA mutations are associated with an increased nonsynonymous mutational burden.
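The two kinds of comparison used in this section, an odds ratio with Fisher's exact test for the MSI bias and a rank-based comparison of mutation burden, can be illustrated with a small self-contained sketch. The function names and all counts below are made up for illustration; the code is a hand-written stand-in for the statistics library an analysis like this would normally rely on, and the Mann-Whitney part computes only the U statistic, not a p-value.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]] (requires b*c > 0)."""
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum the probabilities of all tables
    with the same margins that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):  # hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs where x is larger, ties counted as half."""
    return sum(1.0 if xi > yi else 0.5 if xi == yi else 0.0 for xi in x for yi in y)

# Made-up counts: rows = mutated / unmutated patients, columns = MSI / MSS
table = (3, 1, 1, 3)
print(odds_ratio(*table), fisher_exact_two_sided(*table))

# Made-up expressed nonsynonymous mutation counts, MSS patients only
mutated_burden = [5, 7, 9]
unmutated_burden = [1, 2, 6]
u = mann_whitney_u(mutated_burden, unmutated_burden)
print(u, u / (len(mutated_burden) * len(unmutated_burden)))
```

Dividing U by the number of pairs gives the probability that a randomly chosen mutated patient has a higher burden than a randomly chosen unmutated one, which is a convenient effect size to report alongside the p-value.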
Mutations in MHC-I proteins are associated with increased binding neoantigen counts
To obtain more evidence as to whether the elevated mutation counts observed in HLA and B2M mutated patients were a result of the mutation, or vice versa, we compared the fraction of mutations likely to generate neoantigens across MSS patients with and without B2M and HLA mutations. We speculated that if B2M and HLA mutations are an artifact of higher mutation rates, the proportion of mutations that generate neoantigens should not differ relative to patients without such mutations. However, if these mutations truly facilitate immune escape, neoantigens should be enriched among the observed mutations.
Using HLA allele genotypes called by Polysolver [6], we calculated patient-specific MHC-I presentation scores for all expressed mutations observed in each patient’s tumor [8, 15]. We previously demonstrated that these affinity-based presentation scores, called PHBR-I scores, can distinguish peptides found in complex with MHC-I on the cell surface in mass spectrometry experiments from random peptides simulated from the human proteome, supporting that affinity is a reasonable proxy for cell surface presentation [8]. Indeed, when we looked at the fraction of expressed mutations considered to be neoantigens at various PHBR-I cutoffs, we found that at any given cutoff, a higher fraction of mutations represented neoantigens in both B2M and HLA mutated patients (Figs. 4a, b). This corresponded to overall higher numbers of neoantigens in B2M and HLA mutant tumors (Additional file 6: Figure S6). The higher overall proportion of neoantigens is consistent with both somatic B2M and HLA mutations impairing presentation of neoantigens for immune surveillance.
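The cutoff sweep described above can be sketched as follows. We assume here that a PHBR-I score is the harmonic mean of the best percentile rank a mutation achieves with each of the patient's HLA-I alleles (the score itself is defined in [8]) and that lower scores indicate better presentation; the ranks, cutoffs, and function names are made up for illustration.

```python
from statistics import harmonic_mean

def phbr(best_ranks):
    """Harmonic mean of the best percentile rank obtained with each HLA allele."""
    return harmonic_mean(best_ranks)

def neoantigen_fraction(scores, cutoff):
    """Fraction of expressed mutations whose score falls below the cutoff."""
    return sum(s < cutoff for s in scores) / len(scores)

# Made-up best ranks for four expressed mutations against six HLA-I alleles
ranks_per_mutation = [
    [0.5] * 6,
    [1, 2, 4, 4, 4, 4],
    [3.0] * 6,
    [10.0] * 6,
]
scores = [phbr(ranks) for ranks in ranks_per_mutation]

# Sweep hypothetical cutoffs and report the fraction called neoantigens at each
for cutoff in (1, 2, 4):
    print(cutoff, neoantigen_fraction(scores, cutoff))
```

Because the harmonic mean is dominated by its smallest term, a single well-binding allele is enough to give a mutation a low score, which is the behavior expected of a patient-level presentation measure.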
Assessing bias in neoantigen affinities in patients with mutant HLA alleles
McGranahan et al. reported that in lung cancer, subclones that had lost a particular HLA allele tended to accumulate mutations with higher affinity for the lost allele, suggesting that such mutations were no longer subject to immunoediting [9]. We therefore sought to assess whether mutations accumulating in tumors with HLA mutations showed a bias in affinity toward the affected HLA allele. We first evaluated whether the number of mutant-allele specific mutations in these patients was higher than the average number of mutations specific to each of the other alleles (Fig. 4c). We observed several patients for which the number of mutant-allele specific mutations was indeed higher (Fig. 4c; red lines). We note that the current study design differs from the study by McGranahan et al. in that we do not have subclone-specific sequencing data, and thus cannot determine which mutations occurred in the same cell population as the mutated HLA allele. We also did not consider allele-specific deletion events, and thus the assumption that the other 5 HLA alleles are intact may be incorrect for some patients.
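One way to picture the per-patient comparison is the sketch below. It assumes that a mutation counts as specific to an allele when that allele gives its best percentile rank and the rank passes a binding cutoff; this assignment rule, the cutoff, the allele labels, and the ranks are all hypothetical and are not taken from the study.

```python
from collections import Counter

ALLELES = ["A_1", "A_2", "B_1", "B_2", "C_1", "C_2"]  # placeholder allele labels
BINDING_RANK_CUTOFF = 2.0  # hypothetical percentile-rank threshold

def allele_specific_counts(mutations, alleles):
    """Count mutations whose best-binding allele passes the cutoff, per allele."""
    counts = Counter({allele: 0 for allele in alleles})
    for ranks in mutations:  # ranks: dict mapping allele -> percentile rank
        best = min(alleles, key=lambda allele: ranks[allele])
        if ranks[best] < BINDING_RANK_CUTOFF:
            counts[best] += 1
    return counts

def mutant_vs_other(counts, mutated_allele):
    """Return (count for the mutated allele, mean count over the other alleles)."""
    others = [n for allele, n in counts.items() if allele != mutated_allele]
    return counts[mutated_allele], sum(others) / len(others)

def ranks(default, **overrides):
    """Build a made-up rank profile: one default rank plus per-allele overrides."""
    return {**{allele: default for allele in ALLELES}, **overrides}

# Made-up rank profiles for four expressed mutations in one patient
mutations = [ranks(5.0, A_1=0.3), ranks(8.0, A_1=1.0), ranks(6.0, B_2=0.8), ranks(10.0)]
counts = allele_specific_counts(mutations, ALLELES)
print(mutant_vs_other(counts, "A_1"))
```

A patient homozygous at one or more loci would have fewer than six distinct alleles, so the average over the other alleles would need to be taken over the alleles actually present.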
Timing of somatic mutations in MHC-I proteins
To better understand B2M and HLA mutation timelines, we analyzed the tumor allelic fraction of expressed mutations for all patients. Early clonal mutations are present in a larger fraction of cancer cells than later subclonal mutations and are, therefore, expected to be present in a higher fraction of the reads generated from that site during tumor sequencing. Although this assumption can be complicated by sampling bias and genomic instability of tumors, we nonetheless expect that somatic point mutations with higher read support will in general have occurred at earlier time points than those with lower read support. Since each individual’s tumor is unique, we quantified B2M and HLA mutations in terms of their allelic fraction percentile relative to other mutations observed in the same tumor (Fig. 4d). Interestingly, B2M mutations tended to be present at higher percentiles than most HLA mutations, suggesting that B2M mutations might occur earlier in tumor development and affect a higher proportion of tumor cells. Most HLA mutations had low percentiles, suggesting these were late, subclonal events, while a subset had high percentiles and likely occurred early during tumor development in those individuals. This observation agrees with the previous report by McGranahan et al. [9], which found HLA loss in lung cancer to be predominantly subclonal, with a few observations of clonal loss. Patients with MSI tended to have HLA mutations with higher variant allele fraction (VAF) (Fisher’s exact test, OR = 73.3, p < 8.1e-16). These findings remained even when we considered only mutations in regions unaffected by copy number changes, which can confound VAF estimates (Additional file 7: Figure S7). Interestingly, we found that tumors with early HLA mutations had significantly higher levels of neoantigens predicted to specifically bind to the mutated allele than tumors with late HLA mutations (Fig. 4e).
When we evaluated the bias in specificity of neoantigens for the mutated allele in patients with early HLA loss, we found a significant difference in the number of binding neoantigens between the mutated HLA allele and average of unmutated HLA alleles (Fig. 4f). We conclude that somatic B2M and HLA mutations are associated with an overall higher burden of neoantigens, supporting the notion that these mutations facilitate tumor immune escape.
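The allelic fraction percentile used for timing can be sketched in a few lines. The definition below (the percentage of a tumor's mutations whose VAF is less than or equal to that of the mutation of interest) is one reasonable reading and is an assumption here; the gene labels and VAF values are made up.

```python
def vaf_percentile(vafs, target):
    """Percentile of the target mutation's VAF among all mutations in one tumor,
    taken as the percentage of mutations with a VAF at or below the target's."""
    return 100.0 * sum(v <= vafs[target] for v in vafs.values()) / len(vafs)

# Made-up VAFs for the expressed mutations of a single tumor
tumor = {"geneA": 0.45, "geneB": 0.40, "B2M": 0.38, "geneC": 0.12, "HLA-A": 0.05}

print(vaf_percentile(tumor, "B2M"))    # relatively high: consistent with an earlier event
print(vaf_percentile(tumor, "HLA-A"))  # low: consistent with a late, subclonal event
```

Ranking within each tumor rather than comparing raw VAFs across patients reduces the influence of tumor purity, which shifts all VAFs in a sample together.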
Correlation of B2M versus HLA mutation with immune cell infiltration and cytotoxicity
Effective antigen presentation via MHC-I is associated with CD8+ T cell-driven cytotoxicity. Furthermore, cell surface MHC-I molecules deliver an inhibitory signal to natural killer (NK) cells. Thus, changes to cell surface presentation of neoantigens by MHC-I due to mutations in B2M and HLA may be reflected in immune cell infiltration levels and levels of cytotoxicity. We quantified immune cell infiltration from tumor RNA sequencing data using Cibersort [16] and levels of cytotoxicity using the score proposed by Rooney et al. [17]. While Shukla et al. [6] previously evaluated immune infiltrates and cytotoxicity in the context of somatic HLA mutations, to our knowledge B2M mutations have not previously been analyzed in this context.
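For orientation, the cytotoxicity score of Rooney et al. is, to our understanding, the geometric mean of GZMA and PRF1 expression. The sketch below assumes expression values in TPM and a small additive offset to avoid zeros; the offset value, function name, and inputs are illustrative assumptions and are not details taken from this study.

```python
from math import sqrt

def cytolytic_score(gzma_tpm, prf1_tpm, offset=0.01):
    """Geometric mean of GZMA and PRF1 expression (offset value is an assumption)."""
    return sqrt((gzma_tpm + offset) * (prf1_tpm + offset))

# Made-up expression values for two tumors
print(cytolytic_score(4.0, 9.0, offset=0.0))
print(cytolytic_score(0.0, 25.0))
```

Because it is a geometric mean, the score stays low unless both transcripts are expressed, so high expression of only one gene does not produce a high score.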
CD8+ T cell levels were elevated in tumors with HLA mutations, both pan-cancer (Fig. 5a) and in several tumor types (Fig. 5b, Additional file 8: Figure S8A). A possible explanation is that CD8+ T cells are primed in secondary lymphoid organs and travel to the tumor, where they accumulate due to the lack of the corresponding MHC-I molecule / peptide complex. NK cell levels were elevated in tumors with B2M mutations pan-cancer (Fig. 5c); however, the levels were not significantly different in any given tumor type (Fig. 5d, Additional file 8: Figure S8B). Loss of B2M resulting in reduced cell surface MHC-I molecules should reduce the ability of tumor cells to inhibit NK cell-driven cytotoxicity; however, it is unclear whether this would affect NK cell levels in the tumor. Cytotoxicity was elevated in both HLA and B2M mutant tumors pan-cancer (Fig. 5e) and in several tumor types (Fig. 5f, Additional file 8: Figure S8C). These trends are consistent with the idea that mutations are a mechanism of escape from immune surveillance, as previously suggested by Shukla et al. for HLA mutations [6].