- Research article
- Open Access
- Open Peer Review
HLA and proteasome expression body map
BMC Medical Genomics volume 11, Article number: 36 (2018)
The presentation of HLA peptide complexes to T cells is a highly regulated and tissue specific process involving multiple transcriptionally controlled cellular components. The extensive polymorphism of HLA genes and the complex composition of the proteasome make it difficult to map their expression profiles across tissues.
Here we applied a tailored gene quantification pipeline to 4323 publicly available RNA-Seq datasets representing 55 normal tissues and cell types to examine expression profiles of (classical and non-classical) HLA class I, class II and proteasomal genes.
We generated the first comprehensive expression atlas of antigen presenting-related genes across 56 normal tissues and cell types, including immune cells, pancreatic islets, platelets and hematopoietic stem cells. We found a surprisingly heterogeneous HLA expression pattern with up to 100-fold difference in intra-tissue median HLA abundances. Cells of the immune system and lymphatic organs expressed the highest levels of classical HLA class I (HLA-A,-B,-C), class II (HLA-DQA1,-DQB1,-DPA1,-DPB1,-DRA,-DRB1) and non-classical HLA class I (HLA-E,-F) molecules, whereas retina, brain, muscle, megakaryocytes and erythroblasts showed the lowest abundance. In contrast, we identified a distinct and highly tissue-restricted expression pattern of the non-classical class I gene HLA-G in placenta, pancreatic islets, pituitary gland and testis. While the constitutive proteasome showed relatively constant expression across all tissues, we found the immunoproteasome to be enriched in lymphatic organs and almost absent in immune privileged tissues.
Here, we not only provide a reference catalog of tissue and cell type specific HLA expression, but also highlight extremely variable expression of the basic components of antigen processing and presentation in different cell types. Our findings indicate that low expression of classical HLA class I molecules together with lack of immunoproteasome components as well as upregulation of HLA-G may be of key relevance to maintain tolerance in immune privileged tissues.
One of the key structural elements of the mammalian adaptive immune system is the Major Histocompatibility Complex (MHC). A primary task of the classical MHC molecules is to bind and present self, abnormal self (e.g., neo-epitopes arising from mutations ) and foreign (e.g., pathogenic) peptide antigens derived from intracellular (class I) or from extracellular proteins (class II) on the surface of the cell, where the peptide-MHC complex can be screened by T lymphocytes, eventually leading to an immune response. The human MHC system, Human Leukocyte Antigen (HLA), is located on chromosome six and consists of highly polymorphic HLA class I and II genes. As these are co-dominantly expressed, every individual carries two alleles of the HLA class I and class II genes.
There is increasing evidence that diseased cells down-regulate or even loose classical HLA class I and class II expression to escape T-cell mediated killing and non-classical HLA class I molecules have been associated to play a role in cancer immunosuppression . In addition, dysregulation of components of the antigen processing and presenting machinery (APM) in cancer results in alterations of HLA expression, which might result in decreased immunogenicity and lack of T cell recognition . A crucial component of the APM is the proteasome, which is a multi-enzyme complex in the cytosol constantly degrading proteins and thus producing peptide fragments that are transported into the endoplasmatic reticulum by transporter proteins (TAP1, TAP2), where they are further trimmed, loaded onto a HLA class I molecules and transported to the cell surface via the Golgi apparatus . While PSMB5, PSMB6, PSMB7 are ubiquitously expressed components of the constitutive proteasome, the immunoproteasome is generated by overexpression of PSMB8 (LMP7), PSMB9 (LMP2) and PSMB10 (LMP10) genes upon stimulation with INFγ or TNFα. It produces a distinct pool of HLA class I epitopes that are more efficient in activating cytotoxic T cells.
Non-classical HLA class I molecules (HLA-E, -F, −G) have a similar protein structure to that of the classical HLA class I alleles and also require a bound peptide in the binding groove to form a stable complex. However, they are characterized by few allelic polymorphisms and play a role in regulating innate immune responses. Further, they have expression patterns often related to their function, such as HLA-G expression in the placenta where it interacts with maternal effector cells .
Understanding HLA and proteasome expression in physiological state is crucial for biomarker discovery, for the development of cancer vaccines targeting neo-epitopes, and for understanding immune escape mechanisms in cancer. Measuring HLA expression levels is often performed via Fluorescence-activated Cell Sorting (FACS), Immunohistochemistry (IHC) or Western Blot using antibodies recognizing HLA proteins and via real-time quantitative polymerase chain reaction (RT-qPCR) using specific primers for HLA mRNA. In contrast, next-generation sequencing (NGS) RNA-Seq , recovers millions of nucleotide sequence reads derived from a sample’s transcriptome. Standard RNA-Seq processing algorithms map these reads to a reference genome to reconstruct the transcriptome and derive the abundance of the genes. However, the reference does not reflect the polymorphic nature of the HLA system, precluding accurate HLA typing  and thus expression determination. To account for this, we developed the algorithm ‘seq2HLA’  that takes bulk RNA-Seq raw data as input and outputs the most probable 4-digit HLA class I and II types  and locus-specific expression.
Here, we extended seq2HLA to determine the HLA type and expression of non-classical HLA class I molecules. Using the updated algorithm, we analyzed 4323 public RNA-Seq datasets to generate – to our knowledge - the first comprehensive expression atlas of HLA class I (classical and non -classical) and class II gene loci across 38 non-cancer tissues and 17 different cell types, including immune cell populations, pancreatic islets, platelets, hematopoietic stem and progenitor cells. In addition, we examine the expression profile of the constitutive proteasome and immunoproteasome across the different tissues and their correlation with HLA expression.
Paired-end RNA-Seq sequence reads for 4340 samples, representing 38 tissues and 17 cell types (Additional file 1: Table S1),sequenced with the Illumina platform were downloaded from the NCBI Sequence Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra) and European Genome-phenome Archive (EGA) hosted by the European Bioinformatics Institute (EBI) (https://www.ebi.ac.uk/ega)  (Table 1). The number of biological replicates varies between 6 and 426 (appendix and brain, respectively) (Fig. 1). Sample etiology for the Genotype-Tissue Expression (GTEx) data is described at [http://www.gtexportal.org/home/tissueSummaryPage]: sample donors were primarily white males for whom the cause of death was anything from traumatic injuries (main cause of death for younger donors) to heart disease (main cause of death for over 60 year old donors).
HLA typing using RNA-Seq data
The tool seq2HLA v2.2 (https://github.com/TRON-Bioinformatics/seq2HLA) was used to derive HLA class I (classical and non-classical) and class II types from the RNA-Seq data. Seq2HLA v2.2 previously determined 4-digit resolution HLA types with high accuracy from classical HLA class I loci . We extended seq2HLA v2.2 to automatically assign the 4-digit HLA types and determine the expression of the non-classical HLA class I genes by extending the HLA reference database with all known non-classical HLA allele sequences as described earlier . HLA gene expression is normalized according to reads per kilobase of exon model per million mapped reads (RPKM) . In addition, we extended seq2HLA v2.2 to not only detect the HLA class II molecules DQA1, DQB1 and DRB1, but also DRA, DPA1 and DPB1. As each locus (DP,DQ,DR) consist of the respective α- and β-chain, HLA class II expression is herein defined as the sum of min_rpkm(DPA1,DPB1), min_rpkm(DQA1,DQB1) and min_rpkm(DRA,DRB1). Seq2HLA v2.3 is freely available at https://github.com/TRON-Bioinformatics/seq2HLA.
Gene expression analysis
Alignment of the raw reads against the humane reference genome build hg19 and calculation of gene-level expression of UCSC known genes is performed as described previously . HLA gene expression is normalized according to reads per kilobase of exon model per million mapped reads (RPKM) . To further validate the comparability of the RNA-Seq data , we additionally normalized the read counts according to transcript per million (TPM)  and reproduced the proteasome body map (Additional file 2: Figure S7, Additional file 1: Table S4). For TPM normalization, RPKM values for each sample are converted with: TPM = RPKM/(sum of RPKM over all genes) * 10^6.
Correlation analysis is performed with R-3.1.2 for Linux. Throughout the study, the correlation coefficients are calculated according to Spearman’s rank correlation coefficient implemented in the package psych. Graphics are produced with the package ggplot2 . Differential expression analysis is performed with the R-package DESeq2  using the raw (i.e. not normalized values) read counts as input.
Expression body map of classical HLA class I and HLA class II molecules
Human tissue RNA-Seq data was retrieved from the Genotype Tissue Expression consortium  (GTEx; 3926 samples from 30 tissues), the Human Protein Atlas  (204 samples from 30 tissues), as well as 20 retina samples from the Sequence Read Archive (Table 1, Additional file 1: Table S1). These samples were processed using seq2HLA  to obtain the HLA type as well as the expression levels of HLA class I (classical and non-classical) as well as HLA class II (Methods) and the results were integrated into a comprehensive HLA expression body map (Fig. 1, Additional file 2: Figure S1, Additional file 1: Table S2).
Classical HLA class I expression
Among the tissues with the highest median classical HLA class I expression (Fig. 1) are whole blood (1210 RPKM), spleen (850 RPKM) and bone marrow (615 RPKM). Interestingly, lung (433 RPKM) and small intestine (482 RPKM) showed similar median expression levels as the lymph node (422 RPKM). We observed the lowest median HLA class I expression in retina, brain and muscle tissues (29, 38 and 43 RPKM, respectively).
HLA class II expression
Overall, we found the median HLA class II expression levels across the different tissues (4 RPKM to 267 RPKM) to be much lower than the median classical HLA class I expression levels (29 RPKM to 1210 RPKM). Appendix (267 RPKM), lymph node (210 RPKM) and spleen (187 RPKM) showed the highest median HLA class II levels. Similar to classical HLA class I, we observed the lowest median HLA class II expression in brain (4 RPKM), muscle (4 RPKM) and retina (5 RPKM).
HLA expression in anatomical substructures
For a subset of tissues, GTEx and HPA provide additional annotation regarding the anatomical substructure of the sample (e.g. amygdala in brain tissue). Analysis of HLA expression in these substructures revealed very similar median classical HLA class I and HLA class II expression levels within the same tissue (Additional file 2: Figure S1). Of note, we found a clearly separated cluster of 9 samples (HLA class I expression larger than 900 RPKM) in brain, which are all derived from different brain regions (Additional file 2: Figure S1A) of a 63-year female donor (HLA class I and II types matches in all 9 samples). We examined the expression of immune-relevant genes (Methods, Additional file 2: Figure S2) in these brain samples and found the leukocyte marker CD45 enriched in the samples of the outlier group (median: 19.8 vs 2.6 RPKM), whereas the T-cell marker CD3E was very low or not expressed in both groups (median: 1.3 vs 0.1 RPKM). In contrast, the macrophage associated molecules CD68 (median: 236 vs 4 RPKM) and CSF1 (median: 47 vs 9 RPKM), the pathogen-associated molecular patterns sensing receptor TLR2 (median: 60 vs 2 RPKM), the transcription factor STAT1 (median: 549 vs. 36 RPKM), as well as the immune checkpoint molecule ligands PD-L1 (median: 41 vs. 1.3 RPKM) and PD-L2 (median: 11 vs. 0.3 RPKM) were clearly enriched in the outlier group. Finally, while the pro-inflammatory cytokine IFNγ and the tryptophan degrading enzyme IDO1 were not detectable in any of the samples with HLA class I expression below 400 RPKM (median: 0.03 and 0.02 RPKM), we found expression of these genes in the outlier group (median: 1.1 and 62.7 RPKM). Of note, the samples of this patient were also among the HLA class II high expressing outliers in brain (Additional file 2: Figure S1B).
Expression of locus-specific HLA molecules
Classical HLA class I locus-specific expression
Examining the expression of the individual loci revealed a balanced classical HLA class I heavy chain expression within each tissue (Additional file 2: Figure S3A).
HLA class II locus-specific expression
α and β chains of the HLA class II loci HLA-DQ and HLA-DP are equally expressed, while HLA-DRA showed always slightly higher expression than HLA-DRB1. In all tissues, HLA-DR transcripts were always more abundant than HLA-DQ and HLA-DP transcripts (Additional file 2: Figure S3B).
Expression body map of non-classical HLA class I molecules
We extended ‘seq2HLA’ to also determine the HLA allele and quantity of the non-classical HLA class I genes (i.e. HLA-E, -F, −G) and applied it to the RNA-Seq samples in order to examine their physiological expression (Additional file 1: Table S2).
Analysis of HLA-G expression (Fig. 2a) showed a distinct pattern, with clear expression in placenta (median: 19 RPKM with 8 of 8 [100%] samples greater than 1 RPKM), pituitary (median: 2.5 RPKM with 96 of 123 [78%] samples greater than 1 RPKM) and testis (median: 1.1 RPKM with 113 of 204 [55%] samples greater than 1 RPKM). We confirmed the expression of HLA-G in placenta using RNA-Seq data from three different placental locations (amnion, chorion, decidua) produced in a different study  (Fig. 2b). Of note, the majority of lung samples showed a lack of HLA-G (median expression across the cohort: 0.35 RPKM). However, we observed 29 of 161 (18%) lung samples with an expression greater than 1 RPKM. Similarly, colon is showing eight striking outliers with HLA-G expression ranging between 1 and 22 RPKM. Although two rectum samples showed HLA-G abundance greater than 1 RPKM, the total number (n = 4) is too low to draw any conclusions.
We found HLA-E transcripts expressed in all tissues with retina (5.6 RPKM), testis (8.6 RPKM), muscle, brain (both 13 RPKM) and pancreas (17.7 RPKM) showing the lowest median abundance. In contrast, we found highest median expression in blood (275 RPKM), spleen (187 RPKM), lung (142 RPKM) and adipose tissue (141 RPKM) (Additional file 2: Figure S4A, Additional file 1: Table S2).
HLA-F showed a similar, but lower expression profile to that of HLA-E (Additional file 2: Figure S4B, Additional file 1: Table S2). Again retina (0.6 RPKM), muscle (2.8 RPKM), brain, pancreas (both 3.4 RPKM) and testis (4.8 RPKM) had the lowest expression, while spleen (119 RPKM) and whole blood (50 RPKM) showed the highest abundance of HLA-F transcripts.
HLA expression in immune cell populations, platelets and pancreatic islets
We analyzed RNA-Seq data of six different immune cell populations as well as peripheral blood monocytes (PBMCs) from volunteers prior to Influenza vaccination , as well as RNA-Seq data from neutrophils  (Fig. 3, Additional file 1: Table S5). High levels of classical HLA class I (median: 1148 RPKM – 3194 RPKM) were found in all immune cells, with neutrophils showing the highest median classical HLA class I and HLA-E expression (3194 RPKM and 843 RPKM respectively). The largest median HLA class II expression was observed in B cells (696 RPKM), myeloid dendritic cells (mDCs, 639 RPKM) and monocytes (323 RPKM), whereas HLA class II transcripts were almost absent in neutrophils (3 RPKM) and low abundant in T cells (10 RPKM). HLA-E (363–899 RPKM), and to a lesser extent HLA-F (110–210 RPKM) transcripts were ubiquitously found among all immune cell types, whereas HLA-G was not detected at all (0.05–0.5 RPKM). We also analyzed RNA-Seq data of human pancreatic islets  and found high classical HLA class I expression levels (median: 1111 RPKM), comparable with B cells, as well as much lower HLA class II transcript abundance (median: 38 RPKM) similar to NK cells. Again, we detected HLA-E and to a lesser extent HLA-F transcripts (140 RPKM and 55 RPKM). Of note, we found HLA-G to be expressed with a range of 5 RPKM and 27 RPKM (median: 9 RPKM) in pancreatic islets. Finally, we examined HLA expression in human platelets from four healthy blood donors . Platelets showed the lowest classical HLA class I expression levels of all examined purified cell types (median: 192 RPKM), comparable amount of HLA-E (median: 182 RPKM), no HLA class II, no HLA-G and very low levels of HLA-F (median: 2.4 RPKM) transcripts.
Human stem and progenitor cells
Very recently, two different projects sequenced human blood progenitor cells [23, 24]. We analyzed and integrated both datasets to examine HLA expression during hematopoietic development (Fig. 4, Additional file 1: Table S6). Hematopoietic stem cells (HSC) showed comparable median classical HLA class I, class II and HLA-E levels (122, 177 and 120 RPKM, respectively). They differentiate into multipotent progenitor (MMP) cells, which showed lower median expression levels of classical HLA class I (81 RPKM), class II (96 RPKM) and much lower median abundance of HLA-E (40 RPKM). Common myeloid progenitor (CMP) cells showed low levels of HLA class I (median: 20 RPKM), HLA class II (median: 35 RPKM) and HLA-E (median: 20 RPKM). CMP cells differentiate into Megakaryocyte–erythroid progenitor (MEP), which showed a comparably low median HLA expression pattern (HLA class I: 22 RPKM, HLA class II: 33 RPKM, HLA-E: 26 RPKM). Megakaryocytes (MK) and erythroblasts (EB) showed even lower median HLA class I (6 and 3 RPKM, respectively) and HLA class II expression levels (5 and 3 RPKM respectively). While HLA-E is also very lowly expressed in EBs, we found higher expression levels in MKs (median: 41 RPKM), comparable to MMP cells. CMP cells can also differentiate into granulocyte monocyte progenitors (GMP), which showed a higher median classical HLA Class I (61 RPKM) and HLA class II (78 RPKM) expression profile, as well as similar HLA-E expression (median: 26 RPKM) compared to its progenitor. Finally, common lymphoid progenitor (CLP) cells showed similar median HLA expression profiles (classical HLA class I: 69 RPKM, HLA class II: 113 RPKM, HLA-E: 18 RPKM) compared to their progenitor cells (MMP). While ubiquitously abundant in all examined tissues and immune cells, HLA-F was only detected in HSCs and HLA-G was absent in all progenitor cells.
Expression bodymap of constitutive proteasome and immunoproteasome
In addition to the HLA class I (classical and non-classical) and HLA class II body map, we examined the expression profiles of proteasome components (PSMB5–10) across the 38 different tissues. The abundance of the constitutive proteasome is approximated by the median expression values of its three proteolytic subunits PSMB5, PSMB6 and PSMB7. Similarly, the immunoproteasome is represented by the median expression of the genes PSMB8 (LMP7), PSMB9 (LMP2), PSMB10 (LMP10). The constitutive proteasome was found in all tissues with blood (17 RPKM), bone marrow (38 RPKM) and pancreas (39 RPKM) having the lowest and placenta (91 RPKM), muscle (93 RPKM) and adrenal gland (103 RPKM) having the highest expression. In contrast, the immunoproteasome showed a higher variance. We detected very low expression of PSMB8 (LMP7), PSMB9 (LMP2) and PSMB10 (LMP10) in immune privileged tissues such as retina (2 RPKM), muscle (3 RPKM), testis (3.6 RPKM), whereas lymphoid tissues including spleen (101 RPKM), appendix (109 RPKM) and lymph node (129 RPKM) displayed highly abundant immunoproteasome levels (Fig. 5, Additional file 1: Table S3). These findings can be reproduced when normalizing the gene expression values according to transcripts per million (TPM, Additional file 2: Figure S7, Additional file 1: Table S4). Interestingly, expression profile of TAP1 and TAP2 resembles that of the immunoproteasome: retina (TAP1: 6.3 RPKM, TAP2: 5.9 RPKM), muscle (TAP1: 10.2 RPKM, TAP2: 7.2 RPKM) and brain (TAP1: 10.6 RPKM, TAP2: 10.1 RPKM) are amongst the tissues with the lowest amount of transcript abundance, whereas spleen (TAP1: 117.3 PRKM, TAP2: 60.2 RPKM), appendix (TAP1: 138.1 RPKM, TAP2: 98.8 RPKM) and lymph node (TAP1: 143 RPKM, TAP2: 112.2 RPKM) show the highest expression (Additional file 2: Figure S8A, Additional file 1: Table S8).
By analyzing 4340 public RNA-Seq datasets from 38 human tissues and 20 different hematopoietic cell populations, we generated a comprehensive expression atlas of classical (HLA-A, HLA-B, HLA-C) and non-classical (HLA-G,-E,-F) class I and class II (HLA-DPA1,-DPB1,-DQA1-DQB1,-DRA, -DRB1) HLA molecules.
Classical HLA class I expression
We observed an extraordinary heterogeneity in median classical HLA class I expression levels between the tissues. Amongst the organs with the highest expression are spleen and bone marrow. They are key components of the lymphatic system accumulating immune cells, such as antigen presenting cells (APCs) in spleen and developing lymphocytes in bone marrow. The largest median abundance of classical HLA class I transcripts, however, can be found in whole blood. This might be explained by its cell type composition with neutrophils representing the largest population of nucleated cells (30–80%) . We found neutrophils to have the highest classical HLA class I expression compared to other immune cells. Of note, we found classical HLA class I (and class II) transcripts highly abundant in lung and small intestine. This can be explained by the fact that the lung is exposed to the environment dealing first line with pathogens and that the small intestine contains lymphoid nodules (Peyer’s patches), which are part of the lymphatic system. In contrast, we observed the lowest classical HLA class I expression in brain, retina and testis, which are considered immune privileged [26,27,28] and muscle, which is described as an immunologically unique tissue . Examining locus-specific expression, we found balanced expression of the individual classical HLA class I loci (HLA-A, -B, and –C) within each tissue. This result is in concordance with findings in human cell lines , but this is the first time that it has been established across a large collection of diverse human tissues. Although the density of HLA-C molecules on the cell surface has been reported to be much lower compared to HLA-A and HLA-B molecules [31, 32], we found that they are equally abundant on transcript level.
We also detected inter-individual variation of HLA expression within samples derived from the same organ. For example, we found that some brain samples showed a high-level transcription of HLA genes giving rise large variances in classical HLA class I expression (ranging between 0.4 RPKM and 1763 RPKM). To explore this in more detail we analyzed a group of outlier samples (n = 9) derived from different brain regions (Additional file 2: Figure S2) of the same patient showing an extremely high HLA class I expression > 900 RPKM. The samples were derived from post-mortem sections of a 63-year old woman who died of renal failure as consequence of an advanced chronic liver disease (hepatorenal syndrome, HS). Analysis of cell type specific markers in the RNA-Seq data of these samples showed highly abundant expression of CD45 and CD68 and upregulation of inflammatory markers such as interferon-γ (IFNγ) in agreement with a neuro-inflammation , probably mediated by the HS associated functional derangement in the blood-brain barrier.
HLA class II expression
Across all tissues, we observed a lower HLA class II expression compared to classical HLA class I levels. Under physiological conditions HLA class II expression is limited to specialized cell types such as B cells, monocytes, macrophages and dendritic cells that constitutively express large amounts of HLA class II transcripts (Fig. 3). Accordingly, we detected the highest class II expression in lymphoid tissues, whereas immune-privileged organs retina and brain showed the lowest median HLA class II expression. We also found muscle to express low amounts of HLA class II transcripts, which is in concordance with earlier studies reporting HLA class II molecules on myoblasts  acting as antigen presenting cells .
Our study also confirmed an imbalanced HLA class II locus expression: HLA-DR transcripts were always expressed higher than HLA-DQ and HLA-DP. Each HLA class II molecule consists of the α- and the β-chain and we found balanced expression of these chains within each locus, with the notable exception of HLA-DR, for which we always found higher HLA-DRA abundance compared to HLA-DRB1.
Expression of non-classical HLAs
In our study, we advanced seq2HLA  to determine the identity and expression of non-classical HLA class I molecules (i.e. HLA-E, -F, −G).
In contrast to classical HLA class I molecules, HLA-G is involved in mediating immune tolerance by inhibiting proliferation and cytotoxicity of immune-competent effector cells, such as monocytes, NK cells, or T cells . Physiologic HLA-G expression has been found in placental cytotrophoblasts contributing to the maternal tolerance toward the fetus , in testicular tissue playing a significant role in human spermatogenesis , and in pancreatic islets inhibiting autoimmune damage during insulin exocytosis . Indeed, we found a tissue-specific tissue distribution of HLA-G with expression only in those three body regions, additionally in pituitary gland (Fig. 2), which has not been reported before and in pancreatic islets (Fig. 3) Like pancreatic islets, the pituitary gland is part of the endocrine system secreting hormones, a process that produces potentially immunogenic targets that might trigger autoreactive effector cells. Of note, both structures showed higher median classical HLA class I expression compared to the organ they belong to: pituitary gland (154 RPKM) as part of the brain (35 RPKM) and pancreatic islets (1110 RPKM) are located in the pancreas (58 RPKM).
Together our similar findings in pancreatic islets and pituitary gland i) suggest similar mechanisms in counter-acting high classical HLA class I expression, ii) suggest the prevention of autoimmunity by downregulating effector functions of potential auto-reactive lymphocytes and iii) further strengthens the hypothesis that HLA-G expression of endocrine cells might be correlated to their secretory activity .
Furthermore, we observed presence of HLA-G transcripts (greater than 1 RPKM) in 29 of 161 (18%) lung samples. This is in concordance with findings, that HLA-G is expressed by lung macrophages and dendritic cells in 25% of patients with nonmalignant respiratory diseases . In all other examined tissues and immune cell populations, we did not find evidence of HLA-G expression, with exception of striking outliers in colon. This highly tissue-specific expression profile and its immune-suppressive function makes HLA-G an interesting target, especially for cancer research .
HLA-E is the least polymorphic non-classical HLA class I gene and was first described to bind the activating and inhibitory NK cell receptors (namely CD94/NKG2C and CD94/NKG2A), suggesting a regulatory function to NK cells . In contrast, we found HLA-E broadly expressed, which is in line with studies identifying HLA-E in a variety of cells and tissues on mRNA  and protein levels . HLA-F is the least studied non-classical HLA class I gene. Like HLA-E, it is also thought to mediate important immunosuppressive functions , but unlike HLA-G, its significance in cancers is not yet established [44, 45]. HLA-F showed an expression profile similar to HLA-E with regard to tissue distribution, but lower in intensity. Of note, the expression patterns of both molecules resemble those of classical HLA molecules with immune privileged tissues showing the lowest and lymphatic organs the highest expression of classical HLA class I, class II, HLA-E and HLA-F transcripts, suggesting common regulatory elements. Indeed, transcription of HLA-E and HLA-F is induced by class II transactivator (CIITA) [36, 46], which was first described as the master regulator of HLA class II genes  and was also found to play a role in the regulation of HLA class I transcription . In contrast, HLA-G is not regulated by CIITA, implying a different and unique transcriptional regulation amongst the non-classical HLA class I genes  resulting in the distinct, tissue-specific expression pattern. By analyzing correlations between CIITA and HLA expression across tissues, we can reproduce these findings (Additional file 2: Figure S6).
In immune cell populations, we found HLA-E to be expressed by B cells, T cells, NK cells, monocytes and mDCs, further corroborating previous findings [49,50,51]. We see the highest HLA-E expression in neutrophils, which also exhibit the highest classical HLA class I abundance, further supporting the transcriptional relationship between those molecules, even in individual cell types. HLA-F is also expressed ubiquitously in all examined immune cells. Interestingly, HLA-F has been recently found to be expressed on activated lymphocytes while being absent in resting lymphocytes .
Stem and progenitor cells
Stem and progenitor cells regulate homeostasis e.g. of blood cells or tissue regeneration. Understanding stem cell differentiation is crucial for examining the underlying mechanisms of development, regeneration of damaged tissues and for the development of therapeutic approaches against various diseases. Thus, we analyzed RNA-Seq data from two different projects and found consistently, that hematopoietic stem cells (HSCs) express classical HLA class I and class II transcripts in equal amounts. This finding strengthens the important role of not only classical HLA class I, but also HLA class II type matching of donor and host in allogeneic HSC transplantations .
Together, these data show that HLA transcript expression decreases during hematopoietic stem cell differentiation following the myeloid and subsequently the megakaryocyte-erythroid lineages, which give rise to cells with no HLA molecules (in case of erythroblasts differentiating to erythrocytes) or very low amount of classical HLA class I (in case of platelets). In contrast, expression of HLA transcripts stays constant and is upregulated following the lineages resulting in immune cells.
HLA and proteasome expression
The proteolytic activity of the constitutive proteasome is mediated through its components β1 (PSMB6), β2 (PSMB7) and β5 (PSMB5) that are characterized by different cleavage specificities. Upon stimulation with INFγ or TNFα, the immunoproteasome is generated by upregulating the inducible subunits β1i (PSMB9, LMP2), β2i (PSMB10, LMP10) and β5i (PSMB8, LMP7). While the constitutive proteasome is ubiquitously expressed in all tissues with no obvious relation to HLA expression, the median abundance of immunoproteasomal components strongly correlates with median HLA class II (R = 0.89) and classical HLA class I (R = 0.62) expression. This correlation suggests a common regulatory mechanism. Indeed, the immunoproteasome as well as classical HLA expression is induced upon stimulation with INFγ or TNFα and we found the median expression of both cytokines associated with the immunoproteasomal components (Additional file 2: Figure S7). Of note, PSMB9 (LMP2) shares a common bidirectional promotor with TAP1 (Transporter associated with Antigen Processing 1) suggesting a coordinate regulation . Analysis of mRNA abundance of both genes within individuals across the different tissues reveals a strong correlation (R = 0.85), further corroborating the link between PSMB9 (LMP2) and TAP1 expression (Additional file 2: Figure S8B).
This comprehensive expression body map provides a reference catalog of HLA expression profiles in normal human tissues and cell types. We found the classical HLA class I and HLA class II molecules highly expressed in lymphatic tissues (such as lymph node and spleen) and lowest in immune privileged organs (brain, retina, testis, muscle). In addition, we defined the expression body map of two types of proteasomes: the constitutive proteasome is expressed ubiquitously with no obvious correlation to classical HLA class I and class II expression, whereas the immunoproteasome – which produces classical HLA class I epitopes that are more efficient in activating cytotoxic T cells - is enriched in tissues of the lymphatic system and almost absent in immune privileged organs.
Further development of the tool ‘seq2HLA’ allowed us to determine non-classical HLA class I expression levels. While HLA-E and HLA-F showed similar expression profiles to those of classical HLA molecules, HLA-G is exclusively present in immunologically protected (placenta) and immune privileged sites (testis), as well as sites of the endocrine system (pituitary gland, pancreatic islets).
Our findings not only provide a resource of cell type specific HLA expression, but also highlight extremely variable expression of the basic components of antigen processing and presentation in different cell types and indicate that extremely low expression of classical HLA class I molecules together with lack of immunoproteasome components as well as upregulation of HLA-G may be of key relevance to maintain tolerance in immune privileged tissues.
We anticipate, that this resource might be helpful in cancer research as baseline when examining HLA dysregulation in tumor cells as part of their immune-escape mechanisms or choosing targets for cancer immunotherapy. Especially the highly tissue-restrictive physiological expression and its immune-suppressive functions makes HLA-G an interesting molecule in cancer research: as tool for the development of universal (HLA class I negative) T cells for adoptive T cell transfer , blockade of HLA-G expression and function on tumor cells may be a potential target for antitumor therapy  or as diagnostic and prognostic biomarker . However, when using standard RNA-Seq data from a bulk tumor sample in conjunction with seq2HLA to study HLA expression by tumor cells specifically, attention must be given to the actual tumor content of the sample. Signals detected by seq2HLA (and any other gene expression quantification tools) are average values of the whole sample. One promising solution will be RNA-sequencing of single cells isolated from different tumor regions using laser capture microdissection .
Common lymphoid progenitor
Common myeloid progenitor
European Bioinformatics Institute
Fluorescence-activated cell sorting
Granulocyte monocyte progenitor
Human leukocyte antigen
Hematopoietic stem cell
Megakaryocyte erythrocyte progenitor
Major histocompatibility complex
CD34- CD41+ CD42+ megakaryocytes
National Center for Biotechnology Information
Next generation sequencing
Reads per kilobase of exon per million mapped reads
Real-time quantitative polymerase chain reaction
Sequence read archive
Transcripts per million
Kreiter S, Vormehr M, van de Roemer N, Diken M, Löwer M, Diekmann J, et al. Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature. 2015;520:692–6. https://doi.org/10.1038/nature14426.
Kochan G, Escors D, Breckpot K, Guerrero-Setas D. Role of non-classical MHC class I molecules in cancer immunosuppression. Oncoimmunology. 2013;2:e26491. https://doi.org/10.4161/onci.26491.
Leone P, Shin E-C, Perosa F, Vacca A, Dammacco F, Racanelli V. MHC class I antigen processing and presenting machinery: organization, function, and defects in tumor cells. J Natl Cancer Inst. 2013;105:1172–87. https://doi.org/10.1093/jnci/djt184.
Neefjes J, Jongsma MLM, Paul P, Bakke O. Towards a systems understanding of MHC class I and MHC class II antigen presentation. Nat Rev Immunol. 2011;11:823–36. https://doi.org/10.1038/nri3084.
Curigliano G, Criscitiello C, Gelao L, Goldhirsch A. Molecular pathways: human leukocyte antigen G (HLA-G). Clin Cancer Res. 2013;19:5564–71. https://doi.org/10.1158/1078-0432.CCR-12-3697.
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5:621–8. https://doi.org/10.1038/nmeth.1226.
Brandt DY, Aguiar VR, Bitarello BD, Nunes K, Goudet J, Meyer D. Mapping Bias overestimates reference allele frequencies at the HLA genes in the 1000 genomes project phase I data. G3 (Bethesda). 2015;5:931–41. https://doi.org/10.1534/g3.114.015784.
Boegel S, Löwer M, Schäfer M, Bukur T, de Graaf J, Boisguérin V, et al. HLA typing from RNA-Seq sequence reads. Genome Med. 2012;4:102. https://doi.org/10.1186/gm403.
Boegel S, Löwer M, Bukur T, Sahin U, Castle JC. A catalog of HLA type, HLA expression, and neo-epitope candidates in human cancer cell lines. Oncoimmunology. 2014;3:e954893. https://doi.org/10.4161/21624011.2014.954893.
Lappalainen I, Almeida-King J, Kumanduri V, Senf A, Spalding JD, Ur-Rehman S, et al. The European genome-phenome archive of human data consented for biomedical research. Nat Genet. 2015;47:692–5. https://doi.org/10.1038/ng.3312.
Scholtalbers J, Boegel S, Bukur T, Byl M, Goerges S, Sorn P, et al. TCLP: an online cancer cell line catalogue integrating HLA type, predicted neo-epitopes, virus and gene expression. Genome Med. 2015;7:118. https://doi.org/10.1186/s13073-015-0240-5.
‘t Hoen PA, Peter A C, Friedländer MR, Almlöf J, Sammeth M, Pulyakhina I, Anvar SY, et al. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat Biotechnol. 2013;31:1015–1022. https://doi.org/10.1038/nbt.2702.
Li B, Ruotti V, Stewart RM, Thomson JA, Dewey CN. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics. 2010;26:493–500. https://doi.org/10.1093/bioinformatics/btp692.
Wickham H. ggplot2. New York: Springer New York; 2009.
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. https://doi.org/10.1186/s13059-014-0550-8.
Ardlie KG, Deluca DS, Segre AV, Sullivan TJ, Young TR, Gelfand ET, et al. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–60. https://doi.org/10.1126/science.1262110.
Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347:1260419. https://doi.org/10.1126/science.1260419.
Kim J, Zhao K, Jiang P, Lu Z-X, Wang J, Murray JC, Xing Y. Transcriptome landscape of the human placenta. BMC Genomics. 2012;13:115. https://doi.org/10.1186/1471-2164-13-115.
Hoek KL, Samir P, Howard LM, Niu X, Prasad N, Galassie A, et al. A cell-based systems biology assessment of human blood to monitor immune responses after influenza vaccination. PLoS One. 2015;10:e0118528. https://doi.org/10.1371/journal.pone.0118528.
Jiang K, Zhu L, Buck MJ, Chen Y, Carrier B, Liu T, Jarvis JN. Disease-associated single-nucleotide polymorphisms from noncoding regions in juvenile idiopathic arthritis are located within or adjacent to functional genomic elements of human neutrophils and CD4+ T cells. Arthritis Rheumatol. 2015;67:1966–77. https://doi.org/10.1002/art.39135.
Eizirik DL, Sammeth M, Bouckenooghe T, Bottu G, Sisino G, Igoillo-Esteve M, et al. The human pancreatic islet transcriptome: expression of candidate genes for type 1 diabetes and the impact of pro-inflammatory cytokines. PLoS Genet. 2012;8:e1002552. https://doi.org/10.1371/journal.pgen.1002552.
Kissopoulou A, Jonasson J, Lindahl TL, Osman A. Next generation sequencing analysis of human platelet PolyA+ mRNAs and rRNA-depleted total RNA. PLoS One. 2013;8:e81809. https://doi.org/10.1371/journal.pone.0081809.
Chen L, Kostadima M, JHA M, Canu G, Garcia SP, Turro E, et al. Transcriptional diversity during lineage commitment of human blood progenitors. Science. 2014;345:1251033. https://doi.org/10.1126/science.1251033.
Casero D, Sandoval S, Seet CS, Scholes J, Zhu Y, Ha VL, et al. Long non-coding RNA profiling of human lymphoid progenitor cells reveals transcriptional divergence of B cell and T cell lineages. Nat Immunol. 2015;16:1282–91. https://doi.org/10.1038/ni.3299.
Stemcell Technologies. Frequencies of Cell Types in Human Peripheral Blood. https://www.stemcell.com/media/files/wallchart/WA10006-Frequencies_Cell_Types_Human_Peripheral_Blood.pdf. Accessed 21 Mar 2018.
Zhou R, Caspi RR. Ocular immune privilege. F1000 Biol Rep. 2010; https://doi.org/10.3410/B2-3.
Carson MJ, Doose JM, Melchior B, Schmid CD, Ploix CC. CNS immune privilege: hiding in plain sight. Immunol Rev. 2006;213:48–65. https://doi.org/10.1111/j.1600-065X.2006.00441.x.
Fijak M, Meinhardt A. The testis in immune privilege. Immunol Rev. 2006;213:66–81. https://doi.org/10.1111/j.1600-065X.2006.00438.x.
Sciorati C, Rigamonti E, Manfredi AA, Rovere-Querini P. Cell death, clearance and immunity in the skeletal muscle. Cell Death Differ. 2016;23:927–37. https://doi.org/10.1038/cdd.2015.171.
Neisig A, Melief CJ, Neefjes J. Reduced cell surface expression of HLA-C molecules correlates with restricted peptide binding and stable TAP interaction. J Immunol. 1998;160:171–9.
Kulkarni S, Savan R, Qi Y, Gao X, Yuki Y, Bass SE, et al. Differential microRNA regulation of HLA-C expression and its association with HIV control. Nature. 2011;472:495–8. https://doi.org/10.1038/nature09914.
Schaefer MR, Williams M, Kulpa DA, Blakely PK, Yaffee AQ, Collins KL. A novel trafficking signal within the HLA-C cytoplasmic tail allows regulated expression upon differentiation of macrophages. J Immunol. 2008;180:7804–17. https://doi.org/10.4049/jimmunol.180.12.7804.
Blasco-Algora S, Masegosa-Ataz J, Gutiérrez-García ML, Alonso-López S, Fernández-Rodríguez CM. Acute-on-chronic liver failure: pathogenesis, prognostic factors and management. World J Gastroenterol. 2015;21:12125–40. https://doi.org/10.3748/wjg.v21.i42.12125.
Cifuentes-Diaz C, Delaporte C, Dautréaux B, Charron D, Fardeau M. Class II MHC antigens in normal human skeletal muscle. Muscle Nerve. 1992;15:295–302. https://doi.org/10.1002/mus.880150307.
Goebels N, Michaelis D, Wekerle H, Hohlfeld R. Human myoblasts as antigen-presenting cells. J Immunol. 1992;149:661–7.
Rajagopalan S, Long EO. KIR2DL4 (CD158d): an activation receptor for HLA-G. Front Immunol. 2012;3:258. https://doi.org/10.3389/fimmu.2012.00258.
Yao G-D, Shu Y-M, Shi S-L, Peng Z-F, Song W-Y, Jin H-X, et al. Expression and potential roles of HLA-G in human spermatogenesis and early embryonic development. PLoS One. 2014;9:e92889. https://doi.org/10.1371/journal.pone.0092889.
Cirulli V, Zalatan J, McMaster M, Prinsen R, Salomon DR, Ricordi C, et al. The class I HLA repertoire of pancreatic islets comprises the nonclassical class Ib antigen HLA-G. Diabetes. 2006;55:1214–22.
Pangault C, Le Friec G, Caulet-Maugendre S, Lena H, Amiot L, Guilloux V, et al. Lung macrophages and dendritic cells express HLA-G molecules in pulmonary diseases. Hum Immunol. 2002;63:83–90.
Rouas-Freiss N, Moreau P, Lemaoult J, Carosella ED. The dual role of HLA-G in Cancer. J Immunol Res. 2014;2014:1–10. https://doi.org/10.1155/2014/359748.
Pietra G, Romagnani C, Manzini C, Moretta L, Mingari MC. The emerging role of HLA-E-restricted CD8+ T lymphocytes in the adaptive immune response to pathogens and tumors. J Biomed Biotechnol. 2010;2010:907092. https://doi.org/10.1155/2010/907092.
Koller BH, Geraghty DE, Shimizu Y, DeMars R, Orr HT. HLA-E. A novel HLA class I gene expressed in resting T lymphocytes. J Immunol. 1988;141:897–904.
Lee N, Llano M, Carretero M, Ishitani A, Navarro F, López-Botet M, Geraghty DE. HLA-E is a major ligand for the natural killer inhibitory receptor CD94/NKG2A. Proc Natl Acad Sci U S A. 1998;95:5199–204.
Lin A, Zhang X, Ruan Y-Y, Wang Q, Zhou W-J, Yan W-H. HLA-F expression is a prognostic factor in patients with non-small-cell lung cancer. Lung Cancer. 2011;74:504–9. https://doi.org/10.1016/j.lungcan.2011.04.006.
Zhang J-G, Zhang X, Lin A, Yan W-H. Lesion HLA-F expression is irrelevant to prognosis for patients with gastric cancer. Hum Immunol. 2013;74:828–32. https://doi.org/10.1016/j.humimm.2013.03.002.
Gobin SJ, van den Elsen PJ. Transcriptional regulation of the MHC class Ib genes HLA-E, HLA-F, and HLA-G. Hum Immunol. 2000;61:1102–7.
LeibundGut-Landmann S, Waldburger J-M, Krawczyk M, Otten LA, Suter T, Fontana A, et al. Mini-review: specificity and expression of CIITA, the master regulator of MHC class II genes. Eur J Immunol. 2004;34:1513–25. https://doi.org/10.1002/eji.200424964.
Kobayashi KS, van den Elsen PJ. NLRC5: a key regulator of MHC class I-dependent immune responses. Nat Rev Immunol. 2012;12:813–20. https://doi.org/10.1038/nri3339.
Coupel S, Moreau A, Hamidou M, Horejsi V, Soulillou J-P, Charreau B. Expression and release of soluble HLA-E is an immunoregulatory feature of endothelial cell activation. Blood. 2007;109:2806–14. https://doi.org/10.1182/blood-2006-06-030213.
Prigione I, Penco F, Martini A, Gattorno M, Pistoia V, Morandi F. HLA-G and HLA-E in patients with juvenile idiopathic arthritis. Rheumatology (Oxford). 2011;50:966–72. https://doi.org/10.1093/rheumatology/keq418.
Chiesa MD, Vitale M, Carlomagno S, Ferlazzo G, Moretta L, Moretta A. The natural killer cell-mediated killing of autologous dendritic cells is confined to a cell subset expressing CD94/NKG2A, but lacking inhibitory killer Ig-like receptors. Eur J Immunol. 2003;33:1657–66. https://doi.org/10.1002/eji.200323986.
Lee N, Ishitani A, Geraghty DE. HLA-F is a surface marker on activated lymphocytes. Eur J Immunol. 2010;40:2308–18.
Focosi D, Zucca A, Scatena F. The role of anti-HLA antibodies in hematopoietic stem cell transplantation. Biol Blood Marrow Transplant. 2011;17:1585–8. https://doi.org/10.1016/j.bbmt.2011.06.004.
Wright KL. Coordinate regulation of the human TAP1 and LMP2 genes from a shared bidirectional promoter. J Exp Med. 1995;181:1459–71. https://doi.org/10.1084/jem.181.4.1459.
Torikai H, Reik A, Soldner F, Warren EH, Yuen C, Zhou Y, et al. Toward eliminating HLA class I expression to generate universal cells from allogeneic donors. Blood. 2013;122:1341–9. https://doi.org/10.1182/blood-2013-03-478255.
Reimers MS, Engels CC, Putter H, Morreau H, Liefers GJ, van de Velde CJ, PJK K. Prognostic value of HLA class I, HLA-E, HLA-G and Tregs in rectal cancer: a retrospective cohort study. BMC Cancer. 2014;14:486. https://doi.org/10.1186/1471-2407-14-486.
Saliba A-E, Westermann AJ, Gorski SA, Vogel J. Single-cell RNA-seq: advances and future challenges. Nucleic Acids Res. 2014;42:8845–60. https://doi.org/10.1093/nar/gku555.
Li M, Jia C, Kazmierkiewicz KL, Bowman AS, Tian L, Liu Y, et al. Comprehensive analysis of gene expression in human retina and supporting tissues. Hum Mol Genet. 2014;23:4001–14. https://doi.org/10.1093/hmg/ddu114.
Farkas MH, Grant GR, White JA, Sousa ME, Consugar MB, Pierce EA. Transcriptome analyses of the human retina identify unprecedented transcript diversity and 3.5 Mb of novel transcribed sequence via significant alternative splicing and novel genes. BMC Genomics. 2013;14:486. https://doi.org/10.1186/1471-2164-14-486.
We thank Lars Nagel, head of the cloud group at the University of Mainz (ZDV Data Center) for website hosting; Ludmila Schemarow, Thorsten Litzenberger (TRON) and André Brinkman, Tim Süß and Markus Tacke (ZDV data center) for the high performance computing infrastructure. Alexandra G Orlandini von Niessen (TRON) drafted the human body in Fig. 1. We thank all researchers and consortia for providing the raw sequencing data and all the donors and their families. We would like to thank the three reviewers - for their valuable comments that helped to improve the manuscript substantially.
Funding was provided by the Johannes Gutenberg-University in the context of the PhD thesis of SB and by TRON gGmbH.
Availability of data and materials
Ethics approval and consent to participate
Consent for publication
US is co-founder and employee of BioNTech AG (Mainz, Germany). JCC is now employed by Agenus UK Limited. These companies are developing immunotherapies. The remaining authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table S1. List of tissues and cell types. Table S2. Median HLA expression values of the GTEx and HPA samples (for Fig. 1, Additional file 2: Figure S1). Table S3. Correlation analysis of HLA expression and APM components (normalized according to RPKM). Table S4. Correlation analysis of HLA expression and APM components (normalized according to TPM). Table S5. Median HLA expression of immune cell populations. Table S6. Median HLA expression in stem and progenitor cells. Table S7. Differential expression analysis of brain samples (high HLA vs. Low HLA expression) for Additional file 2: Figure S2. Table S8. Median TAP1 and TAP2 expression values of the GTEx and HPA samples (for Additional file 2: Figure S8) (XLSX 2820 kb)
Figure S1. HLA Class I (A) and II (B) expression body map including anatomical substructures. Figure S2. Immunological characterization of brain samples. Figure S3. Locus specific expression of classical HLA class I (A) and class II (B) genes across all examined tissues. Figure S4. Expression body map of the non-classical HLA Class I genes. (A) HLA-E transcripts are ubiquitously expressed in all tissues. Figure S5. Expression of constitutive proteasome versus immunoproteasome normalized with TPM. Figure S6. Correlation analysis of CIITA and HLA expression. Figure S7. Expression of constitutive proteasome, Immunoproteasome and its inducing cytokines. Figure S8. Expression of TAP1, TAP2 and PSMB9. (PPTX 5223 kb)