- Open Access
A systematic analysis of a broadly neutralizing antibody AR3C epitopes on Hepatitis C virus E2 envelope glycoprotein and their cross-reactivity
BMC Medical Genomics volume 8, Article number: S6 (2015)
Hepatitis C virus (HCV) belongs to the Flaviviridae family of viruses. HCV represents a major challenge to public health, with an estimated global prevalence of 2.8% of the world's human population. The design and development of an HCV vaccine have been hampered by the rapid evolution of viral quasispecies, which results in antibody escape variants. The HCV envelope glycoproteins E1 and E2, which mediate fusion and entry of the virus into host cells, are primary targets of the host immune response.
Structural characterization of the E2 core protein in complex with the broadly neutralizing antibody AR3C, together with E1E2 sequence information, enabled the analysis of B-cell epitope variability. The AR3C binding site on E2 and its surrounding area were identified from the crystal structure of the E2c-AR3C complex. We clustered HCV strains using the concept of "discontinuous motif/peptide" and classified B-cell epitopes based on their similarity.
The assessment of antibody neutralizing coverage provides insights into potential cross-reactivity of the AR3C neutralizing antibody across a large number of HCV variants.
Hepatitis C virus (HCV) is a major cause of viral hepatitis, liver cirrhosis, and liver cancer. It was discovered in 1989 as a novel causative agent of hepatitis. HCV is a growing health concern since it affects about 2.8% of the world population and its prevalence is rising [2, 3]. Each year, there are more than 500,000 new HCV infections in Egypt, the country with the highest HCV prevalence. In the United States, more people die from HCV than from human immunodeficiency virus 1 (HIV-1) related disease. Six genotypes and multiple subtypes of HCV have been identified to date. Approximately 75% of Americans with HCV have genotype 1 of the virus (subtypes 1a or 1b), and 20-25% have genotypes 2 or 3, with small numbers of patients infected with genotypes 4, 5, or 6. Effective vaccination would provide protection against this global disease. However, the development of an HCV vaccine and the identification of broadly neutralizing antibodies have been hampered for three reasons: HCV sequences mutate rapidly, generating escape variants; non-neutralizing antibodies to HCV envelope proteins interfere with neutralizing antibodies; and there is a lack of the 3D structural information needed for vaccine development. The first crystal structure of a broadly neutralizing antibody against HCV was published in 2013.
The HCV envelope glycoproteins E1 and E2 form a heterodimer, E1E2, that facilitates virus attachment and entry into host cells and is a target for neutralizing antibodies. Recent progress in isolating and characterizing HCV-neutralizing antibodies is instrumental for vaccine discovery and design. These HCV-neutralizing antibodies were isolated from immunized mice [13–15] or from patients chronically infected with HCV [16–20]. Giang et al., using an exhaustive panning strategy, identified five distinct antigenic regions on HCV E1E2 that were recognized by 73 human monoclonal antibodies (mAbs) from an HCV immune phage-display antibody library. Many of these antibodies showed broadly neutralizing ability.
Structural characterization of HCV envelope glycoproteins is challenging because of the difficulty in obtaining homogeneous protein preparations [10, 21–23]. Recently, the crystal structure of the E2 core bound to the neutralizing antibody AR3C was solved. AR3C belongs to a group of broadly neutralizing antibodies that recognize antigenic region 3 (AR3) of the E2 protein and cross-neutralize HCV genotypes by blocking the CD81 receptor binding site.
In this study, we characterized the B-cell epitope from the E2c-AR3C structure. By mapping this B-cell epitope to HCV E2 protein sequences, all strains available in the HCV database have been catalogued and compared with the known neutralized HCV strains. We examined the B-cell epitope diversity among the HCV variants, assessed potential cross-neutralization of the broadly neutralizing antibody across all sequences, and provided suggestions for selection of representative strains for future analysis of diversity and cross-recognition of HCV neutralizing B-cell epitopes.
Materials and methods
Structures of neutralizing antibody-E2 core protein complex
HCV envelope glycoproteins E1 and E2 mediate fusion and entry into host cells and are the primary targets of the humoral immune response. The structure of the E2 core bound to a broadly neutralizing antibody was first solved at 2.65 Å resolution and deposited in the PDB database (PDB ID: 4MWF).
Sequences of E2 protein from Hepatitis C virus
All E2 envelope protein sequences of HCV strains were retrieved from the HCV database (http://hcv.lanl.gov/content/index), a database that provides annotated data about HCV sequences. We retrieved 5589 E2 sequences from the HCV database. Of these, 5340 sequences with translated protein sequences were retained in the E2 protein dataset, with 3723, 275, 995, 70, 22 and 87 sequences labeled as genotypes 1-6, respectively. The remaining 168 sequences were genotype-unclassified isolates or representatives of recombinant strains. Five of the seven neutralizing motifs studied in  were represented in this E2 dataset.
Neutralizing activity of monoclonal antibody AR3C
The comparison of mAbs binding to antigenic regions 1 (AR1), 2 (AR2), and 3 (AR3) showed that AR3-specific mAbs reacted not only with genotype 1 but also with genotype 2a, suggesting the presence of highly conserved epitopes in AR3. Table 1 shows the neutralizing activity data from that study. The mAb AR3C neutralized multiple genotypes: 1a, 1b, 2a, 2b, 4 and 5. We retrieved the E2 sequences of these isolates from GenBank.
Consistency of strain sequence numbering
All sequences in the E2 protein dataset were aligned using the MAFFT multiple alignment server. The resulting multiple sequence alignment (MSA) provided a consistent sequence numbering scheme for further analysis of all sequences.
For each validated strain (Table 1), sequence similarity to all sequences in the E2 protein dataset was assessed using a BLAST search. The sequence from the E2 protein dataset with the highest identity score was used as the reference sequence. This step also provided a consistent numbering of positions within the MSA for the validated strains.
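The position transfer described above can be sketched in a few lines. This is a minimal illustration, assuming the alignment is available as gapped strings of equal length; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
def position_to_column(aligned_seq, gap="-"):
    """Map 1-based ungapped residue positions to 0-based alignment columns."""
    mapping = {}
    pos = 0
    for col, ch in enumerate(aligned_seq):
        if ch != gap:
            pos += 1
            mapping[pos] = col
    return mapping


def transfer_positions(ref_aligned, target_aligned, ref_positions, gap="-"):
    """Return the target residues aligned to the given reference positions."""
    col_of = position_to_column(ref_aligned, gap)
    return "".join(target_aligned[col_of[p]] for p in ref_positions)


# Toy alignment (hypothetical sequences, not real E2 data).
ref_aligned = "MK-TLLNC"
target_aligned = "MKATL-NC"
transferred = transfer_positions(ref_aligned, target_aligned, [2, 3, 5, 6])
print(transferred)
```

A gap character in the returned string marks a reference position that has no counterpart in the target sequence.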
Identification of B-cell epitope and surrounding area
Using crystal structures of the antigen-antibody complex, we defined antigen-binding sites (B-cell epitopes) as described previously [28, 29]. This was done using both the residue Accessible Surface Area (ASA) and the minimum atom distance to the antibody.
a) For each residue on the antigen protein, the ASA value was calculated using Naccess software for the free antigen and for the antigen bound to the corresponding antibody. Residues $r_i$ with an ASA loss of more than 20% were designated epitope residues.
b) The majority of contacts between two interacting atoms occur at <5Å separation. The Euclidean distance was calculated between atoms $a_i$ and $a_j$ with coordinates $a_i(x_i, y_i, z_i)$ and $a_j(x_j, y_j, z_j)$ in the PDB structure data,
$$d(a_i, a_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}$$
Antigen residues $r_i$ whose minimum atom distance to the bound antibody is less than 4Å were also incorporated into the epitope. The minimum atom distance was defined as
$$d_{\min}(r_i) = \min_{a_k \in r_i,\ a_l \in Ab} d(a_k, a_l)$$
where $Ab$ denotes the set of atoms of the bound antibody.
The residues that satisfy either of these two conditions (ASA loss or the minimum distance thresholds) were considered to constitute a B-cell epitope.
To define the surrounding area, we again used a distance-based method: antigen residues whose minimum atom distance to the bound antibody is less than 6Å, and that are not B-cell epitope residues, were assigned to the surrounding area.
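The two criteria and the surrounding-area rule can be combined as in the following sketch. It assumes that per-residue ASA values have already been computed (for example with Naccess) and that ASA loss is expressed as a fraction of the free-antigen ASA; that interpretation, the function names, and the toy coordinates are assumptions made for illustration only.

```python
import math


def min_atom_distance(residue_atoms, antibody_atoms):
    """Smallest Euclidean distance between any residue atom and any antibody atom."""
    return min(math.dist(a, b) for a in residue_atoms for b in antibody_atoms)


def classify_residues(antigen, antibody_atoms, asa_free, asa_bound,
                      asa_loss_cutoff=0.20, epitope_cutoff=4.0,
                      surround_cutoff=6.0):
    """Split antigen residues into epitope and surrounding-area lists."""
    epitope, surrounding = [], []
    for res_id, atoms in antigen.items():
        d_min = min_atom_distance(atoms, antibody_atoms)
        free = asa_free[res_id]
        # Assumption: ASA loss is measured relative to the free-antigen ASA.
        loss = (free - asa_bound[res_id]) / free if free > 0 else 0.0
        if loss > asa_loss_cutoff or d_min < epitope_cutoff:
            epitope.append(res_id)
        elif d_min < surround_cutoff:
            surrounding.append(res_id)
    return epitope, surrounding


# Toy data: one atom per residue, coordinates in angstroms (hypothetical).
antigen = {"A1": [(0.0, 0.0, 0.0)], "A2": [(5.0, 0.0, 0.0)],
           "A3": [(4.5, 0.0, 0.0)], "A4": [(20.0, 0.0, 0.0)]}
antibody_atoms = [(0.0, 0.0, 3.0)]
asa_free = {"A1": 50.0, "A2": 40.0, "A3": 30.0, "A4": 10.0}
asa_bound = {"A1": 10.0, "A2": 38.0, "A3": 20.0, "A4": 10.0}

epitope, surrounding = classify_residues(antigen, antibody_atoms, asa_free, asa_bound)
print(epitope, surrounding)
```

In the toy data, A1 qualifies by distance, A3 qualifies by ASA loss alone, A2 falls in the surrounding area, and A4 is excluded.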
Extraction of discontinuous motifs (functional motifs)
Based on the BLAST results, the residue positions of the B-cell epitope and its surrounding area identified on the crystal structure were mapped onto the reference sequence and then transferred to all validated strain sequences (Figure 1). For the structure sequence and each of the validated strain sequences, the residue string at these epitope positions was recognized as a discontinuous motif. Since we do not have negative data (escape variants), all discontinuous motifs extracted from these strains were classified as neutralized motifs, that is, motifs recognized as functional in neutralizing assays.
Extraction of discontinuous peptides
The concept of the discontinuous peptide describes a virtual linear residue string that combines the residues forming a B-cell epitope even though they are not contiguous in the protein sequence. Discontinuous peptides were extracted from the E2 protein dataset. Based on the BLAST and MSA results, the residue positions of the B-cell epitope and its surrounding area were mapped onto the reference strain sequence, and then onto all sequences in the E2 protein dataset (Figure 1). Patterns of discontinuous peptides were used to catalog all strains in the dataset, and they were compared to the functional neutralized motifs. Each discontinuous peptide with a unique sequence was termed a discontinuous motif.
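As an illustration of the cataloguing step, the sketch below extracts a discontinuous peptide from each aligned sequence and counts the distinct patterns. The alignment, the epitope columns, and the set of neutralized motifs are toy values chosen for the example, not data from the study.

```python
from collections import Counter


def discontinuous_peptide(aligned_seq, columns):
    """Concatenate the residues found at the given alignment columns."""
    return "".join(aligned_seq[c] for c in columns)


def catalogue(aligned_seqs, columns):
    """Count how many sequences carry each discontinuous-peptide pattern."""
    return Counter(discontinuous_peptide(s, columns) for s in aligned_seqs.values())


# Toy alignment and epitope columns (hypothetical).
toy_alignment = {"s1": "ACDEFGH", "s2": "ACDQFGH", "s3": "ACDEFGH", "s4": "TCDEFGW"}
epitope_columns = [0, 3, 6]

patterns = catalogue(toy_alignment, epitope_columns)

# Fraction of sequences whose pattern is identical to a known neutralized motif.
neutralized_motifs = {"AEH"}
covered = sum(n for pep, n in patterns.items() if pep in neutralized_motifs)
coverage = covered / sum(patterns.values())
print(patterns, coverage)
```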
Neutralizing antibody against HCV E2c protein
The mAb AR3C is known to neutralize HCV genotypes 1, 2, 4 and 5. We analyzed the structure of mAb AR3C in complex with HCV E2c. The B-cell epitope and its surrounding area in structure 4MWF were identified (Figure 2) as described in the Materials and methods section.
Functional motifs on B-cell epitopes and its surrounding area
The positions of B-cell epitope residues were extracted and mapped to all validated strain sequences. Functional motifs were retrieved with corresponding neutralizing information. Seven distinct discontinuous motifs (identical motifs were present across different strains) were extracted from the sequences of E2 protein structure and 10 validated strains.
Discontinuous peptides derived from B-cell epitopes
The positions of epitope residues were mapped onto all sequences in the E2 protein dataset. The amino acid string representing the discontinuous peptide was extracted from each E2 protein sequence. Among all 5340 sequences in the E2 protein dataset, there were 402 different discontinuous peptides (patterns), which reflects the high sequence variability of HCV. Five discontinuous peptides identical to discontinuous motifs from neutralized strains covered 14.06% of the strain population (Figure 3A). The discontinuous peptides were further sorted according to their frequencies in the E2 protein dataset. Ranked by frequency, the 10 most frequent discontinuous peptides covered more than 50% of the strains in the dataset, and the top 25 covered nearly 80% of the total strain population (Figures 3B and 3C).
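The ranked-coverage figures above follow from a cumulative sum over pattern frequencies. A minimal sketch, using made-up counts rather than the study's data and hypothetical function names:

```python
from collections import Counter
from itertools import accumulate


def cumulative_coverage(pattern_counts):
    """Return (pattern, cumulative fraction) pairs, most frequent pattern first."""
    ranked = pattern_counts.most_common()
    total = sum(pattern_counts.values())
    running = accumulate(n for _, n in ranked)
    return [(pep, cum / total) for (pep, _), cum in zip(ranked, running)]


def patterns_needed(pattern_counts, target):
    """Smallest number of top-ranked patterns whose coverage reaches target."""
    for rank, (_, frac) in enumerate(cumulative_coverage(pattern_counts), start=1):
        if frac >= target:
            return rank
    return None


# Made-up pattern counts for illustration.
toy_counts = Counter({"AAA": 5, "AAB": 3, "ABA": 1, "BAA": 1})
print(cumulative_coverage(toy_counts))
print(patterns_needed(toy_counts, 0.85))
```

Applied to real pattern counts, `patterns_needed` would give the number of representative strains required to reach a chosen share of the dataset.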
Top-ranked discontinuous peptides and those identical to the discontinuous motifs extracted from the E2 protein dataset are listed in Table 2 along with their frequencies. The most frequent discontinuous peptide covers 754 strains, while the second most frequent covers 320 strains. There are no validation data for the three most frequent discontinuous peptides, while the discontinuous motifs ranked 4th, 6th, 11th, 12th, and 26th in the list were shown to be neutralized. The neutralization potential of the untested discontinuous peptides could be estimated by comparing their amino acid composition to that of the validated motifs. The 1st-ranked discontinuous peptide (ILNCNDSLGIALFYKCW) differs from the 4th-ranked discontinuous peptide (ILNCNDSLGLALFYRCW, a neutralized motif) at two positions: the 10th residue (L→I) and the 15th residue (R→K). Since the residues at both positions share similar chemical features, it is possible that HCV strains carrying the 1st-ranked discontinuous peptide could be neutralized by mAb AR3C. Moreover, both substituted residues occur in other validated neutralized motifs: the 26th-ranked (ILNCNDSLGIALFYSCW) and 6th-ranked (ILNCNESLGLALFYKCW) discontinuous peptides. From the reported neutralizing data, we derived the consensus sequence for the B-cell epitope, ILNCNDSLGIALFYKCW, and the experimentally verified E2 neutralizing motif, I-L-N-C-[NQ]-[DE]-S-L-G-[IL]-A-L-F-Y-[KNRS]-C-W. A potentially neutralizing motif that should be validated experimentally is [IVL]-L-[NS]-C-[NQ]-[DEA]-[ST]-[LI]-G-[ILVM]-[ATV]-L-[FILM]-Y-X-[WF] (see Additional file 1). Targeted experimentation will identify B-cell epitope changes that abolish AR3C binding as well as changes that do not have detrimental effects.
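The comparison between peptides and the verified motif can be reproduced with a short script. The sketch below converts the hyphen-separated motif notation used above into a regular expression and lists the positions at which two peptides differ; the helper names are hypothetical, and matching the motif is only a sequence-level screen, not evidence of neutralization.

```python
import re


def motif_to_regex(motif):
    """Convert a hyphen-separated motif (X = any residue) to a compiled regex."""
    parts = ["." if token == "X" else token for token in motif.split("-")]
    return re.compile("".join(parts))


def differences(a, b):
    """List (1-based position, residue in a, residue in b) where a and b differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


VALIDATED_MOTIF = "I-L-N-C-[NQ]-[DE]-S-L-G-[IL]-A-L-F-Y-[KNRS]-C-W"
pattern = motif_to_regex(VALIDATED_MOTIF)

rank1 = "ILNCNDSLGIALFYKCW"  # most frequent peptide, no validation data
rank4 = "ILNCNDSLGLALFYRCW"  # validated neutralized motif

print(bool(pattern.fullmatch(rank1)))  # True
print(differences(rank4, rank1))       # [(10, 'L', 'I'), (15, 'R', 'K')]
```

The 1st-ranked peptide falls within the verified motif because both of its substitutions are individually present in other neutralized motifs.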
WebLogo and BlockLogo were generated for all the discontinuous peptides extracted from the E2 protein dataset. Within the 17-residue B-cell epitope, most positions are well conserved, as shown in the WebLogo (Figure 4A). However, the BlockLogo shows a large number of different residue combinations, reflecting the high diversity of this binding site (Figure 4B).
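The contrast between the two views can be illustrated numerically: per-position Shannon entropy, the quantity on which sequence logos are based, can be low at every position while the number of distinct whole peptides remains high. The sketch below uses toy peptides and a hypothetical function name.

```python
import math
from collections import Counter


def column_entropy(peptides, position):
    """Shannon entropy (bits) of the residues at one 0-based peptide position."""
    counts = Counter(p[position] for p in peptides)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy peptides (hypothetical): each position has one dominant residue,
# yet three of the four peptides are distinct as whole strings.
toy_peptides = ["ACD", "ACE", "GCD", "ACD"]

entropies = [column_entropy(toy_peptides, i) for i in range(3)]
distinct_patterns = len(set(toy_peptides))
print(entropies, distinct_patterns)
```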
The neutralized motifs cover 14.06% of strain sequences in the E2 protein dataset, while the other discontinuous peptides, which cover 85.94% of the strains, lack validation data (Figure 5). Viewed by genotype, the neutralizing coverage of genotypes 1, 2 and 4 is approximately 20% (18.48%, 22.18% and 17.14%, respectively), higher than that of genotypes 3, 5 and 6. The overall known neutralized coverage of the dataset is low. Of the 402 discontinuous peptides, 379 had a complete B-cell epitope and 15 had ambiguities in sequence (residue X). Eight had a disrupted B-cell epitope (patterns 38, 65, 93, 180, 214, 285, 385, and 387, Additional file 2), most likely representing non-viable viruses.
Discontinuous peptides on B-cell epitope surrounding area
Antibody binding and neutralization may be affected by the area surrounding the B-cell epitope. Identical discontinuous peptides on the B-cell epitope alone cannot guarantee the same neutralization result. The analysis of the surrounding area aims to provide a more detailed assessment of the potential neutralizing properties of AR3C. The frequency distribution of the different discontinuous peptides on the surrounding area was similar to that found for the B-cell epitope (Figure 6). For strains that share identical discontinuous peptides on the B-cell epitope, the discontinuous peptides on the surrounding area have dominant patterns: the top 5 patterns cover as much as 50% of the strains. This result indicates that the residues that define the AR3C epitope surrounding area do not affect the B-cell epitope/antibody interaction independently of the actual B-cell epitope.
Conclusions and discussion
Hepatitis C virus, with its extreme sequence variability, is a difficult target for vaccine design. Compared to envelope glycoproteins of other viruses, such as the hemagglutinin (HA) protein of influenza virus and the E protein of dengue virus (DENV), the B-cell epitopes on the HCV E2 protein are much less conserved in their constituent residues. The mAb F10 is a broadly neutralizing antibody against the HA protein of influenza A virus. A total of 589 different patterns of discontinuous peptides on the F10 B-cell epitope were extracted from 45,812 HA sequences. The mAb 2H12, a broadly neutralizing antibody shown to neutralize serotypes DENV1, 3 and 4, has 57 different patterns of discontinuous peptides on its B-cell epitope across 4,659 dengue E protein sequences in the dengue dataset. In the current study, 5340 E2 protein sequences from HCV, a sequence set similar in size to the dengue set, generated almost an order of magnitude greater diversity: 402 different discontinuous peptides were identified at the mAb AR3C binding site.
We developed a method for cataloguing HCV strains in this study. Strains with identical discontinuous peptides at the B-cell epitope site were grouped and assumed to have similar neutralization profiles. For mAb AR3C, the discontinuous peptides on the B-cell epitope from validated strains, ranked 4th, 6th, 11th, 12th and 26th, covered 14.06% of all 5,340 strains in the E2 protein dataset. The list of discontinuous peptides and their frequencies could guide the selection of representative strains for future systematic neutralizing antibody tests. For example, the most dominant discontinuous peptides in the population should be prioritized for neutralization assays. For mAbs generated in the future, the neutralization coverage among strains carrying the most dominant discontinuous peptides could indicate how broadly neutralizing the mAb is.
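The comparison across the three viruses can be put on a common scale by normalizing the number of distinct patterns by dataset size. The sketch below uses only the counts quoted above; the normalization itself is an illustrative choice and not a measure used in the study, since the number of distinct patterns need not grow linearly with the number of sequences.

```python
def patterns_per_thousand(n_patterns, n_sequences):
    """Distinct discontinuous-peptide patterns per 1000 sequences."""
    return 1000 * n_patterns / n_sequences


# (distinct patterns, sequences) as quoted in the text.
datasets = {
    "HCV E2 / AR3C": (402, 5340),
    "Dengue E / 2H12": (57, 4659),
    "Influenza HA / F10": (589, 45812),
}

rates = {name: patterns_per_thousand(*counts) for name, counts in datasets.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} patterns per 1000 sequences")

# Ratio of HCV to dengue pattern counts for similarly sized datasets.
hcv_to_dengue = datasets["HCV E2 / AR3C"][0] / datasets["Dengue E / 2H12"][0]
print(f"HCV/dengue pattern ratio: {hcv_to_dengue:.1f}")
```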
The neutralizing motif indicates that conservative replacements at positions 430 (N→Q), 431 (D→E) and 438 (L→I) will likely not affect binding affinities sufficiently to abolish neutralization. In addition, position 446 has multiple residues observed in neutralized variants (K,N,S,R) and it appears not to affect antibody binding. By observation of common discontinuous peptides we argue that conserved positions 427 (L), 428 (N), 429 (C), 436 (G), 439 (A), 441 (L), 443 (Y), 503 (C), and 529 (W) have structural or functional significance. The positions 422, 431, 432, 433, 438, and 442 are key for the study of the diversity of B-cell epitopes and targeting the design of broadly-protective vaccines.
The results presented here are based on existing data. More comprehensive conclusions will be possible as additional neutralizing antibody structures are solved and more neutralization assays are performed. Advances in computation and biotechnology enable more comprehensive analysis in which all combinations of antibodies and antigens can be assessed in silico. The new methodology of Big Data analysis enables the analysis of diverse data types, so that protein, nucleotide, structure, and functional data can be analyzed in combination. Well-annotated data are combined with specialized analytical tools, including statistical analyses, sequence analysis, and mathematical models, to gain insights into biological processes, generate knowledge, and inform decisions about validation experiments. This study has shown that the majority of common HCV variants have not been tested in antibody neutralization studies. The knowledge of cross-neutralization is therefore incomplete, and there is an urgent need to design libraries of viruses that are representative of the majority of HCV strains. These libraries will enable systematic testing of strains against panels of antibodies and support the design of universal, broadly protective HCV vaccines.
Choo QL, et al: Isolation of a cDNA clone derived from a blood-borne non-A, non-B viral hepatitis genome. Science. 1989, 244 (4902): 359-62.
Mohd Hanafiah K, et al: Global epidemiology of hepatitis C virus infection: new estimates of age-specific antibody to HCV seroprevalence. Hepatology. 2013, 57 (4): 1333-42.
Lavanchy D: Evolving epidemiology of hepatitis C virus. Clin Microbiol Infect. 2011, 17 (2): 107-15.
Giang E, et al: Human broadly neutralizing antibodies to the envelope glycoprotein complex of hepatitis C virus. Proceedings of the National Academy of Sciences. 2012, 109 (16): 6205-6210.
Ly KN, et al: The increasing burden of mortality from viral hepatitis in the United States between 1999 and 2007. Ann Intern Med. 2012, 156 (4): 271-8.
McHutchison JG, et al: Interferon alfa-2b alone or in combination with ribavirin as initial treatment for chronic hepatitis C. Hepatitis Interventional Therapy Group. N Engl J Med. 1998, 339 (21): 1485-92.
Gal-Tanamy M, et al: In vitro selection of a neutralization-resistant hepatitis C virus escape mutant. Proc Natl Acad Sci USA. 2008, 105 (49): 19450-5.
Zhang P, et al: Hepatitis C virus epitope-specific neutralizing antibodies in Igs prepared from human plasma. Proc Natl Acad Sci USA. 2007, 104 (20): 8449-54.
Corti D, Lanzavecchia A: Broadly neutralizing antiviral antibodies. Annu Rev Immunol. 2013, 31: 705-42.
Kong L, et al: Hepatitis C virus E2 envelope glycoprotein core structure. Science. 2013, 342 (6162): 1090-4.
Dustin LB, Rice CM: Flying under the radar: the immunobiology of hepatitis C. Annu Rev Immunol. 2007, 25: 71-99.
Lindenbach BD, et al: Complete replication of hepatitis C virus in cell culture. Science. 2005, 309 (5734): 623-6.
Balazs AB, et al: Antibody-based protection against HIV infection by vectored immunoprophylaxis. Nature. 2012, 481 (7379): 81-4.
Broering TJ, et al: Identification and characterization of broadly neutralizing human monoclonal antibodies directed against the E2 envelope glycoprotein of hepatitis C virus. J Virol. 2009, 83 (23): 12473-82.
Sabo MC, et al: Neutralizing monoclonal antibodies against hepatitis C virus E2 protein bind discontinuous epitopes and inhibit infection at a postattachment step. J Virol. 2011, 85 (14): 7005-19.
Schofield DJ, et al: Human monoclonal antibodies that react with the E2 glycoprotein of hepatitis C virus and possess neutralizing activity. Hepatology. 2005, 42 (5): 1055-62.
Johansson DX, et al: Human combinatorial libraries yield rare antibodies that broadly neutralize hepatitis C virus. Proc Natl Acad Sci USA. 2007, 104 (41): 16269-74.
Law M, et al: Broadly neutralizing antibodies protect against hepatitis C virus quasispecies challenge. Nat Med. 2008, 14 (1): 25-7.
Berman HM, et al: The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1): 235-42.
Keck ZY, et al: Human monoclonal antibodies to a novel cluster of conformational epitopes on HCV E2 with resistance to neutralization escape in a genotype 2a isolate. PLoS Pathog. 2012, 8 (4): e1002653.
Michalak JP, et al: Characterization of truncated forms of hepatitis C virus glycoproteins. J Gen Virol. 1997, 78 (Pt 9): 2299-306.
Krey T, et al: The disulfide bonds in glycoprotein E2 of hepatitis C virus reveal the tertiary organization of the molecule. PLoS Pathog. 2010, 6 (2): e1000762.
Whidby J, et al: Blocking hepatitis C virus infection with recombinant form of envelope protein 2 ectodomain. J Virol. 2009, 83 (21): 11078-89.
Kuiken C, et al: The hepatitis C sequence database in Los Alamos. Nucleic Acids Res. 2008, 36 (Database): D512-6.
McKeating JA, et al: Diverse hepatitis C virus glycoproteins mediate viral infection in a CD81-dependent manner. J Virol. 2004, 78 (16): 8496-505.
Katoh K, Toh H: Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 2008, 9 (4): 286-98.
Altschul SF, et al: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-402.
Hubbard SJ, Thornton JM: Naccess. Computer Program, Department of Biochemistry and Molecular Biology. 1993, University College London, 2 (1):
Schneider TD, Stephens RM: Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990, 18 (20): 6097-100.
Olsen LR, et al: BlockLogo: Visualization of peptide and sequence motif conservation. Journal of immunological methods. 2013, 400: 37-44.
Sui J, et al: Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses. Nat Struct Mol Biol. 2009, 16 (3): 265-73.
Midgley CM, et al: Structural analysis of a dengue cross-reactive antibody complexed with envelope domain III reveals the molecular basis of cross-reactivity. J Immunol. 2012, 188 (10): 4971-9.
Sun J, et al: Landscape of neutralizing assessment of monoclonal antibodies against dengue virus. Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics. 2013, ACM
Crooks GE, et al: WebLogo: a sequence logo generator. Genome Res. 2004, 14 (6): 1188-90.
Publication charges for this article have been funded by Nazarbayev University.
This article has been published as part of BMC Medical Genomics Volume 8 Supplement 4, 2015: Joint 26th Genome Informatics Workshop and 14th International Conference on Bioinformatics: Medical genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcmedgenomics/supplements/8/S4.
The authors declare that they have no competing interests.
VB and JS designed the study. JS collected HCV data from the public database and performed the analysis. VB and JS drafted the manuscript.