Dealing With Unspecific Clinical Phenotypes in Molecular Autopsy: HPO-Driven Whole Exome Sequencing Analysis Versus Gene Panel Testing

Ulrike Schoen, MGZ: Medizinisch Genetisches Zentrum
Anna Holzer, Institute of Legal Medicine, LMU Munich
Andreas Laner, MGZ: Medizinisch Genetisches Zentrum
Stephanie Kleinle, MGZ: Medizinisch Genetisches Zentrum
Florentine Scharf, MGZ: Medizinisch Genetisches Zentrum
Anna Benet-Pagès, MGZ-Medical Genetics Center Munich
Oliver Peschel, Institute of Legal Medicine, LMU Munich
Elke Holinski-Feder, MGZ: Medizinisch Genetisches Zentrum
Isabel Diebold (isabel.diebold@mgz-muenchen.de), MGZ: Medizinisch Genetisches Zentrum, https://orcid.org/0000-0002-1753-563X


Introduction
Sudden death (SD) of apparently healthy individuals is among the most challenging scenarios in clinical medicine. Sudden cardiac death (SCD) is the predominant cause of SD, with structural cardiovascular abnormalities often evident at autopsy. However, 10-30% of SD cases remain unexplained by conventional forensic autopsy procedures (so-called sudden unexplained death, SUD). One-fourth of these autopsy-negative SUD cases harbor an underlying pathogenic variant. Over 100 SD-predisposing cardiac channelopathy-, cardiomyopathy-, and metabolic disorder-susceptibility genes have been identified. Thus, molecular autopsy (post-mortem genetic testing) by high-throughput sequencing (HTS) technology represents an efficient tool to establish these diagnoses (1)(2)(3). The importance of molecular autopsy lies in its ability to identify pathogenic variants, thereby enabling risk prediction in asymptomatic relatives. However, identification of a causative variant in an individual who did not present with a specific clinical phenotype before the SUD is still challenging.
Since most of the reported likely causal variants were found in genes associated with cardiac disease, a fixed panel-based approach with a limited number of clinically well-defined genes is commonly used for identifying the genetic causes of SUD (4)(5)(6)(7). Nevertheless, the overall diagnostic yield of a fixed gene panel is limited. In comparison, whole exome sequencing (WES) has the diagnostic power to identify potentially pathogenic variants in rare causative genes that have not been associated with SUD before, thereby elucidating novel pathomechanisms (8). For this reason, WES is increasingly used in clinical settings and represents the primary alternative to gene panel testing. However, data interpretation remains challenging because of a high incidence of variants of unknown significance (VUS) and the possible false assignment of variant pathogenicity. As both WES and targeted panel sequencing yield accurate genetic diagnoses, clinicians are faced with the challenge of deciding which method to use. To improve variant interpretation in WES, the Human Phenotype Ontology (HPO) was developed as a semantically computable, internationally standardized vocabulary to capture phenotypic abnormalities in humans (9). The number of HPO terms has grown substantially since the clinical integration and use of this ontology was established (10). Another advantage of phenotype-driven filtering is that recently identified genes are automatically associated with the corresponding HPO term in the HPO database. In comparison, the design of each targeted gene panel needs to be curated over time. Groza and coworkers developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an interactive procedure to optimize precision and recall of the identified terms (11).
Here, we performed WES in tissue samples of 16 individuals with SUD after autopsy and provide a practical guide for filtering and prioritizing genetic variants by a specific set of HPO terms (explaining a SUD), thus creating a "virtual panel" instead of using a fixed-panel approach.

Samples and preparation
Autopsies on 16 SUD cases (9 adults, 23-53 years, and 7 infants, 4 weeks to 9 months) were performed by forensic pathologists and included general autopsy investigations, toxicology and histology. Cases were included if no specific cause of SD could be established at the medicolegal investigation. DNA samples for WES were extracted from post-mortem liver and/or heart tissue. On request, we were informed by the ethics committee that a vote was not needed, as all investigations were made after the release of confiscated tissues by the public prosecution office and after complete anonymization. Due to anonymization, cosegregation analyses of the variants were not performed.

High throughput sequencing and bioinformatics pipeline
Next-generation sequencing (NGS) with a custom capture kit (Agilent SureSelectXT) was carried out on an Illumina NextSeq 500 system (Illumina, San Diego, CA) using v2.0 SBS chemistry. Sequencing reads were aligned to the human reference genome (GRCh37/hg19) using BWA (v0.7.13-r1126). SNV, CNV and INDEL calling on the genes was conducted using the varvis software platform (varvis™, Limbus Technologies GmbH, Rostock), with subsequent coverage- and quality-dependent filter steps.

Human Phenotype Ontology (HPO) structure and selection of terms to create a virtual gene panel
The HPO currently contains over 13,000 terms. Most ontologies are structured as directed acyclic graphs, which are similar to hierarchies but differ in that a more specialized term can be related to more than one less specialized term. The HPO terms used for variant filtering in our study were selected with the goal of covering phenotypic abnormalities that explain an unexpected sudden natural death. We selected the HPO term "arrhythmia" (HP:0011675, associated with 356 genes), which belongs to the subclass "abnormality of cardiovascular system electrophysiology". We added the HPO term "sudden cardiac death" (HP:0001645, associated with 72 genes), which belongs to the category "cardiac arrest". Since SUD is a fatal complication of seizures without recovery (12), we added the specific HPO term "status epilepticus" (HP:0002133, associated with 131 genes), which belongs to the category "seizure". Since a lack of breathing may result in SD, we selected the HPO term "apnea" (HP:0002104, associated with 266 genes) from the category "Abnormal pattern of respiration". Taken together, all cases of the study were annotated with the following set of HPO terms: arrhythmia, sudden cardiac death, status epilepticus and apnea. Overall, 672 different genes were associated with the selected HPO terms, thus creating an HPO-driven "virtual gene panel". HPO project data are available at http://www.human-phenotype-ontology.org (release: August 2020).
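The per-term gene counts given above sum to 825 (356 + 72 + 131 + 266), whereas the virtual panel contains 672 different genes, so some genes are annotated with more than one of the selected terms. The panel can therefore be read as the set union of the per-term gene lists. The following Python sketch illustrates this reading only. It is not the pipeline used in this study: the gene sets are tiny stand-ins, `GENE_A` to `GENE_C` are hypothetical placeholders, and the function names are our own. Only the DSG2 and UPB1 annotations follow the matches reported in this study.

```python
from typing import Dict, List, Set

# HPO term IDs as given in the text. The gene sets are illustrative stand-ins,
# NOT the real annotations (356, 72, 131 and 266 genes, respectively).
# DSG2 and UPB1 follow the HPO matches reported in this study;
# GENE_A, GENE_B and GENE_C are hypothetical placeholders.
TERM_TO_GENES: Dict[str, Set[str]] = {
    "HP:0011675": {"GENE_A", "GENE_B"},  # arrhythmia
    "HP:0001645": {"DSG2", "GENE_A"},    # sudden cardiac death
    "HP:0002133": {"UPB1", "GENE_C"},    # status epilepticus
    "HP:0002104": {"GENE_B", "GENE_C"},  # apnea
}


def build_virtual_panel(term_to_genes: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of all genes associated with the selected terms."""
    panel: Set[str] = set()
    for genes in term_to_genes.values():
        panel |= genes  # a gene shared by several terms is counted once
    return panel


def matching_terms(gene: str, term_to_genes: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted HPO term IDs whose gene list contains the gene."""
    return sorted(term for term, genes in term_to_genes.items() if gene in genes)


panel = build_virtual_panel(TERM_TO_GENES)
summed = sum(len(genes) for genes in TERM_TO_GENES.values())
print(len(panel), "unique genes from", summed, "term-gene associations")
print(matching_terms("GENE_A", TERM_TO_GENES))
```

With real annotation data, the same union over the four selected terms would be expected to yield the 672 genes reported above, and `matching_terms` would recover which term(s) a candidate gene was matched by.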

Nomenclature, interpretation and classification of genetic variants
The nomenclature guidelines of the Human Genome Variation Society (HGVS) were used to annotate DNA sequence variants (13). The functional consequence of missense variants was interpreted with the amino acid (AA) substitution effect prediction methods SIFT (Sorting Intolerant From Tolerant; [...] (Table 1). Four variants were associated with the HPO term "arrhythmia", seven with the HPO term "sudden cardiac death", two with the HPO term "status epilepticus" and one with the HPO term "apnea" (Table 1).
Interestingly, the majority of potentially causative variants were identified in infants. Six of seven infants and three of nine adults carried at least one potentially causative variant. Eleven potentially causative variants were identified in 16 individuals with unclear sudden death after autopsy. Stringent filtering in combination with HPO annotation resulted in eleven candidate variants, three of which have not been identified before (Table 1). Nine were missense variants and two were splice variants. Both splice variants were classified as likely pathogenic. The splice variant c.81+1G>C was found in the heterozygous state in DSG2, which was annotated by the HPO term "sudden cardiac death". The variant was identified in an eight-month-old infant and is not listed in population-specific databases, but a nucleotide change from G to T at the same position is listed in gnomAD (rs1237620145, gnomAD MAF: 0.003%). The rare truncating variant is located within the second exon of DSG2 and is predicted to cause a splice donor malfunction. The other splice variant, c.917-1G>A, was found in the homozygous state in UPB1, annotated by the HPO term "status epilepticus", in a three-month-old infant. The variant is listed in population-specific databases (rs143493067, gnomAD MAF: 0.18%) and is classified as likely pathogenic/pathogenic in LOVD and with "conflicting interpretation" of pathogenicity in ClinVar.
The nine missense variants were classified as VUS (Table 1). Four variants in adults and four variants in children were located in genes that have previously been reported to be associated with cardiac channelopathies and cardiomyopathies, respectively. One variant was identified in SCN4A in a four-week-old infant and another in SCN8A in a three-month-old infant. Two individuals carried two VUS in different genes (Table 1).

Discussion
A key challenge in using WES in molecular autopsy is finding the true causal variant among hundreds of rare variants. By filtering for genes known to be associated with a particular HPO term, we shift the analysis focus from the entire exome to that part of the exome that is clinically interpretable in a diagnostic setting. Instead of using a fixed panel-based approach, we designed an HPO-driven virtual gene panel, with the advantage that recently identified genes are automatically associated with HPO terms in the HPO database, and developed an algorithm that prioritized 1.4% of the variants through several filtering steps.
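The idea of narrowing the exome to an HPO-derived gene set and then applying further filters can be sketched in a few lines of Python. The sketch below is a loose illustration under stated assumptions, not the algorithm used in this study: the `Variant` record, the `prioritize` function, the 1% frequency cut-off and the accepted consequence classes are all hypothetical choices made for the example. The two splice variants and their gnomAD frequencies are taken from the Results; the other two records and the four-gene panel are invented placeholders.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Variant:
    gene: str
    hgvs: str
    population_maf: Optional[float]  # None: not listed in population databases
    consequence: str


def prioritize(
    variants: Iterable[Variant],
    panel: Set[str],
    max_maf: float = 0.01,  # hypothetical cut-off, not taken from the study
    consequences: FrozenSet[str] = frozenset({"missense", "splice"}),
) -> List[Variant]:
    """Keep rare variants of the accepted classes in virtual-panel genes."""
    kept: List[Variant] = []
    for v in variants:
        if v.gene not in panel:
            continue  # gene not linked to any selected HPO term
        if v.population_maf is not None and v.population_maf > max_maf:
            continue  # too common in the general population
        if v.consequence not in consequences:
            continue  # consequence class not considered in this sketch
        kept.append(v)
    return kept


# Illustrative panel: DSG2 and UPB1 are from the study, GENE_X is hypothetical.
panel = {"DSG2", "UPB1", "GENE_X"}

calls = [
    Variant("DSG2", "c.81+1G>C", None, "splice"),        # from the Results
    Variant("UPB1", "c.917-1G>A", 0.0018, "splice"),     # gnomAD MAF 0.18%
    Variant("GENE_X", "c.100A>G", 0.20, "missense"),     # hypothetical, common
    Variant("GENE_Z", "c.5C>T", None, "missense"),       # hypothetical, off-panel
]

kept = prioritize(calls, panel)
fraction = len(kept) / len(calls)
print([v.hgvs for v in kept], f"{fraction:.0%} of calls prioritized")
```

In this toy input, two of four calls pass. On a real exome the same kind of filter chain would retain a far smaller share of calls, in line with the 1.4% reported above, but the exact steps and thresholds would have to be taken from the actual pipeline.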
The two likely pathogenic variants found in our study were detected in children (< 12 months). Genetic studies in sudden infant death syndrome (SIDS) cohorts collectively suggest that up to 15-20% of SIDS cases might be explained by inherited cardiac diseases not detectable during conventional forensic autopsy investigations (15)(16)(17). However, our data further highlight that interpretation of putative pathogenic variants in SIDS is challenging.
The homozygous variant in UPB1 was annotated by the HPO term "status epilepticus" and has recently been reported to trigger seizures due to β-ureidopropionase (UPB) deficiency in a recessive mode of inheritance (18). Assmann et al. reported the same variant, also in the homozygous state, in a four-month-old boy with an acute life-threatening event (ALTE) with febrile status epilepticus (19). The extent of the reduction in enzyme activity caused by a particular UPB1 variant, along with other genetic and environmental factors, may determine whether people with UPB deficiency develop neurological problems and how severe these problems are. Therefore, in many affected individuals with absent or mild neurological problems, the condition may never be diagnosed, which may explain why the variant identified here has been found in the homozygous state in one of 141,426 genomes from unrelated individuals. Importantly, epileptic seizures can induce malignant arrhythmias, possibly due to seizure-related effects on the autonomic nervous system (20). However, the homozygous likely pathogenic variant in UPB1, recently associated with status epilepticus, has not been linked to SD before. Thus, a fixed gene panel-based approach consisting of well-known genes linked to SD would have missed the variant in UPB1. In comparison, the HPO-driven virtual panel is a flexible system that does not have to be adjusted over time as new genes are added.
The second likely pathogenic variant was detected in DSG2 in an eight-month-old girl. Pathogenic variants in DSG2 are associated with arrhythmogenic right ventricular cardiomyopathy (ARVC), a disease that predominantly affects adults in the 4th or 5th decade of life. If ARVC is diagnosed in the infantile stage, there should be clearly identifiable morphologic changes of the heart (fibrosis, dilation, fatty infiltration) before death occurs. Nevertheless, another study identified variants in DSG2 associated with SUD in infants (21), indicating that interpretation of variants in the context of the age of the individual at the SUD event is challenging.
Besides the two likely pathogenic variants, we identified nine VUS. The majority of the VUS were identified in genes previously reported to be associated with cardiac channelopathies (SCN5A, AKAP9, RYR2) and cardiomyopathies (RBM20, RAF1) (2)(22)(23)(24). One variant was identified in SCN4A in a four-week-old infant. SCN4A variants are described as a cause of autosomal-dominant myotonia and periodic paralysis (25). Affected members developed in utero- or neonatal-onset muscle weakness of variable severity. In seven cases, severe muscle weakness resulted in death during the third trimester or shortly after birth (26). Interestingly, variants in SCN4A have also been reported in patients with a clinical diagnosis of Brugada syndrome, a primary arrhythmia syndrome (27). Another potentially causative variant was identified in SCN8A in a three-month-old infant. Pathogenic variants in SCN8A have been associated with a wide spectrum of epilepsy phenotypes, ranging from benign familial infantile seizures to epileptic encephalopathies with variable severity (28). Currently, there are no forensic guidelines on the management and interpretation of VUS. Grassi et al. recently discussed the main elements and issues that differentiate the forensic management of cases in which VUS are found (29). Our data highlight that HPO-based filtering could be used as a complementary approach, in particular to prioritize VUS by HPO matches. Before one of the candidate variants can be defined as a "causative variant", further investigations (e.g., co-segregation analyses, functional studies) are needed. To date, many studies that used HTS have identified putatively pathogenic variants in molecular autopsy, but only a small number performed co-segregation analysis. Due to complete anonymization, co-segregation analyses of the variants could not be performed in our study.
Campuzano et al. demonstrated the value of co-segregation in SUD (30). The presence of rare variants in asymptomatic family members aided the exclusion of some variants as being causative of the SUD. Glengarry and co-workers reported that co-segregation studies are challenging to perform, especially if the proband is an infant, due to difficulties in tracking families once a pathogenic variant which explains SD is found (31).
Our data further highlight that phenotype and genotype data should be used in conjunction to prioritize variants for further evaluation, which may increase the overall solve rate, especially in cases without specific clinical phenotypes such as SD. In particular, HPO provides a structured, comprehensive, international standard that could be used for developing algorithms and computational tools for clinical differential diagnostics in SUD.

Conclusion
Molecular autopsy should be included in forensic protocols when no conclusive cause of death is identified. Prioritization of variants by a specific set of HPO terms could be used as a complementary approach to reach a diagnosis in molecular autopsy. Identification of causative variants in molecular autopsy of SUD can allow prevention of SD in relatives.

Declarations
Ethics approval and consent to participate: On request, we were informed by the ethics committee that a vote was not needed, as all investigations were made after the release of confiscated tissues by the public prosecution office and after complete anonymization. Due to anonymization, co-segregation analyses of the variants were not performed.
Consent for publication: Not applicable.
Availability of data and materials: The variants classified during the current study are available at https://databases.lovd.nl/shared/variants.
Conflicts of interest/Competing interests: The authors declare no conflict of interest/competing interests.
Funding: The study was supported by Agilent (materials for exome sequencing).