A de novo 2.2 Mb recurrent 17q23.1q23.2 deletion unmasks novel putative regulatory non-coding SNVs associated with lethal lung hypoplasia and pulmonary hypertension: a case report

Background: Application of whole genome sequencing (WGS) enables identification of non-coding variants that play a phenotype-modifying role and are undetectable by exome sequencing. Recently, non-coding regulatory single nucleotide variants (SNVs) have been reported in patients with lethal lung developmental disorders (LLDDs) or congenital scoliosis with recurrent copy-number variant (CNV) deletions at 17q23.1q23.2 or 16p11.2, respectively. Case presentation: Here, we report a deceased newborn with pulmonary hypertension and pulmonary interstitial emphysema with features suggestive of pulmonary hypoplasia, resulting in respiratory failure and neonatal death soon after birth. Using array comparative genomic hybridization and WGS, we identified two heterozygous recurrent CNV deletions, ~2.2 Mb on 17q23.1q23.2, involving TBX4, and ~600 kb on 16p11.2, involving TBX6, both of which arose de novo on maternal chromosomes. In the predicted lung-specific enhancer upstream of TBX4, we detected seven novel putative regulatory non-coding SNVs that were absent in 13 control individuals with the overlapping deletions but without any structural lung anomalies. Conclusions: Our findings further support a recently reported model of complex compound inheritance of LLDD in which both non-coding and coding heterozygous TBX4 variants contribute to the lung phenotype. In addition, this is the first report of a patient with combined de novo heterozygous recurrent 17q23.1q23.2 and 16p11.2 CNV deletions.

Here, we describe a deceased newborn with neonatal pulmonary arterial hypertension (PAH) and pulmonary interstitial emphysema with features suggestive of pulmonary hypoplasia (PH), in whom molecular analyses revealed a de novo heterozygous recurrent CNV deletion on 17q23.1q23.2 with additional non-coding variants at the same locus, concomitant with a de novo heterozygous recurrent CNV deletion on 16p11.2.

Case presentation
A female patient, born at 38 weeks' gestation, was the first child of non-consanguineous parents. The pregnancy was uneventful and the amniotic fluid was noted to be of normal volume. Her birth weight was 3370 g and Apgar scores were 9 at 1 and 5 min. She was discharged to her mother but was found to be cyanotic at 5 hours of life and subsequently admitted to the neonatal intensive care unit (NICU). She required immediate intubation and ventilation. An echocardiogram showed a structurally normal heart but marked PAH. Ultrasound of the patient's brain and abdomen was within normal limits. Her condition, however, deteriorated and she died within 14 h after admission to the NICU.

Histopathological evaluation
Histopathological evaluation was performed using formalin-fixed paraffin wax-embedded tissue from postmortem lung biopsies. Samples were examined by light microscopy using routine hematoxylin and eosin (H&E), Verhoeff's van Gieson (EVG), periodic acid-Schiff-diastase (PAS-D), Perls' Prussian blue and Masson's trichrome stains. Postmortem lung biopsy of the right upper and middle lobes showed evidence of pulmonary hypertension and interstitial emphysema, with features suggestive of PH (Fig. 1a-c). On H&E staining, the general arrangement of the pulmonary arteries and veins was normal. The lung tissue, however, appeared hypoplastic, with respiratory bronchioles noted very close to the pleural surface. The pulmonary arterial vessels were thick-walled, and there was a peripheral extension of smooth muscle into some of the alveolar septa, which were widened without increased cellularity or fibrosis. The interlobular septa were edematous, and there was marked lymphatic dilation and "hanging vessels", consistent with pulmonary interstitial emphysema. The EVG, PAS-D, Perls' Prussian blue, and Masson's trichrome stains did not demonstrate any interstitial fibrosis or other abnormalities.

Molecular analyses
Samples were collected from the proband (P094, lung tissue) and her parents (blood) after obtaining written informed consent. The study protocol was approved by the Institutional Review Board for Human Subject Research at Baylor College of Medicine (H-8712).
Array comparative genomic hybridization (array CGH) was performed using the proband's DNA sample and a customized high-resolution 180 K microarray (Agilent Technologies, Santa Clara, CA, USA) with additional probes targeting genes involved in lung development, as described [31]. Whole genome sequencing (WGS) for the family trio was performed with a TruSeq Nano DNA HT Library Prep Kit (Illumina, San Diego, CA, USA) and the HiSeqX platform (Illumina) with a mean coverage depth of 30X at CloudHealth Genomics (Shanghai, China), and the data were processed according to a previously described protocol [31]. Parental origin of the identified deletions was determined using informative single nucleotide variants (SNVs) identified in the trio WGS analysis. Array CGH revealed two pathogenic de novo heterozygous recurrent CNV deletions: ~2.2 Mb on 17q23.1q23.2, involving TBX4, and ~0.6 Mb on 16p11.2, involving TBX6, both flanked by complex low-copy repeats. The probability of occurrence of these two CNV deletions in one individual is approximately 5 × 10^-10. A trio-based WGS analysis confirmed these findings (Fig. 2a, b, Additional file 1) and showed that they both arose on the maternal chromosomes (Additional file 2).
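The report does not spell out how informative SNVs were scored, so the following Python fragment is only a minimal sketch of the general logic, under stated assumptions. Within a heterozygous deletion the child carries a single remaining allele at each site. A site is treated as informative when that allele is present in one parent's genotype but not in the other's, in which case the deletion is inferred to lie on the chromosome inherited from the other parent. The function name `infer_deleted_parent` and the input layout are hypothetical and are not taken from the study.

```python
def infer_deleted_parent(sites):
    """Count informative SNVs supporting a paternal or maternal origin of a deletion.

    `sites` is an iterable of (child_allele, father_genotype, mother_genotype),
    e.g. ("A", "AA", "GG"), restricted to positions inside the deleted interval,
    where the child is hemizygous and therefore shows a single allele.
    This layout is an assumption made for illustration.
    """
    support = {"paternal": 0, "maternal": 0}
    for child_allele, father_gt, mother_gt in sites:
        in_father = child_allele in father_gt
        in_mother = child_allele in mother_gt
        if in_father and not in_mother:
            # The remaining allele can only be paternal, so the
            # deleted copy was on the maternal chromosome.
            support["maternal"] += 1
        elif in_mother and not in_father:
            # The remaining allele can only be maternal, so the
            # deleted copy was on the paternal chromosome.
            support["paternal"] += 1
        # Sites where both or neither parent carry the allele are uninformative.
    return support


example_sites = [
    ("A", "AA", "GG"),  # informative: allele seen only in the father
    ("C", "CT", "TT"),  # informative: allele seen only in the father
    ("G", "GG", "GG"),  # uninformative: both parents carry the allele
]
result = infer_deleted_parent(example_sites)
```

In practice, such counts would be taken over many sites and interpreted together with genotype quality, since a single discordant site can reflect a sequencing or calling error rather than true parental origin.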

Computational analysis
The enrichment of non-coding variants within and upstream of TBX4 was analyzed using WGS data obtained from the presented newborn and the previously described cohort of eight patients with LLDD and 17q23.1q23.2 deletion, as well as 13 control individuals with the same deletion but without any structural lung abnormalities [31]. Only variants with MAF < 10% (gnomAD r2.0.2) carried by at least two individuals with lung disease and absent in controls were considered in the analysis [31]. To test whether there is an excess of selected variants in a given region A, a Monte Carlo approach was used. We estimated the empirical distribution of the number of variants selected in the previous step that fall into randomly selected genomic intervals of a fixed size (equal to the size of region A) sampled from the 17q23.1q23.2 deletion region. The P-value was calculated by dividing the number of intervals containing the same number of variants as region A, or more, by the total number of sampled intervals.
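The Monte Carlo test described above can be sketched in Python as follows. This is an illustrative reading of the text, not the pipeline used in the study: the function name `enrichment_p_value`, the uniform sampling of interval start positions, the half-open interval convention, and the default number of samples are all assumptions.

```python
import random


def enrichment_p_value(variant_positions, region_start, region_end,
                       deletion_start, deletion_end,
                       n_samples=10_000, seed=0):
    """Empirical P-value for an excess of selected variants in region A.

    Intervals of the same size as region A are sampled uniformly from the
    deletion region (an assumption), and the P-value is the fraction of
    sampled intervals holding at least as many variants as region A.
    """
    rng = random.Random(seed)
    size = region_end - region_start

    def count_in(start):
        # Number of selected variants in the half-open interval [start, start + size)
        return sum(start <= pos < start + size for pos in variant_positions)

    observed = count_in(region_start)
    hits = 0
    for _ in range(n_samples):
        start = rng.randint(deletion_start, deletion_end - size)
        if count_in(start) >= observed:
            hits += 1
    return observed, hits / n_samples


# Toy coordinates, not real genomic positions: four variants clustered in region A.
observed, p_value = enrichment_p_value(
    variant_positions=[150, 160, 170, 180],
    region_start=100, region_end=200,
    deletion_start=0, deletion_end=100_000,
    n_samples=2_000, seed=1,
)
```

Because the P-value is a ratio of counts, its resolution is limited by the number of sampled intervals, so a larger `n_samples` is needed to resolve very small P-values.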
Analysis of a common haplotype defined by one synonymous SNV rs2289292 and two non-coding SNVs rs3809624 and rs3809627 in TBX6, previously associated with congenital scoliosis in up to 11% of Han Chinese with 16p11.2 deletion and present in 44% of Han Chinese, did not reveal its presence in our patient.

Discussion and conclusion
TBX2, TBX4, and TBX6 are members of the T-box family transcription factors that are important regulators of embryonic development in vertebrates [36]. All T-box proteins share a conserved T-box motif interacting with specific DNA sequences to repress or activate transcription [36]. T-box genes are expressed in numerous tissues in a highly specific manner and mutations or CNVs containing T-box family members have been associated with different developmental disorders [37]. One of the most well characterized syndromes involving a T-box gene is DiGeorge/Velocardiofacial/chromosome 22q11.2 deletion syndrome caused by deletion of TBX1 and characterized by congenital heart disease, immune deficiency, and developmental delay [3][4][5]. Other examples include ulnar-mammary syndrome (MIM# 181450) associated with TBX3 mutations [38] or Holt-Oram syndrome caused by TBX5 haploinsufficiency [39,40]. Interestingly, there are also reports presenting overlapping features of these two syndromes in patients with contiguous deletion of both TBX3 and TBX5 [41]. TBX2 abnormalities have been associated with a cardiovascular and skeletal developmental disorder [25,42]. Recently, we and others have described heterozygous recurrent and nonrecurrent CNV deletions on 17q23.1q23.2, involving TBX2 and TBX4, as well as de novo heterozygous missense TBX4 variants [30][31][32][33] in patients with PH and other lethal pulmonary abnormal growth conditions. PH is a group of rare lung developmental diseases histopathologically characterized by a reduction of the number and size of bronchioles and alveoli [43,44]. While PH is usually secondary to underlying disorders limiting fetal lung growth (e.g. diaphragmatic hernia, skeletal abnormalities, or oligohydramnios), primary PH (MIM# 265430) is related to an embryologic defect of lung branching morphogenesis and vasculogenesis [45,46]. The consequence of PH is severe respiratory distress and PAH, typically refractory to therapy [44].
Fig. 2 Schematic representation of 17q23.1q23.2 copy-number variant deletion region. a The 17q23.1q23.2 region (hg19) depicting the identified deletion in the presented patient with pulmonary hypoplasia. The genes mapping within the deletion, including TBX4, and complex low-copy repeats flanking the recurrent deletion are shown. b Alignment tracks showing whole genome sequencing coverage at 17q23.1q23.2 region in the father, mother, and child (upper, middle, and bottom track, respectively). c Distribution of single nucleotide variants (SNVs) in the putative lung-specific enhancer region located upstream to TBX4, identified in present subject (red) and other patients with lethal lung developmental disorders (black), are presented. Variants reported previously and also detected in present case are indicated by black dashed rectangles [31]. Chromatin state annotation track based on ChIP-seq mapping (Roadmap) in the IMR-90 cell line within the chr17:59,278,024-59,462,062 genomic region, as well as H3K27Ac, and H3K4Me1 marks found in fetal lung are shown below the SNVs track.
TBX2 and TBX4 are essential for normal development, including proper lung organogenesis [37]. Dysregulation of these genes in mice leads to a reduction of lung branching [47,48], supporting the notion that 17q23.1q23.2 CNV deletions, detected in our newborn and other patients, are causative for their lethal lung phenotypes. Although SNVs and CNVs involving TBX4 confer a risk of lung disease, the heterogeneity of clinical features associated with TBX4 abnormalities suggests that they are not sufficient to lead to specific phenotypes and that the lung phenotype cannot be explained by TBX4 haploinsufficiency alone. We previously proposed a model of complex compound inheritance of LLDD [31]. Importantly, along with TBX4 abnormalities, reported individuals with LLDDs were found to also have at least one rare or common non-coding SNV within an ~200 kb interval mapping ~70 kb upstream of TBX4 and overlapping the predicted lung-specific enhancer [49], suggesting that this second risk allele with the putative hypomorphic variants in trans may affect TBX4 and is required to cause a lethal lung disease [31]. In this region, we identified seven non-coding SNVs in our patient that are absent in 13 control subjects [31] with the same CNV deletion but without any structural lung anomalies. Notably, three of these variants, rs3785850, rs35383405, and rs143541906, were previously identified in children with LLDD, making them better candidates [31]. However, the small size of our control group may be a limitation of this study.
To date, abnormalities involving three different T-box transcription factors have not been reported. It is unclear whether the 16p11.2 deletion contributed to the patient's phenotype [14,15]. Compound inheritance of 16p11.2 CNV deletion or coding SNV involving TBX6 with the non-coding common T-C-A risk haplotype in trans has been associated with congenital vertebral malformations [22][23][24]. However, we did not find this non-coding haplotype in our patient and there was no evidence of congenital scoliosis or spondylocostal dysostosis that might have led to secondary pulmonary hypoplasia. Moreover, neither TBX6 nor any other gene mapping within the 16p11.2 CNV deletion has been associated with lung development or function in humans. These data, and the fact that 11 other patients with similar lethal lung developmental disorders and pathogenic heterozygous CNV deletions or SNVs involving TBX4 [31] did not have any clinically relevant variants involving TBX6, argue against a contribution of the 16p11.2 CNV deletion to the abnormal lung phenotype and against the two-hit hypothesis for CNVs proposed by Girirajan et al. [50].
Both CNV deletions in our patient, on 17q23.1q23.2 and 16p11.2, arose de novo on the maternal chromosomes. While the maternal origin of the 16p11.2 CNV deletion in the presented child is consistent with the finding that 89.4% of de novo 16p11.2 CNV deletions arose on the maternal chromosome 16 [51], too few de novo 17q23.1q23.2 CNV deletions have been studied to draw conclusions about their parental origin.
Multi-locus genomic variations and dual molecular diagnoses involving SNVs or CNVs have been increasingly described [52,53]. While a combination of two different SNVs is the most commonly detected in patients with dual molecular diagnosis, the combination of CNVs and SNVs or of two different CNVs has rarely been observed [52][53][54]. Examples of co-occurrence of two de novo CNVs include deletion of 22q11 and 10p14 in a patient with overlapping features of both 22q11 deletion syndrome and hypoparathyroidism, sensorineural deafness, and renal disease [55], 6q13q14.1 and 6q21q22.31 CNV deletions in a patient with Pierre Robin sequence and developmental delay [56], or recurrent CNV deletions of 7q11.23 and 22q11.2 in a patient with a unique phenotype and features specific for Williams and DiGeorge/Velocardiofacial syndromes [57]. Analysis of a large cohort of children with CNVs associated with intellectual disability and congenital abnormalities revealed the presence of a second CNV in 10.1% of studied individuals [58]. However, in the vast majority of these cases, at least one large CNV event was inherited from one of the parents [58].
In summary, we present the clinical and molecular findings in a newborn with PAH and pulmonary interstitial emphysema with features suggestive of PH, leading to respiratory failure and neonatal death on the first day of life, in whom we detected de novo 17q23.1q23.2 and 16p11.2 CNV deletions. We have identified novel candidate regulatory SNVs in the potential lung-specific enhancer region mapping upstream of TBX4, as well as three variants previously detected in children with LLDD. Our data further support the complex compound inheritance model for LLDDs due to a combination of rare coding variants involving TBX4 with rare and common non-coding variants in trans.
Additional file 1. Schematic representation of 16p11.2 copy-number variant (CNV) deletion region. A) The 16p11.2 CNV region (hg19) depicting the identified deletion in the presented patient with pulmonary hypoplasia. The genes mapping within the deletion and complex low-copy repeats flanking the recurrent deletion are shown. B) Alignment tracks showing whole genome sequencing coverage at 16p11.2 CNV region in the father, mother, and child (upper, middle, and bottom track, respectively).
Additional file 2. The list of single nucleotide variants used for determination of the parental origin of 16p11.2 and 17q23.2 copy-number variant deletions.
Additional file 4. Non-coding single nucleotide variants in the lung-specific enhancer region, identified in newborns with 17q23.1q23.2 copy-number variant deletion or TBX4 mutation and lethal lung disease and absent in the control individuals with the same deletion but without lung abnormalities.