Enrichment of ultraconserved elements among genomic imbalances causing mental delay and congenital anomalies
© Martínez et al. 2010
Received: 13 December 2009
Accepted: 23 November 2010
Published: 23 November 2010
Ultraconserved elements (UCEs) are defined as stretches of at least 200 base pairs of human DNA that match identically with corresponding regions in the mouse and rat genomes, although their real significance remains an intriguing issue. These elements are most often located either overlapping exons in genes involved in RNA processing, or in introns of, or near, genes involved in the regulation of transcription and development. Interestingly, human UCEs have been reported to be strongly depleted among segmental duplications and benign copy number variants (CNVs). However, no comprehensive survey of a putative enrichment of these elements among pathogenic dose variants has yet been reported.
A survey for UCEs was performed among the 26 cryptic genomic rearrangements detected in our series of 200 patients with idiopathic neurodevelopmental disorders associated with congenital anomalies. A total of 29 elements, out of the 481 described UCEs, were contained in 13 of the 26 pathogenic gains or losses detected in our series, which represents a highly significant enrichment of ultraconserved elements. In addition, we show that these elements are preferentially found in pathogenic deletions (enrichment ratio 3.6 vs. 0.5 in duplications), and that this association is not related to a higher gene content. In contrast, pathogenic CNVs lacking UCEs showed an almost threefold higher gene content.
We propose that these elements may be interpreted as hallmarks of dose-sensitive genes, particularly of those genes whose gain or loss may be directly implicated in neurodevelopmental disorders. Therefore, their presence in genomic imbalances of unknown effect might be suggestive of a clinically relevant condition.
A current challenge of medical genetics is to disentangle the relationship between the genomic data provided by array comparative genomic hybridization (array-CGH) and the phenotypic consequences of the gains and losses observed along the whole genome. The classical criterion of assigning a pathogenic condition to any de novo alteration cannot always be applied, as exceptions in both directions can occur: benign copy number variants (CNVs) can arise de novo (unpublished results), while some pathogenic alterations are associated with such a broad phenotypic spectrum that they may have been inherited from apparently healthy parents. Comparative genomics may provide an invaluable tool for differentiating benign from pathogenic CNVs or, in other words, for evaluating which genes may be dose-sensitive. In this context, the real significance of the ultraconserved elements (UCEs) in the human genome remains an intriguing issue. There are 481 UCEs, defined as stretches of at least 200 base pairs of human DNA that match identically with corresponding regions in the mouse and rat genomes. They are widely distributed in the genome (on all the chromosomes except chromosomes 21 and Y) and are often found in clusters. For unknown reasons, these regions are under a negative selection much stronger than that operating on coding sequences and have been evolutionarily conserved for 300 million years, since before mammal and bird ancestors diverged. Of the 481 ultraconserved elements, 111 overlap the mRNA of a known gene, 256 show no evidence of transcription, and for the remaining 114 the evidence for transcription is inconclusive. These elements are most often located either overlapping exons in genes involved in RNA processing, or in introns of, or near, genes involved in the regulation of transcription and development.
Exonic elements are frequently found in genes that are post-transcriptionally regulated by alternative splicing of exons with premature stop codons. Accordingly, the extreme genomic conservation has been associated with regulatory splicing events that maintain tightly regulated levels of RNA-binding proteins. On the other hand, intergenic elements are frequently flanked by developmental genes, in particular genes involved in early developmental tasks, suggesting that many of the associated ultraconserved elements may be distal enhancers of these early developmental genes.
Functional studies suggested that these elements show tissue-specific in vivo enhancer activity in a mouse transgenic reporter assay, which tended to recapitulate aspects of the expression pattern of the genes in their proximity. On the other hand, the removal of four of these UCEs, located near genes that exhibit marked phenotypes in murine models, failed to reveal any overt variation in growth, longevity, pathology or metabolism. As the authors concluded, these results indicate that extreme sequence constraint does not necessarily reflect crucial functions required for viability, although not every possible phenotypic impact was evaluated.
Interestingly, human UCEs have been reported to be strongly depleted among segmental duplications and benign copy number variants. However, no comprehensive survey of a putative enrichment of these elements among pathogenic dose variants, and more specifically among those rearrangements directly related to neurodevelopmental disorders, has yet been attempted.
[Table: Pathogenic imbalances and UCEs. Table body not recovered. Surviving column headers: Initial pos. (Mb); Final pos. (Mb). Surviving UCE entries: uc.194, uc.195-uc.200, uc.201, uc.202.]
[Table: Analysis of frequencies of UCEs. Table body not recovered. Surviving column header: % of genome.]
The parental origin of the chromosome bearing the alteration could be determined in 21 cases through microsatellite segregation analyses. Ten rearrangements occurred as de novo events in the paternal chromosome and four in the maternal chromosome, while the remaining alterations were inherited from the carrier mother (five cases) or from the father (two cases). These inherited CNVs were X-linked or associated with a mild phenotype in the carrier parent. Remarkably, 24 out of the 29 UCEs were contained in rearrangements inherited from the father or originating in a paternally derived chromosome. However, this association might well be an indirect consequence of a higher proportion of deletions originating in the paternally derived chromosome. Both paternal and maternal deletions show a similar enrichment ratio for UCEs (3.9 and 3.0, respectively), while the paternal and maternal duplications contain a lower than expected ratio (0.9 and 0.0, respectively).
On the other hand, it is noteworthy that no element was found in our series among the 93 CNVs considered as polymorphic variants, some of them not previously reported, which altogether cover about 23.7 Mb (see Additional file 1). This absence of ultraconserved elements among benign CNVs almost reached significance (see Table 2), in accordance with the previous studies described above.
We found a highly significant enrichment of ultraconserved elements among pathogenic imbalances causing neurodevelopmental disorders. It can therefore be suggested that these elements, or their neighbouring genes, might be considered particularly sensitive to dose imbalances.
[Table: Relation between presence of UCEs and gene content. Values not recovered. Columns: No. gene annotations (RefSeq); No. gene annotations (CCDS). Rows: Pathogenic CNVs with UCEs; Pathogenic CNVs without UCEs.]
Another putative argument for a random association between pathogenic CNVs and UCEs might be derived from the fact that many UCEs appear clustered. However, we found a similar number of isolated and clustered ultraconserved elements in our series (13 and 16, respectively). In fact, the isolated elements were over-represented compared with the clustered elements. The 2.67% of the genome covered by the pathogenic CNVs contains 10% of the isolated elements (13/131) and 4.6% of the clustered elements (16/350), which correspond to enrichment ratios of 3.7 and 1.7, respectively. In any case, clustering does not modify the a priori probability of finding these elements, because every imbalanced region represents an independent 'sampling' event. In order to check the representativeness of the CNVs detected in our series, we analysed the frequencies of other sentinel elements, microRNAs/snoRNAs, which are present in the genome in numbers of a similar order of magnitude to UCEs and also tend to cluster. These elements were evenly represented in every category of CNVs, at frequencies very close to those expected (enrichment ratios near 1, see Table 2). Even the slight excess among duplications (observed/expected = 1.87) may well be interpreted as reflecting a higher gene content.
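The arithmetic behind the isolated and clustered enrichment ratios can be retraced with a short sketch. It uses only the counts quoted in the text (13/131 isolated, 16/350 clustered, 2.67% of the genome); the function and variable names are illustrative assumptions, not part of the original analysis.

```python
# Minimal sketch: enrichment ratio expressed as
# (fraction of elements found in the CNVs) / (fraction of genome covered).
# Counts are those quoted in the text; names are hypothetical.

GENOME_FRACTION = 0.0267  # 2.67% of the genome covered by pathogenic CNVs


def fraction_ratio(found, total, genome_fraction=GENOME_FRACTION):
    """Return the observed fraction of elements divided by the genome fraction."""
    return (found / total) / genome_fraction


ratios = {
    "isolated": fraction_ratio(13, 131),
    "clustered": fraction_ratio(16, 350),
}

for category, value in ratios.items():
    print(f"{category}: enrichment ratio = {value:.1f}")
```

Dividing the fraction of elements by the fraction of genome is algebraically the same as dividing the observed count by the expected count (total elements multiplied by the genome fraction), so this form agrees with the observed/expected definition given in the methods.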
To explain the lack of association of UCEs with duplications in our series, several factors (not mutually exclusive) may be put forward, such as a patient-selection bias or the fact that duplications are on average slightly larger and clearly more gene-rich than deletions. On the other hand, there is an excess of duplications in the X chromosome. A high incidence of duplications on the X chromosome contributing to mental retardation has recently been reported, which might be related to the fact that the resulting gene dosage in males is higher than with any other chromosome (a dose increase of 100% instead of 50%). This special condition might also help to explain why more genes are clinically relevant if duplicated in the X chromosome, and consequently why more genes can be potentially pathogenic by duplication even when they are not tightly regulated, for instance by ultraconserved elements.
Although our results did not reach significance (see Table 2), we have confirmed the previously described depletion of UCEs among benign CNVs. Derti and collaborators additionally found that most of the ultraconserved elements present in benign copy number variants overlapped exons. These exonic UCEs are present in many genes encoding well-known RNA-binding proteins, while intergenic UCEs are preferentially flanked by developmental genes, particularly those involved in early developmental tasks. We found that both exonic and intergenic ultraconserved elements appear to be equally represented in the CNV regions associated with disease, which argues for a dose-sensitive character of the nearby genes in either case.
In summary, we have found that pathogenic CNVs show an enrichment of ultraconserved elements, in contrast to benign CNVs. It can be argued that, since UCEs are often associated with genes involved in RNA processing and developmental tasks, these genes in particular are dosage-sensitive and hence a heterozygous deletion or duplication including such a gene will be more likely to result in disease.
In view of the association between ultraconserved elements and pathogenic dose variants, we therefore propose that these elements may be interpreted as hallmarks of dosage-sensitive genes, particularly of those genes whose gain or loss may be directly implicated in neurodevelopmental disorders. Obviously, the presence of this kind of element in CNVs of unknown consequence should be used with caution, together with other pathogenicity criteria such as the gene content or the mode of inheritance. This is an important issue given the current difficulty, in some instances, of differentiating between pathogenic and benign rare copy number variants.
Genomic DNA from 200 patients and their parents was purified by standard proteinase K/phenol-chloroform procedures. All the patients showed idiopathic mental retardation associated with congenital anomalies, not assignable to known syndromes after clinical examination by two specialists. Informed parental consent, as approved by our Hospital Review Board, was obtained prior to research studies. Patients C1 to C5 were studied by clone-based array CGH as reported elsewhere. Patients M1 to M3 were detected in the initial screening for microdeletion syndromes by commercial MLPA (SALSA P245; MRC-Holland, Amsterdam), following the recommendations of the manufacturer. All the remaining cases were studied by array-based comparative genomic hybridization (human genome CGH microarray AMADID: 014950, from Agilent Technologies, Palo Alto, CA) as recommended. The patients' DNA samples were tested against a pool of 10 sex-matched normal DNA samples, all of them (patients and normal controls) from our geographical area. Confirmatory analyses and familial studies were done by microsatellite marker segregation analyses and commercial or home-made MLPA studies (primers and conditions available upon request).
The CNVs detected with both kinds of arrays were pooled, each defined by the distal ends of the first and last altered probes. It is worth noting that most benign and pathogenic CNVs were detected by the commercial oligonucleotide-based array, because of its higher resolution and because it was applied to 95% of the patients. On the other hand, the two arrays can be considered complementary, as many of the small polymorphic CNVs previously detected in the clone-based array could not be refined in the oligo-array because of a lack of probes in such regions, which were designed to avoid frequent polymorphic CNVs.
Item counts for the presence of ultraconserved elements, sno/miRNAs and genes in the regions encompassed by CNVs were performed through the UCSC Genome Browser, based on build hg18. The delimiting positions of all the ultraconserved elements, tracked in the hg16 version, were converted to build hg18 through the 'Convert' feature (http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#Convert) and compiled in an in-house Excel spreadsheet to facilitate visual inspection. In every case, the positions were confirmed by using the previously known delimiting genes in the CNVs as a reference. Items contained in overlapping rearrangements were counted once, avoiding duplication of items or sizes.
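The rule of counting items in overlapping rearrangements only once can be illustrated with a small interval-merging sketch. This is not the procedure actually used (the counts were obtained through the UCSC Genome Browser and a spreadsheet); the coordinates, chromosome labels and function names below are hypothetical and serve only to show the idea.

```python
# Minimal sketch, assuming each region is a (chromosome, start, end) tuple.
# All coordinates below are invented for illustration.


def merge_regions(regions):
    """Merge overlapping regions on the same chromosome into single spans."""
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_contained(elements, regions):
    """Count elements fully contained in the merged regions, each one once."""
    merged = merge_regions(regions)
    return sum(
        any(c == chrom and s <= start and end <= e for c, s, e in merged)
        for chrom, start, end in elements
    )


def covered_size(regions):
    """Total size covered by the regions, without double-counting overlaps."""
    return sum(end - start for _, start, end in merge_regions(regions))


# Hypothetical CNVs: the first two overlap on the same chromosome.
cnvs = [("chr2", 100, 500), ("chr2", 400, 900), ("chrX", 50, 150)]
# Hypothetical elements: the first lies inside both overlapping CNVs.
elements = [("chr2", 420, 480), ("chr2", 950, 1150), ("chrX", 60, 110)]

n_elements = count_contained(elements, cnvs)
total_size = covered_size(cnvs)
print(n_elements, total_size)
```

Merging before counting means that an element lying in two overlapping CNVs contributes a single count, and the overlapping stretch contributes once to the covered size used as the proportion of genome examined.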
In order to measure the strength of association, we employed the observed/expected ratio (called 'enrichment ratio' in Tables 2 and 3). The expected frequencies were computed as the product of the total number of elements (for instance, n = 481 UCEs) and the proportion of genome examined ($p_i = \sum_i \mathrm{Mb}_i / \mathrm{Mb}_{\mathrm{genome}}$). Frequency analyses were performed with Pearson's chi-square goodness-of-fit test, employing the observed and expected frequencies previously computed.
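As a hedged illustration of this procedure, the sketch below computes the enrichment ratio and a Pearson chi-square goodness-of-fit statistic for the UCEs in pathogenic CNVs, using the figures quoted in the text (29 of 481 elements in 2.67% of the genome). The two-category layout (elements inside versus outside the examined regions, one degree of freedom) is an assumption made here for illustration; the exact binning used in the original analysis is not stated. Function names are hypothetical.

```python
import math


def enrichment_ratio(observed, total_elements, genome_fraction):
    """Return (observed/expected ratio, expected count) for a set of regions."""
    expected = total_elements * genome_fraction
    return observed / expected, expected


def chi_square_two_bins(observed_in, total_elements, genome_fraction):
    """Pearson goodness-of-fit with two bins: inside vs outside the regions.

    Assumes one degree of freedom, for which the chi-square tail probability
    equals erfc(sqrt(chi2 / 2)).
    """
    expected_in = total_elements * genome_fraction
    expected_out = total_elements - expected_in
    observed_out = total_elements - observed_in
    chi2 = (
        (observed_in - expected_in) ** 2 / expected_in
        + (observed_out - expected_out) ** 2 / expected_out
    )
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Figures quoted in the text: 29 of the 481 UCEs in 2.67% of the genome.
ratio, expected = enrichment_ratio(29, 481, 0.0267)
chi2, p_value = chi_square_two_bins(29, 481, 0.0267)
print(f"expected = {expected:.2f}, ratio = {ratio:.2f}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

The closed-form tail probability avoids a dependency on a statistics library, but it only holds for one degree of freedom; a test over more categories would need a general chi-square distribution function.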
This work was supported by grants PI040421 and PI080648 from the Fondo de Investigacion Sanitaria (Spanish Ministry of Health)/FEDER (Fondo Europeo de DEsarrollo Regional) and ACOMP/2009/216 (Conselleria d'Educació, Generalitat Valenciana). RQ and SM are supported by Fundación Bancaja fellowships. We thank all the families and referring clinicians for their collaboration in this work.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.