
Shared etiology of Mendelian and complex disease supports drug discovery

Abstract

Background

Drugs targeting disease causal genes are more likely to succeed for that disease. However, complex disease causal genes are not always clear. In contrast, Mendelian disease causal genes are well-known and druggable. Here, we seek to exploit the well-characterized biology of Mendelian diseases for complex disease drug discovery, using evidence of pathogenic processes shared between monogenic and complex diseases. One way to find shared disease etiology is clinical association: some Mendelian diseases are known to predispose patients to specific complex diseases (comorbidity). Previous studies link this comorbidity to pleiotropic effects of the Mendelian disease causal genes on the complex disease.

Methods

In previous work studying the incidence of 90 Mendelian and 65 complex diseases, we found 2,908 pairs of clinically associated (comorbid) diseases. Using this clinical signal, we can match each complex disease to a set of Mendelian disease causal genes. We hypothesize that the drugs targeting these genes are potential candidate drugs for the complex disease. We evaluate our candidate drugs using information on current drug indications or investigations.

Results

Our analysis shows that the candidate drugs are enriched among currently investigated or indicated drugs for the relevant complex diseases (odds ratio = 1.84, p = 5.98e-22). Additionally, the candidate drugs are more likely to be in advanced stages of the drug development pipeline. We also present an approach to prioritize Mendelian diseases with particular promise for drug repurposing. Finally, we find that the combination of comorbidity and genetic similarity for a Mendelian disease and cancer pair leads to recommendation of candidate drugs that are enriched for those investigated or indicated.

Conclusions

Our findings suggest a novel way to take advantage of the rich knowledge about Mendelian disease biology to improve treatment of complex diseases.


Background

The traditional drug development pipeline is costly and slow. It is estimated that around $2.6 billion is spent and approximately 12–15 years are required for just one new drug to reach the market [1, 2]. Additionally, clinical trial success rates remain low [3]. Therefore, there is a pressing need for new approaches to predict which drugs will succeed.

Recently, genetics has emerged as a resource for predicting drug success. Genome-wide association studies (GWAS) have identified genetic variants associated with complex diseases that are also known therapeutic drug targets [4]. For instance, mutations in the IL23R locus have been associated with Crohn’s disease [5]. Ustekinumab, a monoclonal antibody originally approved for the treatment of psoriasis, targets the IL23 p-40 subunit [6]. Based on genetics, ustekinumab was successfully repurposed for Crohn’s disease [7,8,9,10]. This example highlights the importance of genetics both in target prioritization and drug discovery.

Nelson et al. analyzed historical data of clinical trials and found that drug uses supported by human genetic evidence are twice as likely to succeed in clinical trials [11]. Four years later, King et al. confirmed these findings by analyzing data not available at the time of the Nelson et al. study, further emphasizing the importance of genetics in the drug development process [12]. Moreover, King et al. found that the success rate of a drug is higher when the disease causal gene is clearly identified, as in the case of monogenic (Mendelian) diseases. However, identifying the disease causal genes from GWAS can be challenging, as the majority of GWAS hits are located in non-coding regions [13]. In contrast, in monogenic (Mendelian) diseases, the causal genes are both well-known and druggable [14]. Therefore, developing a way to translate knowledge about Mendelian disease biology to complex diseases could have a significant impact on their treatment.

We have previously exploited clinical data to discover associations between Mendelian and complex diseases [15, 16]. In a systematic analysis of 90 Mendelian and 65 complex diseases, Blair et al. used health records to identify which complex diseases individuals with a Mendelian disease are predisposed to, finding 2,908 clinically associated (comorbid) pairs of Mendelian and complex diseases [15]. That study showed evidence that comorbidity can be tied to pleiotropic effects of disease genes. In a follow-up study focusing on cancers, Melamed et al. showed that Mendelian disease causal genes are likely to be frequently mutated in comorbid cancers [16]. For instance, patients with Rubinstein-Taybi syndrome, a Mendelian disease caused by mutations in CREBBP [17], are predisposed to lymphoma [18]. CREBBP is one of the most frequently inactivated genes in lymphoma [19], meaning that the observed comorbidity can be attributed to a pleiotropic effect of CREBBP mutation causing Rubinstein-Taybi syndrome and contributing to lymphoma.

Both of the above studies systematically demonstrate that Mendelian disease comorbidity can suggest a role of Mendelian causal genes on complex disease. However, Mendelian disease comorbidity has not been previously used for complex disease drug discovery purposes. Building on previous findings, we hypothesize that if the Mendelian disease causal genes contribute to the development of a comorbid complex disease, then these genes can be novel therapeutic targets for that disease (Fig. 1A).

Methods

Recommending drugs for complex disease based on Mendelian disease comorbidities

We download results assessing comorbidity between 95 Mendelian diseases and 65 complex diseases from the supplementary materials of Blair et al. (available online) [15]. Our goal is to find drugs targeting Mendelian disease causal genes and recommend them as candidate drugs for the comorbid complex disease. Therefore, we remove 5 Mendelian diseases (and their comorbidities) that are due to chromosomal abnormalities and for which the causal gene is not obvious (Down Syndrome, Edward Syndrome, Klinefelter Syndrome, Patau Syndrome, Turner Syndrome). We use the remaining 2,908 comorbidity pairs between 90 Mendelian diseases and 65 complex diseases in our main analysis. Additionally, we group the 65 complex diseases into 6 major disease categories as annotated in Blair et al.’s work [15]: cardiovascular (n = 4), hormonal (n = 8), immune (n = 19), neoplasms (n = 14), neurological (n = 15), and ophthalmological (n = 5).

We systematically code all 65 complex diseases using MeSH codes. To do this, we download the supplementary “Table S2” from Blair et al. [15] (available online), which contains the ICD-10 billing codes that the authors used to identify the complex diseases. We manually match them to relevant MeSH codes. Eventually, we have 230 unique MeSH codes for 65 complex diseases (median of 3 MeSH codes per complex disease).

We download information for 5,800 drugs from DrugBank (version 5.19, date of download: June 16, 2022, https://www.drugbank.com/). After filtering for gene targets in humans, for each drug, we keep its DrugBank ID and its gene targets (HGNC symbols). Drugs included in this file can be approved, investigational, small molecule, biotech, experimental, nutraceutical, illicit, or withdrawn. We do not filter the list of drug-gene targets based on pharmacological action.

To suggest drug repurposing candidates for a complex disease, we first find its comorbid Mendelian diseases. We obtain the genes causally associated with these Mendelian diseases from the Online Mendelian Inheritance in Man (OMIM) database. Using drug-gene target information from DrugBank, we find drugs targeting the Mendelian disease causal genes and we suggest them as candidate drugs for the complex disease.
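The recommendation step above amounts to two joins: comorbid pairs to causal genes, then causal genes to drugs. The following is a minimal sketch of that logic, assuming the three inputs have already been parsed into plain Python containers. The function name, the container layout, and all disease, gene, and drug identifiers are hypothetical placeholders, not data from the study.

```python
def recommend_candidates(comorbid_pairs, disease_genes, drug_targets):
    """Map each complex disease to drugs that target the causal genes
    of its comorbid Mendelian diseases.

    comorbid_pairs: iterable of (mendelian_disease, complex_disease)
    disease_genes:  dict mendelian_disease -> set of causal genes
    drug_targets:   dict drug_id -> set of target genes
    """
    # Invert drug -> genes into gene -> drugs for fast lookup.
    gene_to_drugs = {}
    for drug, genes in drug_targets.items():
        for gene in genes:
            gene_to_drugs.setdefault(gene, set()).add(drug)

    candidates = {}
    for mendelian, complex_disease in comorbid_pairs:
        drugs = candidates.setdefault(complex_disease, set())
        for gene in disease_genes.get(mendelian, ()):
            drugs |= gene_to_drugs.get(gene, set())
    return candidates


# Toy inputs (illustrative only, not taken from the study's data).
comorbid_pairs = [
    ("Mendelian_A", "Complex_X"),
    ("Mendelian_B", "Complex_X"),
    ("Mendelian_B", "Complex_Y"),
]
disease_genes = {"Mendelian_A": {"GENE1"}, "Mendelian_B": {"GENE2", "GENE3"}}
drug_targets = {
    "DRUG_1": {"GENE1", "GENE9"},
    "DRUG_2": {"GENE2"},
    "DRUG_3": {"GENE8"},
}

candidates = recommend_candidates(comorbid_pairs, disease_genes, drug_targets)
```

In this toy example, Complex_X receives DRUG_1 and DRUG_2, Complex_Y receives only DRUG_2, and DRUG_3 is never recommended because its target is not a Mendelian causal gene.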

Finding investigated drugs for the complex diseases

To find drugs currently investigated for the 65 complex diseases in our sample, we download clinical trial data from the Aggregate Analysis of ClinicalTrials.gov (AACT) database in a pipe-delimited format (date of download: November 4, 2022; note that it is updated daily) [20]. AACT (https://aact.ctti-clinicaltrials.org/) is a publicly available relational database that contains extensive information about every study registered in ClinicalTrials.gov. We obtain information for 432,597 clinical trials that were registered in ClinicalTrials.gov by the date of download. For each clinical trial, we keep the clinical trial ID, clinical trial phase, conditions (diseases) studied and interventions (drugs) tested. We filter out clinical trials that tested behavioral, device, diagnostic test, dietary supplement, procedure, radiation, or “other” interventions, as well as trials that do not provide MeSH terms for both conditions and interventions. For the total of 109,430 clinical trials that remain, we match the MeSH terms of both conditions and interventions to MeSH codes using the Unified Medical Language System (UMLS) database (https://www.nlm.nih.gov/research/umls/index.html). We group clinical trial phases into Phase I (Phase I and Early Phase I), Phase II (Phase II and Phase I/Phase II), Phase III (Phase III and Phase II/Phase III) or unknown phase (no information provided). Phase IV studies are conducted after a drug gets approved to find long-term benefits and side-effects that could not be discovered in the duration of a clinical trial. Therefore, we consider drugs in Phase IV clinical trials as indicated drugs (see “Finding indicated drugs for the complex diseases”). Eventually, we have 31,053 clinical trials that tested interventions for the 65 complex diseases in our sample.
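The phase grouping described above can be expressed as a small lookup table. The sketch below is illustrative only: the raw label strings (for example "Early Phase 1" or "Phase 1/Phase 2") are an assumption about how phases are spelled in the downloaded file and may need adjusting to the actual export, and the names PHASE_GROUPS and group_phase are hypothetical.

```python
# Assumed raw phase labels -> grouped phase used in the analysis.
# Combined phases are assigned to the later phase, and Phase IV
# trials are treated as evidence of an indicated drug.
PHASE_GROUPS = {
    "Early Phase 1": "Phase I",
    "Phase 1": "Phase I",
    "Phase 1/Phase 2": "Phase II",
    "Phase 2": "Phase II",
    "Phase 2/Phase 3": "Phase III",
    "Phase 3": "Phase III",
    "Phase 4": "Indicated",
}


def group_phase(raw_phase):
    """Return the grouped phase, or "Unknown" when no phase is provided
    or the label is not recognized."""
    if raw_phase is None:
        return "Unknown"
    return PHASE_GROUPS.get(raw_phase.strip(), "Unknown")
```

For example, `group_phase("Phase 1/Phase 2")` returns "Phase II", while a missing or unrecognized label falls into the unknown-phase group.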

For these clinical trials, we convert the MeSH codes of interventions to DrugBank IDs using the UMLS API (crosswalk function). However, DrugBank does not assign IDs to drug combinations. In order to include them in our analysis, we convert the MeSH codes that did not match directly to a DrugBank ID, to RxNORM CUIs using the UMLS API (crosswalk function). Then, for each drug combination, we obtain each active pharmaceutical ingredient using the UMLS API (“Retrieving Source-Asserted Relations”; vocabulary = RXNORM; relation label = has_part).

Moreover, MeSH vocabulary assigns different codes to each form of an active pharmaceutical ingredient. But DrugBank assigns IDs only to the general forms. For example, liposomal doxorubicin and doxorubicin are two separate entries in MeSH vocabulary but not in DrugBank (doxorubicin). To deal with this discrepancy, we follow the steps described above and we obtain the active pharmaceutical ingredients using the UMLS API (“Retrieving Source-Asserted Relations”; vocabulary = RXNORM; relation label = form_of). Then, for each drug, we convert its RxNORM CUI to DrugBank ID using the UMLS API (crosswalk function).

Eventually, we have 29,758 clinical trials that tested 1,795 drugs for 64 complex diseases in our sample. Note that the complex disease “Dermatitis herpetiformis” did not have any investigated drugs at the time of this study.

Finding indicated drugs for the complex diseases

To find drugs that are currently indicated for the 65 complex diseases, we download 4,225 approved drugs from DrugBank (version 5.19, date of download: June 16, 2022). We get their indications by combining information from RxNORM (https://www.nlm.nih.gov/research/umls/rxnorm/index.html) and repoDB [21], as described below.

RxNORM is maintained by the National Library of Medicine and is a high-quality database that uses structured terminologies to provide well defined drug-indication pairs that are updated monthly: each drug is mapped to its RxNORM ID, and each indication is matched to a MeSH term. Using the RxNORM API (getClassByRxNormDrugName function), we obtain diseases (in MeSH terms) with a relationship of “may_treat” or “may_prevent” with each approved drug. We then match the diseases to MeSH codes using the UMLS database.

repoDB is a publicly available database that contains drug repositioning successes and failures by integrating data from DrugCentral and ClinicalTrials.gov. We download the full database (last update: 2017) and, for each drug, we keep its DrugBank ID and approved indication(s), after excluding the ones with a note of suspended, terminated, or withdrawn. All indications are coded in UMLS CUIs, so we easily convert them to MeSH codes using the UMLS database.

After combining the data from RxNORM and repoDB, we have 939 unique drugs indicated for 58 complex diseases. We then add to this data set the drugs in Phase IV clinical trials to get a total of 1,373 unique drugs indicated for 64 complex diseases. Note that the complex disease “hypotony of the eye” did not have any indicated drug at the time of this study.

Statistical analysis

Logistic regression to evaluate candidate drugs

We find 781 unique drugs that target the causal genes of the 90 Mendelian diseases. Using these drugs and the 65 complex diseases, we create a table where each row is a drug-complex disease pair. Therefore, in our main analysis, the number of rows in this table is 50,765 (781 drugs multiplied by 65 complex diseases). We assess whether our recommended drug-disease pairs are predictive of current investigated or indicated drug-disease pairs in a logistic regression model also adjusting for (i) the disease category of each complex disease; (ii) the number of known gene-targets per drug.

We account for the disease category due to differences in the number of drugs investigated or indicated among the 6 disease categories tested in this study. For example, significantly more drugs are tested in clinical trials for neoplasms than ophthalmological diseases (Fig. 1B). Additionally, we account for the number of targets per drug as drugs with a higher number of known targets are more likely to be linked to a Mendelian disease and may be more likely to be subject to research investment for new indications.
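To make the structure of the drug-disease table concrete, the sketch below builds one row per drug-disease pair and computes an unadjusted odds ratio from the resulting 2x2 table. This is a deliberate simplification: the analysis described above uses a logistic regression adjusted for disease category and number of targets, whereas this example uses a plain 2x2 odds ratio with a Wald confidence interval. All function names are hypothetical.

```python
import math
from itertools import product


def build_pairs(drugs, diseases):
    """One row per (drug, complex disease) pair, e.g. 781 x 65 = 50,765 rows."""
    return list(product(drugs, diseases))


def contingency(pairs, candidate_pairs, known_pairs):
    """Count pairs by candidate status and by known (investigated or
    indicated) status. Returns (a, b, c, d):
    a = candidate and known, b = candidate only,
    c = known only,          d = neither."""
    a = b = c = d = 0
    for pair in pairs:
        is_candidate = pair in candidate_pairs
        is_known = pair in known_pairs
        if is_candidate and is_known:
            a += 1
        elif is_candidate:
            b += 1
        elif is_known:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Integer stand-ins for the 781 drugs and 65 complex diseases.
pairs = build_pairs(range(781), range(65))
```

An unadjusted estimate like this would not account for the differences between disease categories or for drugs with many targets, which is why the covariates described above matter.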

To ensure that class imbalance does not bias the coefficient estimate in our model, we also conduct a weighted logistic regression. This is a widely used approach to compensate for class imbalance, and we compute class weights by using the scikit-learn setting “balanced”. This setting assigns weights to each class by dividing the total number of samples by the product of the number of classes and the number of samples in each class. As shown in Figure S1, the results from the weighted logistic regression are also significant and comparable to the non-weighted logistic regression. Therefore, we use a non-weighted logistic regression in all of our analyses.
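The “balanced” weighting rule stated above can be written out directly as weight = n_samples / (n_classes × n_samples_in_class). The following sketch reimplements that formula in plain Python for illustration; it is not the library's code, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy example: 10 positive pairs among 100 drug-disease pairs.
labels = [1] * 10 + [0] * 90
weights = balanced_class_weights(labels)
```

With 10 positives out of 100 samples, the positive class is weighted 100 / (2 × 10) = 5.0 and the negative class 100 / (2 × 90) ≈ 0.56, so each class contributes equally to the total weight.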

Permutation tests

To assess the significance of the observed associations, we perform permutation tests.

In our main analysis, we want to assess whether the enrichment of approved drugs within our drug candidates is due to signals from the comorbidity relationships. So, we shuffle the Mendelian-complex disease pairs. This random shuffling changes which complex diseases are comorbid with each Mendelian disease, while keeping unchanged all the intrinsic Mendelian disease characteristics, such as prevalence, number of comorbidities, and associated causal genes. This allows us to assess if the observed association is solely attributed to the Mendelian disease comorbidity or not.
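One way to realize this shuffling is to redraw, for each Mendelian disease, the same number of comorbid complex diseases at random. The sketch below takes that approach as an assumption. The exact shuffling scheme is not spelled out in the text, and the function name and toy data are hypothetical. Causal genes are untouched because they are stored per Mendelian disease, outside this mapping.

```python
import random


def permute_comorbidities(comorbid, complex_diseases, rng):
    """Randomly reassign comorbid complex diseases to each Mendelian
    disease while preserving its number of comorbidities.

    comorbid:         dict mendelian_disease -> set of complex diseases
    complex_diseases: iterable of all complex diseases in the study
    rng:              random.Random instance (seeded for reproducibility)
    """
    pool = sorted(complex_diseases)
    return {
        mendelian: set(rng.sample(pool, len(partners)))
        for mendelian, partners in comorbid.items()
    }


# Toy data (illustrative only).
comorbid = {"Mendelian_A": {"C1", "C2"}, "Mendelian_B": {"C3"}}
all_complex = ["C1", "C2", "C3", "C4", "C5"]
shuffled = permute_comorbidities(comorbid, all_complex, random.Random(0))
```

Each permuted mapping can then be passed through the same candidate-drug and regression steps to obtain one permuted odds ratio.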

In the per-Mendelian-disease analysis, we want to assess whether the higher druggability of a Mendelian disease gene, rather than the information about comorbidity, drives the results. The random permutation changes which drugs target a Mendelian disease gene while keeping unchanged the total number of drugs targeting the Mendelian disease gene.

In both cases, we perform 1,000 permutations to create a null distribution of odds ratios using the logistic regression model above. We then compare the observed odds ratio to this null distribution. We calculate the probability of observing an odds ratio at least as extreme as the original one as the fraction of permutations in which the permuted odds ratio is greater than or equal to the observed odds ratio (odds_ratio_observed ≤ odds_ratio_permutation). A result is considered significant if this probability is less than 0.05 (p_permutation < 50/1000).
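The empirical p-value described above is a simple count over the null distribution. The following is a minimal sketch, assuming the permuted odds ratios have already been computed; the function name and the toy numbers are hypothetical.

```python
def permutation_p_value(observed_or, permuted_ors):
    """Fraction of permuted odds ratios that are greater than or equal
    to the observed odds ratio."""
    n_extreme = sum(1 for value in permuted_ors if value >= observed_or)
    return n_extreme / len(permuted_ors)


# Toy null distribution: 40 of 1,000 permuted odds ratios reach the observed value.
permuted = [1.0] * 960 + [2.0] * 40
p_perm = permutation_p_value(1.8, permuted)
significant = p_perm < 0.05
```

Some analyses add one to both the numerator and the denominator so that the estimate can never be exactly zero. The procedure described above does not, and neither does this sketch.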

Genetic similarity between Mendelian diseases and cancers

We assess the genetic similarity between a Mendelian disease and a cancer using two sources of evidence, inspired by the work of Melamed et al. [16]. First, we consider the extent of genetic overlap between two diseases. This simple metric captures the shared driver genes between two diseases. To capture a wider range of functional relationships between two diseases, we also use gene co-expression across diverse human tissues.

The genetic overlap metric tests the significance of the overlap between the Mendelian disease causal genes and the genes significantly altered in a cancer. For each Mendelian disease, we compile a list of causally associated genes using the OMIM database. For each cancer, we compile a list of driver genes using the Broad GDAC Firehose database (https://gdac.broadinstitute.org/), including genes significantly mutated (as identified by MutSig v2.0; q < 0.05) and genes with significant copy number alterations (as identified by Gistic2; q < 0.05; peaks with at most 50 genes). Then, for a Mendelian disease and cancer pair, we test the significance of the overlap between the set of Mendelian disease causal genes and the set of genes significantly altered in cancer (Fisher’s exact test, p < 0.05).
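For a gene-set overlap, the one-sided Fisher’s exact test is equivalent to the upper tail of a hypergeometric distribution. The sketch below computes that tail with the standard library. Two points are assumptions not stated in the text: that the test is one-sided (testing for enrichment), and the size of the background gene universe, which is passed in as the hypothetical parameter n_background. The gene names are placeholders.

```python
from math import comb


def overlap_p_value(mendelian_genes, cancer_genes, n_background):
    """One-sided Fisher's exact test for gene-set overlap, computed as
    the hypergeometric probability of observing at least the observed
    number of shared genes.

    n_background: assumed total number of genes in the universe.
    """
    k = len(mendelian_genes & cancer_genes)  # observed overlap
    m = len(mendelian_genes)
    n = len(cancer_genes)
    total = comb(n_background, n)
    # Sum P(X = i) for i from the observed overlap up to the maximum possible.
    tail = sum(
        comb(m, i) * comb(n_background - m, n - i)
        for i in range(k, min(m, n) + 1)
    )
    return tail / total


# Toy example: 2 shared genes in a universe of 10 genes.
p_overlap = overlap_p_value({"A", "B", "C"}, {"B", "C", "D"}, n_background=10)
```

In the toy example the tail is (3 × 7 + 1) / 120 ≈ 0.183, so an overlap of two genes would not be significant in such a small universe.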

The co-expression metric is based on the rationale that genes with similar expression patterns across normal tissues also have similar functions. Therefore, it helps us to infer functional similarities between sets of Mendelian and cancer-related genes. To assess genetic similarity using this metric, we first download summarized expression data for 20,162 genes across 37 GTEx tissues from the Human Protein Atlas (https://www.proteinatlas.org/download/rna_tissue_gtex.tsv.zip). We remove 889 genes that do not have expression data across all 37 tissues. The remaining data contain expression for 574 out of the 594 Mendelian disease causal genes. Consequently, the co-expression metric of one Mendelian disease (“Familial Dysautonomia”) with any cancer could not be measured. For every cancer and Mendelian disease pair, the metric tests co-expression between any cancer-related gene and the set of Mendelian disease causal genes. More specifically, for each disease pair, we calculate the correlation of expression between the set of known Mendelian disease genes and each cancer-related gene. We also compute the correlation of expression between the same cancer-related gene and all other, non-Mendelian disease genes. Then, using the Wilcoxon rank-sum test, we test whether the set of Mendelian disease genes exhibits stronger correlation in expression with the cancer gene compared to the correlation distribution of all other genes with the same cancer gene. Finally, we adjust the resulting p-values to account for the number of cancer genes tested (Benjamini-Hochberg method; p < 0.05).
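The final multiple-testing step uses the Benjamini-Hochberg method. The following is a minimal, standard-library sketch of the usual step-up adjustment, assuming one p-value per cancer gene tested; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four cancer genes (illustrative only).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Here the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, so all four tests would remain below the 0.05 threshold.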

We define a Mendelian disease and cancer pair as genetically similar only if at least one of the above metrics is significant (p < 0.05). It is worth mentioning that traditional genetic correlation techniques are not suitable for the estimation of genetic similarity between (monogenic) Mendelian and complex diseases, such as cancers, due to differences in variant discovery and effect size quantification, penetrance and linkage disequilibrium.

Results

Integrating data to test Mendelian diseases as a resource for drug repurposing candidates

From Blair et al. [15], we obtain clinical associations between 2,908 pairs of a Mendelian and a complex disease. The data include 90 Mendelian diseases with known causal genes and 65 complex diseases across six disease categories (cardiovascular, hormonal, immune, neoplasms, neurological, ophthalmological). Figure 1B shows the distribution of the number of comorbid complex diseases per Mendelian disease. Using drug-gene target information from DrugBank and Mendelian causal genes from the Online Mendelian Inheritance in Man (OMIM), we compile a list of 781 drugs that target the Mendelian disease causal genes. This allows us to suggest candidate drugs for each complex disease based on its comorbid Mendelian diseases (Fig. 1A).

To test our hypothesis, we compare our candidate drugs against drugs currently investigated or indicated for the complex diseases. We curate 29,758 clinical trials that investigate 1,795 drugs for 64 complex diseases (median of 110 investigated drugs per complex disease). In addition to the investigated drugs, we compile current approved drug uses, including 1,373 indicated drugs for 64 complex diseases (median of 42 indicated drugs per complex disease) (Fig. 1C).

Fig. 1

Outline of the approach. A. Proposed method where the drugs targeting genes causally associated with a Mendelian disease are suggested as candidate drugs for its clinically associated (comorbid) complex disease. This hypothesized connection between the drug and the complex disease is based on the previously shown pleiotropic effects of the Mendelian disease causal genes on the development of the comorbid complex disease. B. Distribution of the number of comorbid complex diseases per Mendelian disease. C. Number of investigated (per clinical trial phase) and approved drugs for each complex disease

Mendelian disease comorbidity identifies drugs under current investigation or indication

First, we assess whether the candidate drugs for a complex disease are enriched for those currently investigated or indicated for that disease. Accounting for the number of gene targets per drug and the category of disease, we find that the candidate drugs are significantly enriched for drugs currently investigated or indicated (odds ratio = 1.834, p = 5.98e-22) (Fig. 2A).

Next, we seek to exclude artifactual explanations for this signal. One such artifact is variation in disease frequency: disease frequency can impact both the ability to discover disease genes and the power to discover clinical associations. To exclude this spurious source of association, we randomly permute which complex diseases each Mendelian disease is comorbid with. This random permutation preserves the characteristics of each Mendelian disease, such as number of complex disease comorbidities, but not the list of candidate drugs for each complex disease. After 1,000 permutations, we find that the observed association is significantly stronger than expected by chance (p_permutation < 0.001). For Mendelian diseases with many comorbidities, the permutation does not impact recommendations; therefore the permuted signal, though weaker, still has odds ratio > 1. However, when these Mendelian diseases are removed, comorbidity is significantly predictive of drug uses, while the permuted comorbidity is not predictive (Figure S2).

We repeat the analysis at a disease category level, and we find significant results for neurological, immune, neoplasms, ophthalmological, and hormonal diseases. However, after permutation analyses, only neurological, immune, and neoplasm disease categories remain significant (p_permutation < 0.05) (Fig. 2A, S3-S8). This may be due to the low number of analyzed complex diseases that fall under the cardiovascular (n = 4), ophthalmological (n = 5), and hormonal (n = 8) disease categories, compared to neurological (n = 15), immune (n = 19) and neoplasms (n = 14), potentially reducing the statistical power to detect a significant association. Additionally, these disease categories have a lower number of current therapies (Fig. 1C). Figure 2B shows an example of our recommended candidate drugs for 14 neoplasms, illustrating an extensive overlap between the candidate drugs and the drugs currently investigated or indicated for these neoplasms. The full list of recommended candidate drugs for repurposing for each complex disease can be found in Supplementary Tables 1 and 2.

Next, we ask if the candidate drugs are more likely to be in advanced drug development phases for the relevant complex diseases. To test this, we stratify drugs by their drug development phase for a complex disease: phase I, phase II, phase III, indicated. First, we find a significant enrichment of candidate drugs for a complex disease among drugs in any of phase I, II, or III (odds ratio = 1.59, p = 2.27e-09; p_permutation < 0.001) (Figure S9). Stratified per clinical trial phase (phase I, II, or III), we find a progressive increase in the enrichment for drug success with increasing phase (p_permutation < 0.05) (Fig. 2C, S10-12). Additionally, when considering only indicated drugs for a complex disease, we find an even greater enrichment of its candidate drugs for drug success (odds ratio = 2.18, p = 5.94e-11; p_permutation < 0.001) (Fig. 2C, S13). Overall, our predicted drug candidates show more enrichment in categories with more clinical evidence, supporting the potential of our approach for identifying new successful drugs.

Fig. 2

Clinical associations between Mendelian and complex diseases predict candidate drugs with higher potential of success for the complex diseases. (A) Odds ratio of candidate drugs to be currently investigated or indicated for the complex diseases within a disease category. Only disease categories that are significant compared to 1,000 permutations of the comorbidity relationships (p_permutation < 0.05) are shown. (B) Examples of recommended candidate drugs for 14 neoplasms based on their clinical associations with 2 Mendelian diseases: Androgen Insensitivity Syndrome and Retinitis Pigmentosa. Gray-scaled boxes indicate the phase of the drug in the development pipeline for each neoplasm. (C) Odds ratio of candidate drugs to be investigated or indicated for a complex disease per drug development phase. * In A and C, bars represent the observed odds ratio with 95% confidence intervals

Prioritizing Mendelian diseases targeted by high number of drugs

Mendelian disease causal genes are known to be good drug targets [14]. We find 193 out of 593 Mendelian genes (32.6%) to be targeted by at least one drug (median: 2 drugs per Mendelian gene). However, outliers exist: androgen receptor (AR), a gene mutated in Androgen Insensitivity Syndrome, is targeted by 82 drugs [22]. This variation in drug targeting of Mendelian genes may suggest that certain disease processes are more druggable. We hypothesize that the most druggable Mendelian diseases are the most promising for providing insight into complex disease therapeutics.

To test this hypothesis, we repeat the above analysis for each Mendelian disease individually, for Mendelian diseases targeted by at least one drug (n = 68) (Fig. 3A). That is, we test whether the drugs targeting the causal genes of each Mendelian disease are enriched for drugs currently investigated or indicated for its comorbid complex diseases. Although testing only the drugs targeting a single Mendelian disease reduces the statistical power of the analysis, we find 8 significant Mendelian diseases (p_permutation < 0.05). Further, we find that these 8 Mendelian diseases are targeted by a significantly higher number of drugs than other Mendelian diseases (p = 9.1e-05, one-sided Wilcoxon rank-sum test) (Fig. 3B). To exclude the possibility that this is due only to higher numbers of drugs increasing power to discover an association, we compare the result against a permutation analysis that permutes the drugs targeting each Mendelian disease (p_permutation = 0.018).

In another test of this hypothesis, we ask which Mendelian disease genes successfully point to new drug indications. That is, for each Mendelian disease gene, we use comorbidity to suggest which complex diseases may benefit from drugs targeting that gene. Under our hypothesis, we expect that highly druggable genes can more successfully be used for finding new drug uses. To test this, we repeat the above analysis for each gene targeted by at least one drug (n = 193) (Fig. 3C), comparing the association to permutations. We find 12 significant genes, and these successful genes are again targeted by a higher number of drugs compared to the other Mendelian disease genes (p = 2.3e-05, one-sided Wilcoxon rank-sum test) (Fig. 3D). Altogether, these results imply that Mendelian diseases associated with more druggable genes are a particularly promising resource for complex disease therapeutics.

Fig. 3

Highly drugged Mendelian diseases are a better resource for candidate drugs. (A) Histogram of the number of drugs targeting a Mendelian disease, for Mendelian diseases targeted by at least one drug (n = 68). (B) Mendelian diseases that significantly predict candidate drugs already investigated or indicated for their comorbid complex diseases are targeted by a higher number of drugs compared to the other Mendelian diseases (p = 9.1e-05, one-sided Wilcoxon rank-sum test comparing the number of drugs in each group). (C) Histogram of number of drugs targeting a Mendelian gene, for genes targeted by at least one drug (n = 193). (D) Genes linked to Mendelian diseases that significantly predict candidate drugs already investigated or indicated for their comorbid complex diseases are targeted by a higher number of drugs compared to the other genes (p = 2.3e-05, one-sided Wilcoxon rank-sum test comparing the number of drugs in each group)

Combining comorbidity with genetic similarity enhances drug predictions

Comorbidity is a way to discover diseases sharing a biological basis, but it is not the only way. Comorbid Mendelian and complex diseases have been shown to be more likely to share related or overlapping genes, which is known as genetic similarity [15, 16]. Additionally, genetic similarity between drug targets and disease-linked genes has also been shown to predict successful drugs for a disease [11, 12]. Building on these results, we propose that genetic similarity could contribute to discovering therapeutically relevant shared etiology of Mendelian and complex diseases [23, 24]. Specifically, we propose that by combining comorbidity with genetic similarity, the two forms of evidence can more robustly point to diseases with shared etiology, increasing the predictive success of our approach.

To test this hypothesis, we focus on cancers, one of the disease categories with the strongest association in our analysis (Fig. 2A). Cancers are also of interest because each type of cancer has been associated with a set of recurrently mutated driver genes in The Cancer Genome Atlas (TCGA); we previously showed that Mendelian diseases comorbid with a cancer are enriched for genetic similarity to somatically mutated cancer driver genes [16]. Building on that work, we ask whether candidate drugs supported by both comorbidity and genetic similarity between Mendelian disease and cancer have a greater probability of success. Among the 10 cancers in TCGA, Mendelian disease comorbidity again predicts drugs enriched for those currently investigated or indicated (odds ratio = 1.69, p = 7.42e-06, p_permutation = 0.014) (Figure S14). However, when comorbidity is combined with genetic similarity, drugs with both forms of evidence are even more enriched for drugs with clinical support (odds ratio = 2.19, p = 6.33e-13, p_permutation = 0.001) (Fig. 4A, S15).

To investigate the contributions of genetic similarity and comorbidity individually and combined, we stratify the 600 pairs of 60 Mendelian diseases and 10 cancers into those that are comorbid and those with no detected comorbidity relationship. As genetic similarity was not previously evaluated for non-comorbid disease pairs, we establish two measures for genetic similarity between two diseases, gene overlap and gene co-expression, similar to the measures used in Melamed et al. [16] (see Methods) (Supplementary Table 3). Among 314 comorbid disease pairs, 135 are also genetically similar (43%). These comorbid and genetically similar pairs greatly overlap with the ones identified by Melamed et al. [16] (p = 7.13e-08, one-sided Fisher’s exact test), indicating that our genetic similarity metrics are consistent with the prior work. Among the remaining 286 non-comorbid pairs, 87 are genetically similar (30.4%) (Fig. 4B). The higher rate of genetic similarity among the comorbid diseases is consistent with the prior literature [16].

Using all the drugs targeting the causal genes of the 60 Mendelian diseases, we compile a list of 6,850 possible drug-cancer pairs (685 drugs x 10 cancers) (Supplementary Table S4). Among 2,727 drug-cancer pairs not supported by comorbidity, we find that those supported by genetic similarity have an increased probability of drug success (odds ratio = 2.32, p = 1.07e-04). This implies that genetic similarity might detect shared etiology between Mendelian disease and cancer pairs that cannot be detected with comorbidity. Further, among 4,123 drug-cancer pairs supported by comorbidity, those additionally supported by genetic similarity have a greater probability of drug success (odds ratio = 1.39, p = 0.01). As we expect that candidate drug recommendations supported by comorbidity are already enriched for shared etiology, it is logical that the effect of genetic similarity would be smaller for this category of recommendations, but the effect is still significant. Notably, drug uses supported by both comorbidity and genetic similarity are most enriched for known drug uses (Fig. 4C, leftmost bar).
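The stratified comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, the per-cell counts in `toy_pairs` are invented purely to exercise the code (the article reports only stratum totals and odds ratios), and the Wald (Woolf) confidence interval is one common choice that may differ from the test defined in the article's Methods.

```python
from math import exp, log, sqrt


def stratified_tables(pairs):
    """Build one 2x2 table per comorbidity stratum.

    pairs: iterable of (comorbid, similar, success) booleans.
    Each table is [a, b, c, d] with
      a = similar & success,     b = similar & no success,
      c = not similar & success, d = not similar & no success.
    """
    tables = {True: [0, 0, 0, 0], False: [0, 0, 0, 0]}
    for comorbid, similar, success in pairs:
        idx = (0 if similar else 2) + (0 if success else 1)
        tables[comorbid][idx] += 1
    return tables


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) interval; requires non-zero cells."""
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - z * se), exp(log(estimate) + z * se)


# The stratum totals reported in the text sum to the full set of pairs.
assert 2727 + 4123 == 685 * 10

# Toy data (hypothetical counts, NOT from the article).
toy_pairs = (
    [(True, True, True)] * 30 + [(True, True, False)] * 70
    + [(True, False, True)] * 20 + [(True, False, False)] * 80
    + [(False, True, True)] * 12 + [(False, True, False)] * 48
    + [(False, False, True)] * 5 + [(False, False, False)] * 95
)

tables = stratified_tables(toy_pairs)
for comorbid, label in ((False, "no comorbidity support"), (True, "comorbidity support")):
    estimate, low, high = odds_ratio_ci(*tables[comorbid])
    print(f"{label}: OR = {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Computing the odds ratio for genetic similarity separately within each comorbidity stratum is what allows the two sources of evidence to be compared individually and in combination.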

In conclusion, these findings suggest that by combining the two forms of evidence we can prioritize candidate drugs that target the shared biology between two comorbid diseases, enhancing the use of Mendelian disease biology for drug discovery.

Fig. 4
Combination of comorbidity and genetic similarity prioritizes candidate drugs for cancers with higher probability of success. (A) Odds ratio of success for candidate drugs supported by comorbidity or comorbidity and genetic similarity between Mendelian disease and cancer. Information about genetic similarity comes from Melamed et al. [16]. Black bars represent the 95% confidence intervals. (B) Number of comorbid and genetically similar pairs among 60 Mendelian diseases and 10 cancers. Genetic similarity was estimated using two metrics established here: gene overlap and co-expression. (C) Percentage of candidate drug-cancer pairs that are currently investigated or indicated, across different levels of support.

Discussion

Previous studies have suggested that Mendelian disease genes pleiotropically contribute to the development of complex diseases, resulting in significantly increased risk of the complex disease in individuals with the Mendelian disease [15, 16]. However, this insight has not been harnessed for drug discovery. Here, we have shown that comorbidity between Mendelian and complex diseases can recommend candidate drugs for the complex diseases. Importantly, these candidate drugs are more likely to be in advanced drug development phases or have received regulatory approval, suggesting that Mendelian disease comorbidity can be used to prioritize drugs with high potential of eventual approval.

Our findings provide a novel way to leverage the well-known biology of Mendelian diseases to enhance the treatment of complex diseases. For instance, verapamil, an approved calcium channel inhibitor for the treatment of angina [25], is among our recommended candidate drugs for Type 1 Diabetes (T1D). This recommendation is supported by the comorbidities of T1D with Long QT Syndrome (CACNA1C) and Spinocerebellar Ataxia (CACNA1A). Studies in mice have previously demonstrated verapamil’s potential to promote the survival of insulin-producing β-cells and reverse T1D [26]. Notably, verapamil has recently been tested in a phase III clinical trial for T1D treatment [27]. Additionally, we recommend carbamazepine, an approved sodium channel inhibitor for the control of seizures [28], as a candidate drug for the treatment of T1D based on its comorbidities with Long QT Syndrome (SCN5A) and Erythromelalgia (SCN9A). This recommendation is further supported by preclinical studies showing that inhibition of sodium channels increases the expression of INS1 and INS2 and thus protects from the development of T1D [29,30,31]. Looking ahead, we suggest that future clinical trials consider testing the efficacy of this drug category for preventing T1D.

We also present an approach for identifying a subset of Mendelian diseases with the most utility for drug discovery. In general, Mendelian diseases are enriched for drugged genes [14], but some Mendelian diseases appear to be targeted by even more drugs than the average. Focusing on both the Mendelian disease and gene level, we find that diseases associated with highly drugged genes hold greater promise for future drug discovery efforts.

In addition, building on previous work that prioritizes drugs functionally related to disease genes [16], we explore genetic similarity as an additional way to identify diseases sharing likely pleiotropic causal genes. In an analysis of ten cancers, we find that candidate drugs supported by both comorbidity and genetic similarity between a Mendelian disease and a cancer have a greater probability of success. By combining two independent sources of evidence for shared disease etiology, future research can use Mendelian disease genes to prioritize new drug uses.

Our work has some limitations. First, we could not compile a complete list of investigated drugs for the 65 complex diseases due to annotation inconsistencies. Second, the list of genes causally associated with a Mendelian disease might be incomplete due to the rarity of the disease. Third, comorbidity may not always be due to pleiotropic effects of the Mendelian disease genes on the development of the complex diseases; it can also be due to indirect or interaction effects. Similarly, lack of measurable comorbidity between a pair of diseases does not definitively mean an absence of shared pathological processes, but could be due to disease frequency or interaction effects. Finally, we find that the enrichment of candidate drugs for success varies across disease categories: our results were not significantly predictive for cardiovascular, ophthalmological, and hormonal diseases. This may be because we were not able to test a diverse set of diseases in these categories, leading to reduced statistical power.

Conclusion

In conclusion, we leverage the well-known biology of Mendelian diseases to improve treatment of common diseases. To our knowledge, this is the first study that suggests the use of clinical associations of Mendelian diseases to inform drug discovery. Future work can both exploit the drugs we suggest for each disease and explore Mendelian disease genes currently lacking drugs as novel drug targets. In fact, according to Finan et al. [32], almost one fourth (24.4%) of the undrugged Mendelian genes have high druggability potential. Additionally, disease comorbidity might improve other drug repurposing efforts when considered as an additional source of evidence for prioritizing drug repurposing candidates. Finally, future efforts can also build on this idea by investigating whether clinical associations between common diseases can expand the use of existing drugs.

Data availability

Code and publicly available data for reproducing the results and figures presented in this paper are stored in this GitHub repository: https://github.com/lalagkaspn/mendelian_comorbidity_therapeutics.git. Note that certain datasets and tools, such as DrugBank and the UMLS API, require a user license. In such cases, links to download the necessary files from the original sources, after obtaining the relevant license, are provided within the scripts.

References

  1. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ. 2016;47:20–33.

  2. Mohs RC, Greig NH. Drug discovery and development: role of basic biological research. Alzheimers Dement Transl Res Clin Interv. 2017;3(4):651–7.

  3. Wong CH, Siah KW, Lo AW. Corrigendum: Estimation of clinical trial success rates and related parameters. Biostatistics. 2019;20(2):366.

  4. Reay WR, Cairns MJ. Advancing the use of genome-wide association studies for drug repurposing. Nat Rev Genet. 2021;22(10):658–71.

  5. Duerr RH, Taylor KD, Brant SR, Rioux JD, Silverberg MS, Daly MJ, et al. A genome-wide Association Study identifies IL23R as an inflammatory bowel Disease Gene. Science. 2006;314(5804):1461–3.

  6. Khanna R, Chande N, Vermeire S, Sandborn WJ, Parker CE, Feagan BG. The Next Wave of Biological agents for the treatment of IBD: evidence from Cochrane Reviews. Inflamm Bowel Dis. 2016;22(7):1737–43.

  7. European Medicines Agency. Stelara - ustekinumab [Internet]. European Medicines Agency; [cited 2023 Jul 13]. https://www.ema.europa.eu/en/medicines/human/EPAR/stelara

  8. Feagan BG, Sandborn WJ, Gasink C, Jacobstein D, Lang Y, Friedman JR, et al. Ustekinumab as induction and maintenance therapy for Crohn’s Disease. N Engl J Med. 2016;375(20):1946–60.

  9. Johnson & Johnson. FDA Approves STELARA® (Ustekinumab) for Treatment of Adults With Moderately to Severely Active Crohn’s Disease [Internet]. [cited 2023 Jul 13]. https://www.jnj.com/media-center/press-releases/fda-approves-stelara-ustekinumab-for-treatment-of-adults-with-moderately-to-severely-active-crohns-disease

  10. Moschen AR, Tilg H, Raine T. IL-12, IL-23 and IL-17 in IBD: immunobiology and therapeutic targeting. Nat Rev Gastroenterol Hepatol. 2019;16(3):185–96.

  11. Nelson MR, Tipney H, Painter JL, Shen J, Nicoletti P, Shen Y, et al. The support of human genetic evidence for approved drug indications. Nat Genet. 2015;47(8):856–60.

  12. King EA, Davis JW, Degner JF. Are drug targets with genetic support twice as likely to be approved? Revised estimates of the impact of genetic support for drug mechanisms on the probability of drug approval. Marchini J, editor. PLOS Genet. 2019;15(12):e1008489.

  13. Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19(8):491–504.

  14. Brinkman RR, Dubé MP, Rouleau GA, Orr AC, Samuels ME. Human monogenic disorders — a source of novel drug targets. Nat Rev Genet. 2006;7(4):249–60.

  15. Blair DR, Lyttle CS, Mortensen JM, Bearden CF, Jensen AB, Khiabanian H, et al. A Nondegenerate Code of deleterious variants in mendelian loci contributes to Complex Disease Risk. Cell. 2013;155(1):70–80.

  16. Melamed RD, Emmett KJ, Madubata C, Rzhetsky A, Rabadan R. Genetic similarity between cancers and comorbid mendelian diseases identifies candidate driver genes. Nat Commun. 2015;6(1):7033.

  17. Bartsch O, Schmidt S, Richter M, Morlot S, Seemanová E, Wiebe G, et al. DNA sequencing of CREBBP demonstrates mutations in 56% of patients with Rubinstein–taybi syndrome (RSTS) and in another patient with incomplete RSTS. Hum Genet. 2005;117(5):485–93.

  18. Miller RW, Rubinstein JH. Tumors in Rubinstein-Taybi syndrome. Am J Med Genet. 1995;56(1):112–5.

  19. Reddy A, Zhang J, Davis NS, Moffitt AB, Love CL, Waldrop A, et al. Genetic and functional drivers of diffuse large B cell lymphoma. Cell. 2017;171(2):481–494.e15.

  20. Tasneem A, Aberle L, Ananth H, Chakraborty S, Chiswell K, McCourt BJ, et al. The Database for Aggregate Analysis of ClinicalTrials.gov (AACT) and Subsequent Regrouping by Clinical Specialty. Gagnier JJ, editor. PLoS ONE. 2012;7(3):e33677.

  21. Brown AS, Patel CJ. A standard database for drug repositioning. Sci Data. 2017;4(1):170029.

  22. DrugBank. Androgen receptor drugs [Internet]. DrugBank; [cited 2023 Jul 13]. https://go.drugbank.com/bio_entities/BE0000132

  23. MacNamara A, Nakic N, Amin Al Olama A, Guo C, Sieber KB, Hurle MR, et al. Network and pathway expansion of genetic disease associations identifies successful drug targets. Sci Rep. 2020;10(1):20970.

  24. Cheng F, Desai RJ, Handy DE, Wang R, Schneeweiss S, Barabási AL, et al. Network-based approach to prediction and population-based validation of in silico drug repurposing. Nat Commun. 2018;9(1):2691.

  25. Frishman WH. Verapamil in treatment of chronic stable angina. Arch Intern Med. 1983;143(7):1407.

  26. Xu G, Chen J, Jing G, Shalev A. Preventing β-Cell loss and diabetes with Calcium Channel blockers. Diabetes. 2012;61(4):848–56.

  27. Forlenza GP, McVean J, Beck RW, Bauza C, Bailey R, Buckingham B, et al. Effect of Verapamil on pancreatic Beta cell function in newly diagnosed Pediatric Type 1 diabetes: a Randomized Clinical Trial. JAMA. 2023;329(12):990.

  28. Beydoun A, DuPont S, Zhou D, Matta M, Nagire V, Lagae L. Current role of carbamazepine and oxcarbazepine in the management of epilepsy. Seizure. 2020;83:251–63.

  29. Lee JTC, Shanina I, Chu YN, Horwitz MS, Johnson JD. Carbamazepine, a beta-cell protecting drug, reduces type 1 diabetes incidence in NOD mice. Sci Rep. 2018;8(1):4588.

  30. Szabat M, Modi H, Ramracheya R, Girbinger V, Chan F, Lee JTC, et al. High-content screening identifies a role for Na+ channels in insulin production. R Soc Open Sci. 2015;2(12):150306.

  31. Overby P, Provenzano S, Nahirney NS, Dai XQ, Sun WG, Xia YH, et al. Pharmacological or genetic inhibition of Scn9a protects beta-cells while reducing insulin secretion in type 1 diabetes [Internet]. Physiology; 2023 Jun [cited 2023 Dec 17]. https://doi.org/10.1101/2023.06.11.544521

  32. Finan C, Gaulton A, Kruger FA, Lumbers RT, Shah T, Engmann J, et al. The druggable genome and support for target identification and validation in drug development. Sci Transl Med. 2017;9(383):eaag1166.

Acknowledgements

Not applicable.

Funding

This work was supported by the National Institute of General Medical Sciences (NIGMS R35 GM151001-01) and the National Institute of Environmental Health Sciences (NIEHS K01 ES028055-05) with grants to RDM.

Author information

Contributions

Conceptualization: RDM. Data curation and formal analysis: PNL. Writing original draft, review, and editing: PNL, RDM. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Rachel D. Melamed.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1. Table S1. Recommended candidate drugs for repurposing for each complex disease based on its comorbidity with Mendelian diseases.

Supplementary Material 2. Table S2. Ranked list of recommended candidate drugs for repurposing for each complex disease.

Supplementary Material 3. Table S3. Genetic similarity measurements for all Mendelian disease - cancer pairs, including comorbid and non-comorbid pairs.

Supplementary Material 4. Table S4. Recommended candidate drugs for repurposing for 10 cancers supported by different levels of comorbidity and genetic similarity between cancers and Mendelian diseases.

Supplementary Material 5

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Lalagkas, P.N., Melamed, R.D. Shared etiology of Mendelian and complex disease supports drug discovery. BMC Med Genomics 17, 228 (2024). https://doi.org/10.1186/s12920-024-01988-3

  • DOI: https://doi.org/10.1186/s12920-024-01988-3

Keywords