
Inferring drug-disease associations based on known protein complexes


Inferring drug-disease associations is critical for unveiling disease mechanisms and for discovering novel functions of available drugs, i.e. drug repositioning. Previous work is primarily based on drug-gene-disease relationships, which discards much important information because genes execute their functions by interacting with other genes. To overcome this issue, we propose a novel methodology that discovers drug-disease associations based on protein complexes. First, an integrated heterogeneous network consisting of drugs, protein complexes, and diseases is constructed, in which we assign weights to the drug-disease associations using probabilities. Then, from this tripartite network, we derive the indirect weighted relationships between drugs and diseases. The larger the weight, the higher the reliability of the correlation. We apply our method to mental disorders and hypertension, and validate the results against the Comparative Toxicogenomics Database. Our ranked results are directly supported by existing biomedical literature, suggesting that our proposed method obtains higher specificity and sensitivity. The proposed method offers new insight into drug-disease discovery. Our method is publicly available at


Diseases are often caused by congenital disorders or by the expression of abnormal genes, which induce multi-factor-driven alterations and disrupt functional modules [1]. Drugs achieve their therapeutic effect by changing downstream processes of their targets, which counteract the alterations caused by the abnormal genes. Drug development is expensive, time-consuming and carries a high risk of failure. By conservative estimates, it now takes ~15 years [2] and $800 to $1,000 million to bring a single drug to market [3]. This situation hampers the pharmaceutical industry's search for innovative strategies against currently incurable diseases. Drug repositioning (or drug repurposing) attempts to find previously unknown targets for drugs already established on the market or currently in advanced development stages. Several examples have shown that such repositioning can be very successful (one example is sildenafil, also known as Viagra) [4]. Therefore, more and more research is focusing on inferring drug-disease associations by computational methods.

Several network-based methods have been studied to infer the relationships between drugs and diseases (for a review, see [5]). Matteo indicated that the combination of bipartite network projections, weighted integration of different pharmacological spaces and kernelized score functions with random walk kernels plays a key role in significantly improving drug ranking results with respect to DrugBank therapeutic categories [6]. Cheng [7] integrated three networks (chemical, gene and disease) to infer chemical hazard profiles, identify exposure data gaps, and incorporate gene and disease networks into chemical safety evaluations. Lee established PharmDB, an integrated tripartite database coupled with the Shared Neighborhood Scoring (SNS) algorithm, to find new indications of known drugs [8]. Increasing evidence from genetics and molecular biology shows that protein complexes and pathways are affected not by a single gene but by a group of interacting genes underlying similar diseases, which points to the therapeutic importance of those modules [9]. Therefore, it is of great importance to investigate how drugs and disease phenotypes are associated on the basis of gene modules [10]. In 2004, different tumor types were tentatively characterized by predefined gene modules using gene expression data [11]. Wong et al. defined a module map to connect gene modules with human cancers, which was shown to guide new disease therapies [12]. PREDICT is based on the observation that similar drugs are indicated for similar diseases, and utilizes multiple drug-drug and disease-disease similarity measures for the prediction task [13]. It allows easy integration of additional similarity measures among diseases and drugs. In 2012, Daminelli constructed a drug-target-disease network and extracted the bi-cliques in which every drug is linked to every target and disease [14]. This method can reposition drugs and predict a drug's off-targets simultaneously. Ye integrated known drug target information and proposed a disease-oriented strategy for evaluating the relationships between drugs and specific diseases based on their pathway profiles [15]. Zhao et al. developed a Bayesian partition method to discover drug-gene-disease co-modules. Such a co-module approach offers a systematic and holistic view for studying drug-disease relationships and their molecular basis [16]. A huge amount of chemical, genomic and disease phenotype data is rapidly accumulating, but drug-disease associations are still not clear.

Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases [17]. Therefore, based on the known complexes in CORUM database, we design a method to infer drug-complex-disease phenotype relationships using a network model, where protein complexes are related to not only drugs but also to the disease phenotype.

In our study, based on a symmetrical conditional probability model, we construct a weighted tripartite hetero-network of drugs, protein complexes, and diseases. From this drug-complex-disease tripartite network, we obtain indirect weighted relationships between drugs and diseases, which form a bipartite hetero-network. A drug that is highly correlated with a set of complexes receives a higher closeness score with a disease that is also highly related to the same set of complexes. We rank the associations between drugs and diseases in descending order of edge weight in the drug-disease network. The larger the weight of an association, the greater its reliability, and thus the more likely the drug is related to the disease. We select mental disorders and hypertension as our test data. We use both the curated and the inferred drug-disease associations from the Comparative Toxicogenomics Database (CTD) [18] as our benchmark. Our ranked results show that our proposed method obtains higher specificity and sensitivity. Our approach offers a promising perspective for investigating drug-disease associations and provides computational evidence for revealing their mechanistic basis.

Materials and methods

The integrated network, comprising three heterogeneous types of data (drugs, diseases and protein complexes), is illustrated in Figure 1.

Figure 1

Overview of our proposed method. First, we construct a drug-complex network. If the target set of a drug has at least one protein in common with a complex, there is an edge between the drug and the complex. Then, we construct a complex-disease network. There is an edge between a complex and a disease if at least one protein of the complex is also a protein related to the disease. In this way, we obtain a drug-complex-disease tripartite network. Based on the tripartite network, we extract the associations between drugs and diseases. If a drug and a disease have at least one common protein complex neighbor, there is a connection between them.


Data sources

Drug data

The DrugBank database combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information [19, 20]. We collect FDA-approved drugs in the latest release of DrugBank database (version 4.0) [21].

Protein complexes data

The CORUM database is a comprehensive resource of manually annotated protein complexes from mammalian organisms. All the information is obtained from individual experiments published in scientific articles, and data from high-throughput experiments are excluded. We download the all Complexes set from CORUM [17] (February 2012 release).

Disease data

The disease data are downloaded from FunDO [22]. FunDO takes a list of genes and finds relevant diseases based on statistical analysis of the Disease Ontology annotation database [23].

Protein-protein interaction network

We obtain relationships between genes (or equivalently, proteins) as demonstrated by Liu et al. [24]. The final binary protein-protein interaction (PPI) network contains 7,533 nodes and 22,345 edges. Genes are identified by their NCBI gene IDs. We use the PPI network to filter the predicted drug-disease associations. If a drug and a disease are associated with two different genes in the same complex, and there is a direct connection between the two genes in the PPI network, we keep the association; otherwise we discard it.
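The filtering rule above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `passes_ppi_filter`, the data layout (sets of gene IDs and a set of `frozenset` pairs for the PPI edges) and the toy gene IDs are all assumptions made for the example.

```python
from itertools import product


def passes_ppi_filter(drug_genes, disease_genes, complex_genes, ppi_edges):
    """Return True if a drug gene and a *different* disease gene, both
    members of the same complex, are directly linked in the PPI network.

    ppi_edges is a set of frozenset({gene_a, gene_b}) pairs (hypothetical layout).
    """
    drug_in_complex = drug_genes & complex_genes
    disease_in_complex = disease_genes & complex_genes
    for a, b in product(drug_in_complex, disease_in_complex):
        if a != b and frozenset((a, b)) in ppi_edges:
            return True
    return False


# Toy data with made-up gene IDs
ppi_edges = {frozenset((1, 2)), frozenset((2, 3))}
complex_genes = {1, 2, 4}

kept = passes_ppi_filter({1}, {2}, complex_genes, ppi_edges)       # genes 1 and 2 interact
discarded = passes_ppi_filter({1}, {4}, complex_genes, ppi_edges)  # no PPI edge between 1 and 4
print(kept, discarded)
```

Storing each undirected PPI edge as a `frozenset` keeps the lookup independent of the order in which the two genes are listed.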

Benchmark of drug-disease associations

We extract all the known associations between chemicals (or equivalently, drugs) and disorders or their descendants from the Comparative Toxicogenomics Database (CTD) in May 2014 as our benchmark [25]. CTD contains curated and inferred chemical-disease associations. Curated chemical-disease associations are extracted from the published literature by CTD biocurators. Inferred associations are established via CTD-curated chemical-gene interactions. In our research, the curated and inferred associations are identified separately; they can help researchers develop hypotheses about environmental diseases and their underlying mechanisms.

Functional enrichment analysis

To evaluate our method further, we perform functional enrichment analysis using DAVID [26, 27] on the target sets of predicted drugs. With the target genes as input, we examine gene-disease associations and the enriched KEGG pathways of the related biological processes. With the Benjamini multiple testing correction method [28], the enrichment p-values are corrected to keep the false discovery rate below a chosen threshold (e.g. ≤ 0.05).
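As an illustration of the correction step, the sketch below adjusts a list of enrichment p-values and keeps the terms whose adjusted value is at most 0.05. It assumes that the correction is the Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the toy p-values are hypothetical and are not taken from the DAVID output.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone (each is the minimum of itself and all larger ranks).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy enrichment p-values (made up for the example)
pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(pvals)
significant = [p for p, q in zip(pvals, adj) if q <= 0.05]
print(adj)
print(significant)
```

In this toy input the third and fourth terms have raw p-values below 0.05 but adjusted values slightly above it, so only the first two terms are retained.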


Weighted network construction

To construct a weighted tripartite network of drugs, protein complexes, and diseases, we map the UniProt ID of each drug target to the Entrez gene ID. We obtain a list of gene targets for each drug. There are 6,039 relations between 1,481 drugs and 1,583 targets (additional file 1). We collect the list of protein subunits for each complex in the all Complexes set, which are referenced by their Entrez IDs (additional file 2). The same operation is conducted for all genes related to diseases, resulting in a list of Entrez gene identifiers for each disease (additional file 3). The relations between drugs, protein complexes, and diseases can be represented as a tripartite network, which can be expressed as:

$$G_{TPD} = (T, P, D, E_T, E_D) \tag{1}$$

$T$, $P$, and $D$ are finite sets of drugs, protein complexes, and diseases; $E_T$ and $E_D$ denote the two types of undirected links in the network: drug-complex and complex-disease. The relevance between drug $t_i$ ($t_i \in T$, $i = 1, \ldots, |T|$) and complex $p_j$ ($p_j \in P$, $j = 1, \ldots, |P|$), $w_T(t_i, p_j)$, is calculated by symmetrical conditional probability, as in equation (2).

$$w_T(t_i, p_j) = \mathrm{pro}(t_i \mid p_j)\,\mathrm{pro}(p_j \mid t_i) \tag{2}$$

Equation (2) indicates that the relevance between $t_i$ and $p_j$ is determined jointly by their conditional probabilities on each other.

Suppose that $g(t_i, p_j)$ denotes the number of elements shared by the target set of the drug $t_i$ and the complex set $p_j$, and that $g(t_i)$ and $g(p_j)$ stand for the number of targets of the drug $t_i$ and the number of proteins in complex $p_j$, respectively. Accordingly, equation (2) can be expressed as:

$$w_T(t_i, p_j) = \frac{g(t_i, p_j)}{g(t_i)} \cdot \frac{g(t_i, p_j)}{g(p_j)} \tag{3}$$

Similarly, we can obtain the weight $w_D(p_i, d_j)$ ($p_i \in P$, $d_j \in D$, $i = 1, \ldots, |P|$, $j = 1, \ldots, |D|$) of the links between complexes and diseases. $(p_i, d_j) \in E_D$ if at least one protein of the complex $p_i$ is also a protein related to the disease $d_j$.
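The symmetrical conditional probability weight can be sketched in a few lines. This is an illustrative example under stated assumptions, not the authors' code: the helper name `scp_weight`, the dictionaries `drug_targets` and `complexes`, and the toy gene IDs are hypothetical. Gene sets are represented as Python sets of Entrez-style integer IDs.

```python
def scp_weight(set_a, set_b):
    """Symmetrical conditional probability between two gene sets:
    (|A & B| / |A|) * (|A & B| / |B|), or 0.0 when nothing is shared."""
    shared = len(set_a & set_b)
    if shared == 0:
        return 0.0
    return (shared / len(set_a)) * (shared / len(set_b))


# Toy data (made-up identifiers)
drug_targets = {"D1": {1, 2, 3, 4}}
complexes = {"C1": {1, 2}, "C2": {5, 6}}

# Build the weighted drug-complex edge list; an edge exists only when
# the drug's target set and the complex share at least one protein.
drug_complex_w = {}
for drug, targets in drug_targets.items():
    for cx, subunits in complexes.items():
        w = scp_weight(targets, subunits)
        if w > 0:
            drug_complex_w[(drug, cx)] = w

print(drug_complex_w)
```

Here drug D1 shares two of its four targets with the two-subunit complex C1, giving (2/4) × (2/2) = 0.5, and no edge to C2. The same helper can be applied to complex subunits and disease gene sets to obtain the complex-disease weights.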

Derivative Network

To identify drug-disease associations, a derived drug-disease network is extracted from the tripartite network. A bipartite network $G_{TD} = (T, D, E_{TD})$ is used to represent these associations, where $T$ and $D$ are the finite sets of drugs and diseases, respectively, and $E_{TD}$ denotes the undirected links between drugs and diseases. A drug-disease interaction exists if and only if the following two constraints are met simultaneously: i) the drug and the disease have at least one common protein complex neighbor in the $G_{TPD}$ network; ii) at least one protein target of the drug is also a subunit of that protein complex. Specifically, it is defined as

$$E_{TD} = \{ (t, d) \mid (\exists\, p \in P)\,((t, p) \in E_T \wedge (p, d) \in E_D) \wedge t \in T \wedge d \in D \} \tag{4}$$

where $P$ is the set of protein complexes. For each edge $(t, d) \in E_{TD}$, its weight $w_{TD}(t, d)$ can be calculated by equation (5):

$$w_{TD}(t, d) = \frac{g_T(t, C)}{g(t)} \cdot \frac{g_D(d, C)}{g(d)} \tag{5}$$

Suppose $C$ represents the set of protein complexes that both drug $t$ and disease $d$ connect to in the $G_{TPD}$ network, then:

$$C = \{ p \mid p \in P \wedge (t, p) \in E_T \wedge (p, d) \in E_D \wedge t \in T \wedge d \in D \}$$

$g_T(t, C)$ represents the sum of edge weights between drug $t$ and the protein complexes in set $C$. The formulas of $g_T(t, C)$ and $g_D(d, C)$ are given as follows:

$$g_T(t, C) = \sum_{p \in C} w_T(t, p)$$
$$g_D(d, C) = \sum_{p \in C} w_D(p, d)$$

$g(t)$ and $g(d)$ in equation (5) respectively indicate the sum of edge weights between drug $t$, disease $d$ and the protein complexes in set $P$. Therefore:

$$g(t) = \sum_{p \in P} w_T(t, p)$$
$$g(d) = \sum_{p \in P} w_D(p, d)$$

If drug t' and disease d' cannot be connected by common complex neighbors, but at least one protein target of drug t' is also a protein related to disease d', a connection will be created between t' and d'. Similarly, the weight of edge (t', d') can be calculated by equation (3).
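The derived drug-disease weight of equation (5) can be sketched as below. The code is a minimal illustration under assumptions: the function name `drug_disease_weight`, the edge dictionaries `w_t` and `w_d` and the toy weights are hypothetical, and the fallback case for pairs without a common complex neighbor is not covered.

```python
def drug_disease_weight(drug, disease, w_t, w_d):
    """Weight of a derived drug-disease edge.

    w_t: {(drug, complex): weight}    drug-complex edges
    w_d: {(complex, disease): weight} complex-disease edges
    Returns 0.0 when the drug and the disease share no complex.
    """
    drug_cx = {c: w for (t, c), w in w_t.items() if t == drug}
    dis_cx = {c: w for (c, d), w in w_d.items() if d == disease}
    common = drug_cx.keys() & dis_cx.keys()  # the set C of shared complexes
    if not common:
        return 0.0
    g_t_c = sum(drug_cx[c] for c in common)  # g_T(t, C)
    g_d_c = sum(dis_cx[c] for c in common)   # g_D(d, C)
    # Normalise by the total edge weight of the drug and of the disease.
    return (g_t_c / sum(drug_cx.values())) * (g_d_c / sum(dis_cx.values()))


# Toy tripartite network (made-up identifiers and weights)
w_t = {("D1", "C1"): 0.5, ("D1", "C2"): 0.5}
w_d = {("C1", "X"): 0.25, ("C3", "X"): 0.75}

score = drug_disease_weight("D1", "X", w_t, w_d)
print(score)
```

In this toy case only C1 is shared, so the drug contributes 0.5/1.0 and the disease 0.25/1.0, giving 0.125. The weight reaches 1 only when every complex neighbor of the drug and of the disease is shared.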

Network conversions

In order to verify the predicted drug-disease correlations by modularity, we first need to convert $G_{TD}$ into two networks. Each converted network is composed of a single type of node. The bipartite network for drugs and diseases $G_{TD}$ is converted into two independent networks, which are denoted by $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. $G_1$ and $G_2$ are the drug and the disease networks, respectively. In $G_1$, nodes of $V_1$ are connected if they have at least one common neighbor (in $D$) in $G_{TD}$. The set of edges $E_1$ can be defined as:

$$E_1 = \{ (t, t') \mid (\exists\, d \in D)\,((t, d) \in E_{TD} \wedge (t', d) \in E_{TD} \wedge t \neq t') \}$$

The set of edges $E_2$ is defined similarly. The weight of edge $(t, t') \in E_1$, $w(t, t')$, is defined as:

$$w(t, t') = \sum_{d \in D} \min\left( w_{TD}(t, d),\, w_{TD}(t', d) \right)$$

Edge weights in G2 have a similar definition. Therefore, we get two weighted networks: a drug-drug network and a disease-disease network.
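The projection onto a drug-drug network can be sketched as follows. This is an illustrative example, not the authors' implementation; the function name `project_drugs`, the edge dictionary layout and the toy weights are assumptions. Diseases that are not linked to both drugs contribute nothing to the sum.

```python
from collections import defaultdict
from itertools import combinations


def project_drugs(w_td):
    """Project a weighted drug-disease network onto drugs.

    w_td: {(drug, disease): weight}
    Returns {(drug_a, drug_b): weight} with drug_a < drug_b, where the
    weight sums min(w(drug_a, d), w(drug_b, d)) over shared diseases d.
    """
    by_disease = defaultdict(dict)
    for (drug, disease), w in w_td.items():
        by_disease[disease][drug] = w

    projected = defaultdict(float)
    for drugs in by_disease.values():
        for a, b in combinations(sorted(drugs), 2):
            projected[(a, b)] += min(drugs[a], drugs[b])
    return dict(projected)


# Toy drug-disease edges (made-up identifiers and weights)
w_td = {
    ("D1", "X"): 0.8, ("D2", "X"): 0.6,
    ("D1", "Y"): 0.5, ("D2", "Y"): 0.9, ("D3", "Y"): 0.7,
}
drug_drug = project_drugs(w_td)
print(drug_drug)
```

Swapping the two elements of each key before calling the function yields the disease-disease projection with the same code.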

Module structure in converted network

We use ClusterONE (Clustering with Overlapping Neighborhood Expansion) [29] to obtain modules in converted networks. ClusterONE is a graph clustering algorithm that is able to handle weighted graphs. Owing to these properties, ClusterONE is especially useful for detecting modules in networks with associated confidence values.


Bipartite network of drugs and diseases

The weighted drug-complex-disease tripartite network consists of two bipartite networks: drug-complex and complex-disease. The drug-complex network contains 1,229 nodes (628 drugs and 601 complexes) and 3,405 weighted edges (additional file 4). The complex-disease network contains 1,932 nodes (1,472 complexes and 460 diseases) and 14,848 weighted edges (additional file 5). The drug-disease bipartite network obtained from the tripartite network includes 1,634 nodes (1,127 drugs and 507 diseases) and 30,722 weighted edges (additional file 6). To improve the reliability of the predicted correlations between drugs and diseases, we first use the PPI network to filter the results and then discard the edges whose weights are lower than 0.50. The final network consists of 353 nodes (231 drugs and 122 diseases) and 594 weighted edges (weight ≥ 0.50) (additional file 7). This is a scale-free network, with a small number of nodes connected to many edges and the majority of nodes connected to few edges (Figure 2).

Figure 2

Bipartite network of drugs and diseases. A drug is connected to a disease if they share at least one complex and the weight of their relationship is not lower than 0.5. Drugs are represented by triangles and diseases by squares. The two node types are also distinguished by color. Every connected subgraph is a module. Drugs and diseases are labeled by their DrugBank identifier and their name in FunDO, respectively.

All network visualizations were produced using the Cytoscape software [30]. Every connected subgraph represents a module, resulting in 29 modules with bipartite structure as shown in Figure 2. Nodes with a large degree can be seen among both drugs and diseases (See Table 1).
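The step from thresholded edges to modules can be sketched as a connected-component search. This is a minimal illustration, not the authors' pipeline: the function name `connected_modules` and the toy identifiers and weights are hypothetical, and the PPI filtering step that precedes the threshold is omitted.

```python
from collections import defaultdict


def connected_modules(edges):
    """Group nodes into connected components using an iterative depth-first search."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, modules = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        modules.append(component)
    return modules


# Toy weighted drug-disease edges (made-up identifiers and weights)
weighted_edges = [
    ("drug_1", "disease_A", 0.9),
    ("drug_2", "disease_A", 0.6),
    ("drug_3", "disease_B", 0.7),
    ("drug_4", "disease_B", 0.3),
]
kept = [(t, d) for t, d, w in weighted_edges if w >= 0.5]  # weight threshold
modules = connected_modules(kept)
print(len(modules), modules)
```

Each returned set mixes drug and disease nodes, which matches the bipartite structure of the modules described above.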

Table 1 Top diseases and drugs with a large degree in the bipartite drug-disease network

Table 1 shows the number of edges directly related to the hubs (column: Number of directed edges) and the sum of the weights on these edges (column: Sum of weight on edges). We find that the sum of edge weights may reflect the role of a node in the network more accurately than its degree. For example, cystic fibrosis has more direct neighbors than primary biliary cirrhosis in the bipartite network, but the correlation between the drugs and primary biliary cirrhosis is greater than that between the drugs and cystic fibrosis. In Table 1 the most connected disease is mental disorders (synonym: behavior disease), defined as a mental or behavioral pattern or anomaly that causes either suffering or an impaired ability to function in ordinary life (disability). The most connected drug is anti-thymocyte globulin (ATG), an infusion of horse- or rabbit-derived antibodies against human T cells, which is used in the prevention and treatment of acute rejection in organ transplantation and in the therapy of aplastic anemia.
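The difference between the two columns of Table 1 can be illustrated with a short sketch that computes, for every node, its degree and the sum of its edge weights. The function name `degree_and_strength` and the toy edges are hypothetical and do not reproduce the values in Table 1.

```python
from collections import defaultdict


def degree_and_strength(w_td):
    """Return (degree, strength) dictionaries for all nodes.

    w_td: {(drug, disease): weight}
    degree counts incident edges; strength sums their weights.
    """
    degree = defaultdict(int)
    strength = defaultdict(float)
    for (drug, disease), w in w_td.items():
        for node in (drug, disease):
            degree[node] += 1
            strength[node] += w
    return dict(degree), dict(strength)


# Toy edges: disease_A has more neighbors, disease_B has heavier edges
w_td = {
    ("drug_1", "disease_A"): 0.5,
    ("drug_2", "disease_A"): 0.5,
    ("drug_3", "disease_A"): 0.5,
    ("drug_4", "disease_B"): 0.9,
    ("drug_5", "disease_B"): 0.9,
}
degree, strength = degree_and_strength(w_td)
print(degree["disease_A"], strength["disease_A"])
print(degree["disease_B"], strength["disease_B"])
```

In this toy network disease_A ranks first by degree while disease_B ranks first by summed weight, the same kind of reversal described for cystic fibrosis and primary biliary cirrhosis.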

Case study: Mental Disorders

Potential drugs and Mental Disorders relations

Mental disorders are one aspect of mental health [31] and are generally defined by a combination of how a person feels, acts, thinks and perceives. This may be associated with particular regions or functions of the brain, or any part of the nervous system, often in a social context. In total, 226 drug-mental disorder relations are found in our candidate sets (additional file 8). To improve the accuracy of the prediction, an association is not considered if its weight is below 0.5. In our experiments, a threshold of 0.5 retains more true correlations while avoiding too many false positives. Finally, 51 drug-mental disorder correlations are obtained (see Table 2).

Table 2 Drug-mental disorders associations (weight ≥ 0.5)

Since the predictions are merely assumptions, we need to examine them further using external literature support: 40 known associations agree with the benchmark (CTD), and 9 predicted associations are supported by the literature (in bold italic). We find that the 9 predicted drugs may be effective for the treatment of mental disorders. For example, vilazodone [32] (ID = 30) is approved for the treatment of acute episodes of major depression. Major depressive disorder (MDD) is a mental disorder characterized by a pervasive and persistent low mood that is accompanied by low self-esteem and by a loss of interest or pleasure in normally enjoyable activities.

Pipotiazine (ID = 31) is a typical antipsychotic of the phenothiazine class [33] used in the United Kingdom and other countries for the treatment of schizophrenia. Thioproperazine (ID = 35) is an antipsychotic. Antipsychotics [34] are a class of psychiatric medication primarily used to manage psychosis, in and concentration [35, 36]. Certain mental health problems, such as depression and disturbances, including hallucinations, delusions and paranoia, are possible complications of Parkinson's disease and/or its treatment. Rotigotine (ID = 43) is used for the treatment of neurologic disorders and Parkinson's disease, as well as moderate-to-severe primary Restless Legs Syndrome [37]. Paliperidone (ID = 44) is the major active metabolite of risperidone. It is used for schizophrenia and schizoaffective disorder. For cinitapride (ID = 19) and penbutolol (ID = 45), there is no direct support in the literature. However, we believe that they may be effective in the treatment of mental disorders. Cinitapride is a substituted benzamide with 5-HT receptor antagonist and agonist activity [38]. The 5-HT receptors are the target of a variety of pharmaceutical drugs, including many antidepressants and antipsychotics [39], so cinitapride may be effective in the treatment of mental disorders. Similarly, penbutolol is able to bind both β-1 adrenergic receptors (ARs) and β-2 adrenergic receptors [40], and an interaction between β-1 ARs and testosterone has been shown in anxiolytic behaviors in the basolateral amygdala [41]. The β-2 receptor is also involved in brain-immune communication [42]. Therefore, we infer that penbutolol is highly correlated with mental disorders.

The significant modules related to mental disorders in drug-drug network

Modular structure is one of the emerging properties of complex networks. A module is a set of nodes associated with a specific function. To further validate the effectiveness of our algorithm, we run ClusterONE on the drug-drug network with the Minimum density parameter set to 0.35 and all other parameters left at their default values. We obtain 23 clusters from the drug-drug network (additional file 9), in which nodes represent drugs. All drugs associated with mental disorders fall into two overlapping modules (cluster 1 and cluster 3, i.e. Cluster Label = 1 and Cluster Label = 3 in additional file 9). To analyze drugs associated with mental disorders, we merge these two modules (shown in Figure 3). Diamonds represent drugs in the overlap of cluster 1 and cluster 3. In Figure 3, drugs colored pink have been shown to be associated with mental disorders by the benchmark (CTD). Purple nodes are drugs predicted by our method. They are listed in Table 2, and their correlations with mental disorders are not lower than 0.5 in the drug-disease network. They are closely linked with known drugs (pink nodes), which further supports their high functional similarity with known drugs. That is, the 11 predicted drugs also have a strong association with mental disorders. The 3 green nodes are drugs newly predicted by clustering the drug-drug network. They are also closely connected with known drugs and are supported by the literature. For example, dexmethylphenidate (DB06701) is used as a treatment for Attention Deficit Hyperactivity Disorder (ADHD), ideally in conjunction with psychological, educational, behavioral or other forms of treatment [43]. Levomilnacipran (DB08918) is an antidepressant developed by Forest Laboratories and Pierre Fabre Group for the treatment of depression [44-46]. For ephedra (DB01363), studies have shown that it may cause serious mental illness [47]. Maglione et al. reviewed all 1,820 adverse event reports related to dietary supplements containing herbal ephedra from FDA MedWatch files as of Sept. 30, 2001. Fifty-seven serious psychiatric events were reported. Therefore, clinicians should be aware that serious psychiatric symptoms could be associated with ephedra use.

Figure 3

Drugs associated with mental disorder within the module after merging cluster 1 and cluster 3. Nodes represent drugs. Diamond nodes represent the overlap of cluster 1 and cluster 3. Nodes colored pink represent drugs that have been shown to be associated with mental disorder by the benchmark (CTD). Purple nodes represent drugs predicted by our method. Green nodes are newly predicted drugs related to mental disorder. Drugs are labeled by their DrugBank identifiers.

Functional enrichment analysis on target genes of potential drugs of mental disorder

Functional analysis is performed on the target sets of the eleven drugs that are not supported by CTD (see Table 2, drugs in bold italic and underlined bold). Gene-disease association and KEGG pathway enrichment analyses are carried out on them with the functional annotation tool of DAVID. We find that ten of the target sets are directly associated with mental disorders or diseases of the same type, such as depressive disorder and personality disorders. In addition, the same ten target sets are significantly enriched in a pathway related to mental disorders: neuroactive ligand-receptor interaction. Adkins et al. systematically screened associations between 58 neuroactive ligand-receptor interaction pathways and antipsychotic treatment efficacy using bioinformatics tools [48]. No annotations are obtained from DAVID for the target set of vilazodone (Drug ID = DB06684). We infer that the reason is that the set includes only one gene (HTR1A). In fact, vilazodone is already approved for the treatment of acute episodes of major depression [32].

Case study: Hypertension

Potential drugs and Hypertension relations

Hypertension, also referred to as high blood pressure, is a condition in which the arteries have persistently elevated blood pressure. A blood pressure of 140/90 or above is considered hypertension. Hypertension can lead to damaged organs, as well as several illnesses, such as renal failure (kidney failure), aneurysm, heart failure, stroke, or heart attack [49].

In total, we find 339 drug-hypertension relations in our candidate sets (additional file 10). Of these, 69.3% have a weight below 0.1, and 31 associations have high confidence (weight ≥ 0.5, see Table 3). Among them, 26 known associations agree with the benchmark (CTD). In-depth analysis of the other 5 associations (in bold italic) reveals two types of correlation between diseases and drugs: positive and negative. Positive correlations refer to a positive effect of a drug on a disease, for example when a drug treats the disease. Negative correlations arise, for example, when a drug causes a disease (a side effect of the drug) or worsens it. Both are very important for discovering the causes of a disease and for using drugs safely, so that diseases can be treated more effectively. Using SIDER (Side Effect Resource) [50], we find that asenapine (ID = 1) has the side effect of hypertension [50]. For trimipramine (ID = 29) and paliperidone (ID = 31), although there is no clear evidence that they have the side effect of hypertension, there are some indications that they are likely to lead to high blood pressure [50, 51]. Methysergide (ID = 20) is metabolised into methylergometrine in humans [52]. Adverse effects of methylergometrine include cholinergic effects, pulmonary hypertension, and severe systemic hypertension [53]. The last drug, iloperidone (ID = 30), plays an active role in the treatment of hypertension. Given the alpha1 antagonism of iloperidone, the effect of anti-hypertensive agents would be potentiated when they are administered concomitantly [54]. This suggests that iloperidone has some effect in lowering blood pressure.

Table 3 Drug-hypertension associations (weight ≥ 0.5)

The significant modules related to hypertension in drug-drug network

Of the 23 drug modules, 11 are found to be related to hypertension. Five predicted drugs (purple rectangle nodes: DB06216, DB00247, DB00726, DB04946, DB01267) are in the same cluster (Figure 4). They are listed in Table 3, and their associations with hypertension are not lower than 0.5. The pink circular nodes have been confirmed to be associated with hypertension by CTD. The five predicted drugs interact very frequently with the known drugs. These results further indicate that they are highly correlated with hypertension. In addition, twenty-six nodes in Figure 4 are listed in Table 4. They include two types of drugs: (1) drugs predicted by our method whose association with hypertension is lower than 0.5; (2) new drugs predicted by clustering the drug-drug network. The first sixteen drugs (ID = 1 to ID = 16) were predicted by our method previously. The remaining ten drugs (ID = 17 to ID = 26) are newly predicted by clustering the drug-drug network. They have high accuracy: nine of them are confirmed by the CTD database (Correlation = CTD, see Table 4), and one, ephedra (ID = 17), is supported by the literature [55]. Ephedra-containing products (ECPs), which are most often found in sources of caffeine alkaloids, may be an under-recognized cause of hypertension. Among the previously predicted drugs with lower weights (ID = 1 to ID = 16), seven may cause high blood pressure and are negatively correlated with hypertension (Correlation = N, see Table 4). For milnacipran (ID = 3), for example, researchers presented the case of a patient with major depressive disorder (MDD) who developed hypertension during treatment with regular therapeutic doses of milnacipran [56]. Desvenlafaxine (ID = 6) is similar to venlafaxine, and its use may worsen preexisting hypertension [57]. For the remaining eight drugs, there is no evidence suggesting drug-hypertension relations.
From the results, we derived two indications: 1) as a metric, our definition of weight is reasonable in assessing the credibility of drug-disease correlation - the greater the degree of reliability, the larger the weight, while the smaller the weight, the lower the reliability; 2) combined with modularity in projected network, our method is very effective in predicting drug-disease associations.

Figure 4

Drugs associated with hypertension. Nodes represent drugs. Circular nodes represent drugs that have been shown to be associated with hypertension by the benchmark (CTD). Rectangle and diamond nodes represent our predicted drugs whose relationship with hypertension is higher than 0.5 and lower than 0.5, respectively. Drugs are labeled by their DrugBank identifiers.

Table 4 Correlations of twenty-six Drugs with hypertension

Functional enrichment analysis on target genes of potential drugs of hypertension

There are five drugs predicted by our method but not supported by the benchmark (see Table 3, drugs in bold italic and underlined bold). We perform gene-disease association and KEGG pathway enrichment analyses on their target sets with DAVID. The enrichment results show that three of the target sets are directly associated with hypertension. However, all of them are significantly enriched in hypertension-related pathways, such as gap junction. Notably, the gap junction pathway has been shown to be relevant to hypertension [58].

Comparison with another method

To evaluate the performance of our method, we compare it with a popular web tool, PROMISCUOUS [59]. PROMISCUOUS contains three different types of entities (drugs, proteins and side effects) as well as the relations between them. It is a knowledge-based drug repositioning method, which exploits known interactions between a drug and a target and combines this information with new knowledge about the target's role in a new indication.

We compare our method and PROMISCUOUS on the eleven potential drugs for mental disorders one by one. They are shown in Table 2 (drugs in bold italic and underlined bold). Five of them (pipotiazine, thioproperazine, acepromazine, ergoloid mesylate and paliperidone) are found to be antipsychotic medications by PROMISCUOUS, which is consistent with our prediction. Penbutolol (ID = 45) and bopindolol (ID = 39) are not directly associated with the treatment of mental disorders by PROMISCUOUS. However, for penbutolol, based on the fact that similar drugs often act on the same targets, PROMISCUOUS finds eight similar drugs. One of them is pemoline, which is a kind of antipsychotic drug. Moreover, PROMISCUOUS also finds that penbutolol and bopindolol are related to the KEGG pathway neuroactive ligand-receptor interaction, which has been associated with antipsychotic treatment [48]. Therefore, one can assume that penbutolol and bopindolol may also be effective for the treatment of mental disorders. Because PROMISCUOUS integrates multiple public databases, such as DrugBank, Protein Data Bank, KEGG, UniProt and SIDER, the comparative results support the validity of our algorithm from another angle. The last four drugs (cinitapride, vilazodone, iloperidone and rotigotine) are not found to be closely related to mental disorders by PROMISCUOUS. However, with the exception of cinitapride (ID = 19), the other three drugs are all directly supported by the literature.

A comparison is also made between PROMISCUOUS and our method on the five potential drugs for hypertension (see Table 3, drugs in bold italic and underlined bold). Among the five drugs, PROMISCUOUS finds that methysergide and paliperidone are related to the gap junction pathway, which has been reported to be associated with hypertension [58]. The other three drugs (asenapine, trimipramine, and iloperidone) are not found by PROMISCUOUS. The most likely reason is that these drugs may have hypertension as a side effect, which is also consistent with our inference.


We integrate information on drugs, protein complexes, and diseases from available experimental data and knowledge into a weighted drug-complex-disease tripartite network and derive from it a network of indirect relationships, i.e., a drug-disease bipartite network. One advantage of our model is its relative simplicity: unlike other existing algorithms, it does not first need to construct drug and disease similarity networks. With protein complexes as the bridge, we apply the drug-complex-disease approach to infer and evaluate the likelihood of associations between drugs and diseases. In our experiments, we take mental disorders and hypertension as case studies. The results are encouraging. Both positive and negative associations can be predicted and are supported by existing biomedical literature. The success of our method can be attributed to the following factors: first, we integrate heterogeneous data and knowledge about drugs, protein complexes, and diseases into our model; second, we use symmetric probabilities to model the dependencies between drugs, protein complexes, and diseases; third, our method combines the information derived from other connected heterogeneous networks to infer drug-disease associations. We believe that the integration of networks and heterogeneous data sources will help generate new hypotheses about drug-disease associations and even speed up the drug development process. Our study provides opportunities for future toxicogenomics and drug discovery applications. However, we find that it is difficult to automatically distinguish positive from negative associations between drugs and diseases.
As next steps, we suggest: 1) for commonly used data, such as drugs, targets, protein complexes, and diseases, integrating data sources with higher confidence to improve prediction accuracy; 2) to distinguish positive and negative associations as automatically as possible, integrating data sources that provide information about the side effects of drugs, such as drug side-effect resources, response profiles, pharmacological data, and therapeutic/toxicological expression profiles.
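The derivation of a drug-disease bipartite network from a drug-complex-disease tripartite network can be sketched as follows. This is a minimal illustration and not the weighting scheme used in this work: it assumes that each drug-complex and complex-disease edge already carries a weight in [0, 1], and that an indirect drug-disease weight is the sum, over shared complexes, of the product of the two edge weights. The function names, entity labels, and weights are all hypothetical.

```python
from collections import defaultdict


def derive_drug_disease(drug_complex, complex_disease):
    """Compose two weighted bipartite edge maps through shared complexes.

    drug_complex:    {(drug, complex): weight}
    complex_disease: {(complex, disease): weight}
    Returns {(drug, disease): score}, where the score sums
    weight(drug, complex) * weight(complex, disease) over all complexes
    linking the pair (an assumed combination rule).
    """
    by_complex = defaultdict(list)
    for (cpx, disease), w_cd in complex_disease.items():
        by_complex[cpx].append((disease, w_cd))

    scores = defaultdict(float)
    for (drug, cpx), w_dc in drug_complex.items():
        for disease, w_cd in by_complex.get(cpx, []):
            scores[(drug, disease)] += w_dc * w_cd
    return dict(scores)


def rank_for_disease(scores, disease):
    """Return (drug, score) pairs for one disease, highest score first."""
    hits = [(drug, s) for (drug, d), s in scores.items() if d == disease]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy network with placeholder labels.
    drug_complex = {
        ("drug_A", "complex_1"): 0.5,
        ("drug_A", "complex_2"): 0.2,
        ("drug_B", "complex_2"): 0.4,
    }
    complex_disease = {
        ("complex_1", "disease_X"): 0.6,
        ("complex_2", "disease_X"): 0.5,
    }
    scores = derive_drug_disease(drug_complex, complex_disease)
    for drug, score in rank_for_disease(scores, "disease_X"):
        print(drug, round(score, 3))
```

Under this assumed rule, a drug that reaches a disease through several complexes accumulates a larger weight, which matches the stated intuition that a larger weight indicates a more reliable association. Other combination rules, such as taking the maximum over complexes, would fit the same interface.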


  1. Goh Kwang-Il, Cusick Michael, Valle David, Childs Barton, Vidal Marc, Barabási Albert-László: The human disease network. Proceedings of the National Academy of Sciences. 2007, 104: 8685-8690. 10.1073/pnas.0701361104.

  2. DiMasi Joseph: New drug development in the United States from 1963 to 1999. Clinical pharmacology and therapeutics. 2001, 69: 286-296. 10.1067/mcp.2001.115132.

  3. Adams Christopher, Van Brantner V: Estimating the cost of new drug development: is it really $802 million?. Health Affairs. 2006, 25: 420-428. 10.1377/hlthaff.25.2.420.

  4. von Eichborn Joachim, Murgueitio Manuela, Dunkel Mathias, Koerner Soeren, Bourne Philip, Preissner Robert: PROMISCUOUS: a database for network-based drug-repositioning. Nucleic acids research. 2011, 39: D1060-D1066. 10.1093/nar/gkq1037.

  5. Wu Zikai, Wang Yong, Chen Luonan: Network-based drug repositioning. Molecular BioSystems. 2013, 9: 1268-1281. 10.1039/c3mb25382a.

  6. Re Matteo, Valentini Giorgio: Network-based drug ranking and repositioning with respect to DrugBank therapeutic categories. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB). 2013, 10: 1359-1371.

  7. Cheng Feixiong, Li Weihua, Zhou Yadi, Li Jie, Shen Jie, Lee Philip, et al: Prediction of human genes and diseases targeted by xenobiotics using predictive toxicogenomic-derived models (PTDMs). Molecular BioSystems. 2013, 9: 1316-1325. 10.1039/c3mb25309k.

  8. Lee Hee, Bae Taejeong, Lee Ji-Hyun, Kim Dae, Oh Young, Jang Yeongjun, et al: Rational drug repositioning guided by an integrated pharmacological network of protein, disease and drug. BMC systems biology. 2012, 6: 80. 10.1186/1752-0509-6-80.

  9. Suthram Silpa, Dudley Joel, Chiang Annie, Chen Rong, Hastie Trevor, Butte Atul: Network-based elucidation of human disease similarities reveals common functional modules enriched for pluripotent drug targets. PLoS computational biology. 2010, 6: e1000662. 10.1371/journal.pcbi.1000662.

  10. Schadt Eric, Friend Stephen, Shaywitz David: A network view of disease and compound screening. Nature reviews Drug discovery. 2009, 8: 286-295. 10.1038/nrd2826.

  11. Segal Eran, Friedman Nir, Koller Daphne, Regev Aviv: A module map showing conditional activity of expression modules in cancer. Nature genetics. 2004, 36: 1090-1098. 10.1038/ng1434.

  12. Wong David, Nuyten Dimitry, Regev Aviv, Lin Meihong, Adler Adam, Segal Eran, et al: Revealing targeted therapy for human cancer by gene module maps. Cancer research. 2008, 68: 369-378. 10.1158/0008-5472.CAN-07-0382.

  13. Gottlieb Assaf, Stein Gideon, Ruppin Eytan, Sharan Roded: PREDICT: a method for inferring novel drug indications with application to personalized medicine. Molecular systems biology. 2011, 7:

  14. Daminelli Simone, Haupt Joachim, Reimann Matthias, Schroeder Michael: Drug repositioning through incomplete bi-cliques in an integrated drug-target-disease network. Integrative Biology. 2012, 4: 778-788. 10.1039/c2ib00154c.

  15. Ye Hao, Yang LinLin, Cao ZhiWei, Tang KaiLin, Li YiXue: A pathway profile-based method for drug repositioning. Chinese Science Bulletin. 2012, 57: 2106-2112. 10.1007/s11434-012-4982-9.

  16. Zhao Shiwen, Li Shao: A Co-module Approach for Elucidating Drug-Disease Associations and Revealing Their Molecular Basis. Bioinformatics. 2012, 28 (7): 955-961. 10.1093/bioinformatics/bts057.

  17. Ruepp Andreas, Waegele Brigitte, Lechner Martin, Brauner Barbara, Dunger-Kaltenbach Irmtraud, Fobo Gisela, et al: CORUM: the comprehensive resource of mammalian protein complexes--2009. Nucleic acids research. 2009, gkp914-

  18. Mattingly CJ, Rosenstein MC, Colby GT, Forrest JN, Boyer JL: The Comparative Toxicogenomics Database (CTD): a resource for comparative toxicological studies. Journal of Experimental Zoology Part A: Comparative Experimental Biology. 2006, 305: 689-692.

  19. Wishart David, Knox Craig, Guo Chi An, Shrivastava Savita, Hassanali Murtaza, Stothard Paul, et al: DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic acids research. 2006, 34: D668-D672. 10.1093/nar/gkj067.

  20. Wishart David, Knox Craig, Guo Chi An, Cheng Dean, Shrivastava Savita, Tzur Dan, et al: DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic acids research. 2008, 36: D901-D906.

  21. Law Vivian, Knox Craig, Djoumbou Yannick, Jewison Tim, Guo Chi An, Liu Yifeng, et al: DrugBank 4.0: shedding new light on drug metabolism. Nucleic acids research. 2014, 42: D1091-D1097. 10.1093/nar/gkt1068.

  22. Osborne John, Flatow Jared, Holko Michelle, Lin Simon, Kibbe Warren, Zhu Lihua, et al: Annotating the human genome with Disease Ontology. BMC genomics. 2009, 10: S6-

  23. Du Pan, Feng Gang, Flatow Jared, Song Jie, Holko Michelle, Kibbe Warren, et al: From disease ontology to disease-ontology lite: statistical methods to adapt a general-purpose ontology for the test of gene-ontology associations. Bioinformatics. 2009, 25: i63-i68. 10.1093/bioinformatics/btp193.

  24. Liu Zhi-Ping, Wang Yong, Zhang Xiang-Sun, Xia Weiming, Chen Luonan: Detecting and analyzing differentially activated pathways in brain regions of Alzheimer's disease patients. Molecular BioSystems. 2011, 7: 1441-1452. 10.1039/c0mb00325e.

  25. Davis Peter Allan, Wiegers Thomas, Roberts Phoebe, King Benjamin, Lay Jean, Lennon-Hopkins Kelley, et al: A CTD-Pfizer collaboration: manual curation of 88 000 scientific articles text mined for drug-disease and drug-phenotype interactions. Database. 2013, 2013: bat080. 10.1093/database/bat080.

  26. Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols. 2008, 4: 44-57. 10.1038/nprot.2008.211.

  27. Huang DW, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research. 2009, 37: 1-13. 10.1093/nar/gkn923.

  28. Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B. 1995, 57: 289-300.

  29. Nepusz Tamás, Yu Haiyuan, Paccanaro Alberto: Detecting overlapping protein complexes in protein-protein interaction networks. Nature methods. 2012, 9: 471-472. 10.1038/nmeth.1938.

  30. Smoot Michael, Ono Keiichiro, Ruscheinski Johannes, Wang Peng-Liang, Ideker Trey: Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011, 27: 431-432. 10.1093/bioinformatics/btq675.

  31. Stein Dan: In Review What Is a Mental Disorder? A Perspective From Cognitive-Affective Science. Canadian Journal of Psychiatry. 2013, 58:

  32. Khan Arif, Cutler Andrew, Kajdasz Daniel, Gallipoli Susan, Athanasiou Maria, Robinson Donald, et al: A randomized, double-blind, placebo-controlled, 8-week study of vilazodone, a serotonergic agent for the treatment of major depressive disorder. The Journal of clinical psychiatry. 2011, 72: 441-447. 10.4088/JCP.10m06596.

  33. Bechelli LP, Ruffino-Netto A, Hetem G: A double-blind controlled trial of pipotiazine, haloperidol and placebo in recently-hospitalized acute schizophrenic patients. Brazilian journal of medical and biological research=Revista brasileira de pesquisas medicas e biologicas/Sociedade Brasileira de Biofisica...[et al.]. 1983, 16: 305-311.

  34. Finkel Richard, Clark Alexia Michelle, Cubeddu Luigi: Pharmacology. 2009, Philadelphia: Lippincott Williams & Wilkins, 4

  35. Dorland's medical dictionary. January 30, 2008

  36. Lanni Cristina, Lenzken Silvia, Pascale Alessia, Del Vecchio Igor, Racchi Marco, Pistoia Francesca, et al: Cognition enhancers between treating and doping the mind. Pharmacological Research. 2008, 57: 196-213. 10.1016/j.phrs.2008.02.004.

  37. Perez-Lloret Santiago, Rey Veronica Maria, Ratti Lucca Pietro, Rascol Olivier: Rotigotine transdermal patch for the treatment of Parkinson's Disease. Fundamental & clinical pharmacology. 2013, 27: 81-95. 10.1111/j.1472-8206.2012.01028.x.

  38. Romero Alarcon-de-la-Lastra, Lopez A, Martin MJ, La Casa C, Motilva V: Cinitapride Protects against Ethanol-Induced Gastric Mucosal Injury in Rats: Role of 5-Hydroxytryptamine, Prostaglandins and Sulfhydryl Compounds. Pharmacology. 1997, 54: 193-202. 10.1159/000139487.

  39. Nichols David, Nichols Charles: Serotonin receptors. Chemical reviews. 2008, 108: 1614-1641. 10.1021/cr078224o.

  40. Way WL, Fields HL, Way EL, Katzung BG: Basic and clinical pharmacology. Basic and clinical pharmacology. 1998

  41. Mard-Soltani Maysam, Kesmati Mahnnaz, Khajehpour Lotfolah, Rasekh Abdolrahman, Shamshirgar-Zadeh Abdolhosein: Interaction between Anxiolytic Effects of Testosterone and β-1 Adrenoceptors of Basolateral Amygdala. International Journal of Pharmacology. 2012, 8:

  42. Elenkov Ilia, Wilder Ronald, Chrousos George, Vizi Sylvester: The sympathetic nerve--an integrative interface between two supersystems: the brain and the immune system. Pharmacological reviews. 2000, 52: 595-638.

  43. Gutman Arie, Zaltsman Igor, Shalimov Anton, Sotrihin Maxim, Nisnevich Gennady, Yudovich Lev, et al: Process for the preparation of dexmethylphenidate hydrochloride. 2007, ed: Google Patents

  44. Future Treatments for Depression, Anxiety, Sleep Disorders, Psychosis, and ADHD --

  45. Pierre Fabre Medicament and Forest Laboratories to Collaborate on Development and Commercialization of F2695 for Depression - FierceBiotech.

  46. News: Forest Buys CNS Disease-Related Drug for $75M Upfront.

  47. Maglione Margaret, Miotto Karen, Iguchi Martin, Jungvig Lara, Morton Sally, Shekelle Paul: Psychiatric effects of ephedra use: an analysis of Food and Drug Administration reports of adverse events. American Journal of Psychiatry. 2005, 162: 189-191. 10.1176/appi.ajp.162.1.189.

  48. Adkins DE, Khachane AN, McClay JL, Aberg K, Bukszar J, et al: SNP-based analysis of neuroactive ligand-receptor interaction pathways implicates PGE2 as a novel mediator of antipsychotic treatment response: Data from the CATIE study. Schizophrenia Research. 2012, 135: 200-201. 10.1016/j.schres.2011.11.002.

  49. Hypertension. (2014, May 20) In Wikipedia, the free encyclopedia. Retrieved May 20, 2014, []

  50. Review: could Trimipramine maleate cause Essential hypertension?. Jun, 18, 2014, []

  51. [Internet]. Paliperidone Information from; c2000-2014 [updated June 16th, 2014; Cited: 2014 June 18]. []

  52. Bredberg U, Eyjolfsdottir GS, Paalzow L, Tfelt-Hansen P, Tfelt-Hansen V: Pharmacokinetics of methysergide and its metabolite methylergometrine in man. European journal of clinical pharmacology. 1986, 30: 75-77. 10.1007/BF00614199.

  53. Jasek W: Austria-Codex (in German) (62nd ed.). 2007, Vienna: Österreichischer Apothekerverlag, 5193-5.

  54. Montes Barbara Amy, Rey Jose: Iloperidone (Fanapt): An FDA-Approved Treatment Option for Schizophrenia. Pharmacy and Therapeutics. 2009, 34: 606-

  55. Jeffrey Berman, Setty Arathi, Steiner Matthew, Kaufman Kenneth, Skotzko Christine: Complicated hypertension related to the abuse of ephedrine and caffeine alkaloids. Journal of addictive diseases. 2006, 25: 45-48. 10.1300/J069v25n03_06.

  56. de Toledo Ferraz Alves TC, Guerra de Andrade A: Hypertension induced by regular doses of milnacipran: a case report. Pharmacopsychiatry. 2007, 40: 41-42.

  57. Munoli Neelakanthappa Ravindra, Selvaraj Arun, Praharaj Kumar Samir, Bhandary Rajeshkrishna: Desvenlafaxine-Induced Worsening of Hypertension. The Journal of neuropsychiatry and clinical neurosciences. 2013, 25: E29-E30.

  58. Rummery NM, Hill CE: Vascular gap junctions and implications for hypertension. Clin Exp Pharmacol Physiol. 2004, 31: 659-667. 10.1111/j.1440-1681.2004.04071.x.

  59. von Eichborn J, Murgueitio MS, Dunkel M, et al: PROMISCUOUS: a database for network-based drug-repositioning. Nucleic Acids Res. 2011, 39: D1060-6. 10.1093/nar/gkq1037.

  60. [,High+Blood+Pressure/?a=s;http//]

  61. Derby Michael, Zhang Lu, Chappell Jill, Gonzales Celedon, Callaghan JT, Leibowitz Mark, et al: The effects of supratherapeutic doses of duloxetine on blood pressure and pulse rate. Journal of cardiovascular pharmacology. 2007, 49: 384-393. 10.1097/FJC.0b013e31804d1cce.

  62. []

  63. []

  64. Mago Mahajan, Thase ME: Levomilnacipran: a newly approved drug for treatment of major depressive disorder. Expert Rev Clin Pharmacol. 2014, 7 (2): 137-45. 10.1586/17512433.2014.889563.


This work is supported by the National Natural Science Foundation of China (Nos. 61202174, 61432010, 91130006, 61173093, 61202175), the Fundamental Research Funds for the Central Universities (Nos. K5051303010, BDZ021404, BDY181417, BDY10), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120203120015).


Publication of this article has been funded by the National Natural Science Foundation of China (No. 61202174).

This article has been published as part of BMC Medical Genomics Volume 8 Supplement 2, 2015: Selected articles from the 4th Translational Bioinformatics Conference and the 8th International Conference on Systems Biology (TBC/ISB 2014). The full contents of the supplement are available online at

Author information

Corresponding authors

Correspondence to Liang Yu or Lin Gao.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

LY carried out the study. LY, JZ, and YZ designed the study. LY wrote the first draft of the manuscript. JH, ZM, and LG revised the manuscript. All the authors read and approved the final manuscript.

Electronic supplementary material

Additional file 1: Table illustrating the relations between drugs and targets. (PDF 88 KB)

Additional file 2: Table illustrating the list of protein complexes. (PDF 167 KB)

Additional file 3: Table illustrating disease-gene dataset. (PDF 158 KB)

Additional file 4: Table illustrating the information of drug-complex network. (PDF 116 KB)

Additional file 5: Table illustrating the information of complex-disease network. (PDF 633 KB)


Additional file 6: Table illustrating the information of drug-disease network before being filtered by PPI network and weight. (PDF 1 MB)


Additional file 7: Table illustrating the information of drug-disease network after being filtered by PPI network and weight. (PDF 23 KB)

Additional file 8: Table illustrating the drug-mental disorders relations predicted by our method. (PDF 9 KB)

Additional file 9: Table illustrating 23 clusters obtained from the drug-drug network. (PDF 9 KB)

Additional file 10: Table illustrating the drug-hypertension relations predicted by our method. (PDF 11 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Yu, L., Huang, J., Ma, Z. et al. Inferring drug-disease associations based on known protein complexes. BMC Med Genomics 8 (Suppl 2), S2 (2015).
