MiRNA-disease interaction prediction based on kernel neighborhood similarity and multi-network bidirectional propagation
BMC Medical Genomics volume 12, Article number: 185 (2019)
Abstract
Background
Studies have shown that miRNAs are functionally associated with the development of many human diseases, but the roles of miRNAs in diseases and their underlying molecular mechanisms are not yet fully understood. Research on miRNA-disease interactions has received increasing attention. Compared with complex and costly biological experiments, computational methods can rapidly and efficiently predict potential miRNA-disease interactions and serve as a useful complement to experimental methods.
Results
In this paper, we propose a novel computational model based on kernel neighborhood similarity and multi-network bidirectional propagation (KNMBP) for miRNA-disease interaction prediction, especially for new miRNAs and new diseases. First, we integrate multiple data sources of diseases and miRNAs, respectively, to construct a novel disease semantic similarity network and a novel miRNA functional similarity network. Second, based on the modified miRNA-disease interactions, we use the kernel neighborhood similarity algorithm to calculate the disease kernel neighborhood similarity and the miRNA kernel neighborhood similarity. Finally, we use a bidirectional propagation algorithm to predict the miRNA-disease interaction scores based on the integrated disease similarity network and miRNA similarity network. As a result, the AUC value of 5-fold cross-validation for all interactions by KNMBP is 0.93126 on the commonly used dataset, and the AUC values for all interactions, for all miRNAs, and for all diseases are 0.93795, 0.86363, and 0.86937, respectively, on another dataset that we extracted ourselves; these values are higher than those of other state-of-the-art methods. In addition, our model has good parameter robustness. The case study further demonstrates the predictive performance of the model for novel miRNA-disease interactions.
Conclusions
Our KNMBP algorithm integrates multiple omics data from miRNAs and diseases to stably and efficiently predict potential miRNA-disease interactions. We anticipate that KNMBP will be a useful tool in biomedical research.
Background
MicroRNAs (miRNAs) are a category of single-stranded small non-coding RNAs (~22 nt) that play important roles in gene regulation via interference in post-transcriptional regulation [1, 2]. In the past decades, miRNAs have been found in eukaryotes and viruses in addition to prokaryotes [3]. Previous research has shown that miRNAs are related to several human diseases, such as cancer, Alzheimer's disease and diabetes mellitus [4,5,6]. miR-375 was found to be significant in the growth of pancreatic islets and their response to metabolic stress [7]. miR-21 negatively regulates Pdcd4, which can suppress TPA-induced neoplastic transformation [8]. miRNA-200 was detected in the metastasis of gastric adenocarcinoma cells [9]. miR-146a is a tumor suppressor that inhibits NF-κB activity, which is related to the promotion and suppression of tumor growth [10].
Wang et al. [11] constructed a directed acyclic graph (DAG) to describe a disease based on the MeSH descriptors. They then calculated the disease semantic similarity from the DAG and combined it with the known miRNA-disease interactions to construct the miRNA functional similarity, which was also used to preliminarily infer new potential functions or related diseases of miRNAs. Xu et al. [12] proposed a support vector machine (SVM) to predict the interaction between miRNAs and tumors. However, because current databases rarely provide a list of non-cancer miRNAs, the lack of negative samples makes supervised learning models poorly suited to large-scale disease-miRNA interaction prediction.
The miRNA-disease interaction prediction problem can be regarded as a classification problem that lacks negative samples. For this reason, a large number of network-based semi-supervised methods have been proposed, most of which rest on the assumption that similar miRNAs (diseases) are more likely to interact with the same disease (miRNA). Chen et al. [13] adopted random walk with restart (RWRMDA) to predict potential miRNA-disease interactions: the walk restarts from the known miRNA-disease interaction network and proceeds on the miRNA functional similarity network. Because the restart operator of RWRMDA is based on the known miRNA-disease interaction network, the method does not apply to new diseases that are not associated with any miRNA. The regularized least squares algorithm (RLSMDA) was also proposed by Chen et al. [14] in 2015 to predict miRNA-disease interactions; it uses the disease semantic similarity and the miRNA functional similarity to calculate two sets of miRNA-disease interaction scores and takes their weighted linear combination as the final result. By combining the disease similarity network and the miRNA similarity network, the method improves prediction accuracy and strengthens the predictive power of the model to some extent. However, the model is highly dependent on its parameters, and choosing appropriate values is difficult. Subsequently, in 2018, Chen et al. [15] released a graph regression model that predicts miRNA-disease interactions by using singular value decomposition (SVD) to decompose the interaction matrix, the disease similarity matrix and the miRNA similarity matrix, and then using partial least squares (PLS) to perform graph regression in the interaction space, the miRNA similarity space and the disease similarity space.
SVD and PLS regression can eliminate noise to a certain extent, but they also cause information loss, which reduces model accuracy. Recently, Chen et al. proposed two novel models to predict potential miRNA-disease interactions: the hierarchical clustering recommendation algorithm [16] (BNPMDA) and the low-rank matrix decomposition algorithm [17] (IMCMDA). Both models have the advantage of fewer parameters, but the former uses only the known miRNA-disease interaction network for inference, so it cannot predict new miRNAs or new diseases, and the latter loses prediction accuracy because of the matrix decomposition. The miRNA functional similarity used in the above algorithms is based on the method of Wang et al. [11], which depends on the known miRNA-disease interactions, so these models cannot predict new miRNAs.
Luo et al. [18] proposed a Kronecker regularized least squares method, which calculates miRNA functional similarity from the miRNA-gene interaction network and a gene weight network and combines it with disease semantic similarity to predict potential miRNA-disease interactions. The model enhances the predictive power for new miRNAs by integrating heterogeneous omics data of miRNAs, but it is highly dependent on the weight coefficients of the different similarity measurements, which limits its generalization and practical applicability. Xiao et al. [19] constructed a graph regularized non-negative matrix factorization method, which decomposes the modified known miRNA-disease interaction network and uses miRNA functional similarity and disease semantic similarity to construct regularization operators for prediction. The model can predict new miRNAs and new diseases, but its larger number of parameters and stronger parameter dependence also reduce its performance. Both models use information outside the miRNA-disease interaction dataset to construct miRNA functional similarity, which enhances their ability to predict new miRNAs. However, they use only MeSH descriptors to describe disease similarity, resulting in a sparse disease similarity network, which limits their predictive performance.
Here, we propose a new framework, kernel neighborhood similarity and multi-network bidirectional propagation (KNMBP), which uses multiple omics data to infer unknown miRNA-disease interactions. KNMBP uses disease-gene interactions, disease-biological process interactions, and disease semantic information to construct a novel disease semantic similarity network, and uses miRNA-target interactions and gene weight networks to construct a novel miRNA functional similarity network. Unlike previous methods, the miRNA functional similarity and disease semantic similarity calculated in this paper do not use the known miRNA-disease interactions; instead, they draw more feature information about miRNAs and diseases from other recent datasets, which greatly expands our ability to predict new miRNAs and diseases. Accumulated research [15, 20] shows that the known miRNA-disease interaction network also contains important feature information about miRNAs and diseases, and that reasonable use of this information can enhance the predictive ability of a model. With these considerations, based on the modified miRNA-disease interactions, we use the kernel-based neighborhood similarity algorithm to calculate the disease kernel neighborhood similarity and the miRNA kernel neighborhood similarity. Finally, based on the integrated miRNA (disease) similarity network, we construct a bidirectional propagation model to predict potential miRNA-disease interaction scores. The experimental results show that KNMBP not only predicts new interactions, new miRNAs and new diseases well, but also has the advantage of parameter robustness.
Methods
Methods overview
To predict unknown miRNA-disease interactions, we propose a new KNMBP model with five steps, as shown in Fig. 1. First, we calculate the miRNA functional similarity and disease semantic similarity by using multiple omics data other than the miRNA-disease interaction information (step 1 of Fig. 1). Second, based on the modified known miRNA-disease interaction network, we use the kernel-based neighborhood similarity model (KSNS) to calculate the disease kernel neighborhood similarity and the miRNA kernel neighborhood similarity (steps 2 and 3 of Fig. 1). Finally, based on the integrated miRNA (disease) similarity network calculated by diffusion component analysis (clusDCA), we develop a bidirectional propagation algorithm to predict unknown miRNA-disease interaction scores (steps 4 and 5 of Fig. 1).
Dataset collection
In order to fairly compare the performance of the model, we used two benchmark datasets to conduct experiments.
For benchmark dataset I, we used the miRNA-disease interaction prediction dataset established by Chen et al. [16, 17]. Dataset I consists of three parts. First, 5430 interactions between 383 diseases and 495 miRNAs were extracted from HMDD v2.0 [21]. Second, based on the Medical Subject Headings (MeSH) descriptors of the U.S. National Library of Medicine, two semantic similarity matrices of diseases were established by Wang et al. [11] and Xuan et al. [22], respectively. Third, the functional similarity matrix of miRNAs was established by Lu et al. [23]. All these data can be downloaded from https://github.com/IMCMDAsourcecode/IMCMDA. However, Dataset I is based on an old version of HMDD (v2.0); in addition, its disease semantic similarity is very sparse and its miRNA functional similarity depends on the known miRNA-disease interactions. Therefore, we extracted information about miRNAs and diseases from several recent databases and built benchmark dataset II. We describe the construction of dataset II in three parts.
First, we extracted information about diseases. The Comparative Toxicogenomics Database (CTD) is an important disease research database that provides a wealth of information on interactions between diseases and chemicals, gene products, phenotypes and the environment [24]. Disease items in CTD are described by MeSH IDs. MeSH is a hierarchical vocabulary that provides a strict classification system for studying the relationships among diseases, and the relationships between any diseases can be illustrated by a directed acyclic graph (DAG). For example, the MeSH ID of the disease "Deletion Syndrome (Partial)" in CTD is "MeSH:C538288"; its parent diseases are "Chromosome Deletion" and "Chromosome Disorders", with MeSH IDs "MeSH:D002872" and "MeSH:D025063", respectively. To obtain a detailed description of the diseases, we downloaded 12,988 diseases, including their names, their multiple ID representations, and information about their parent nodes. Furthermore, we downloaded gene-disease interactions, comprising 25,114,553 interactions between 46,045 genes and 7163 diseases. We also downloaded disease-GO biological process interactions, comprising 1,727,119 interactions between 13,126 GO terms and 7116 diseases.
Second, we extracted information about miRNAs. To describe the relationships between miRNAs accurately, we extracted miRNA interaction information that is as complete as possible from multiple recent databases. We obtained the miRNA-gene interaction information from experimentally verified databases, including TarBase (version 8.0) [25], miRTarBase (version 7.0) [26], miRNAMAP (version 2.0) [27] and miRecord (version 4) [28]. DIANA-TarBase v8 is a reference database for indexing experimentally supported microRNA targets and has supported the non-coding RNA field for more than a decade [25]. We downloaded 927,119 miRNA-gene interactions from this database; after removing non-human genes and converting the gene IDs into Entrez Gene identifiers, a total of 423,392 interactions between 18,345 genes and 1084 miRNAs were retained. Meanwhile, we performed ID transformation of the genes in the miRTarBase database, deleted the null miRNAs and target genes, and finally obtained 381,088 interactions between 2599 miRNAs and 15,064 genes. Similarly, we extracted 83,071 interactions between 1135 target genes and 471 miRNAs from miRNAMAP, and obtained 1269 interactions between 767 target genes and 203 miRNAs from miRecord. Based on miRBase [29], all of the above miRNAs were converted to version v22 using the R package 'miRBaseConverter', and null and duplicate miRNAs were deleted. After integration, a total of 588,134 interactions between 2814 miRNAs and 18,468 genes were obtained. In addition, Lee et al. [30] integrated 21 omics datasets from multiple organisms using a modified Bayesian approach and used log-likelihood scores to measure the probability that two genes share a true functional link. To build the gene similarity network, we downloaded the human weighted gene network data from the HumanNet database, which contains the log-likelihood scores of 476,399 interactions among 16,243 genes.
Third, we extracted interaction information between miRNAs and diseases. The Human microRNA Disease Database (HMDD) collects large numbers of human miRNA-disease interactions from genetics, epigenetics, circulating miRNAs and miRNA-target interactions, and provides detailed annotation of miRNA-disease interactions [21]. On June 28, 2018, HMDD (version 3.0) [31] was released, which provides 200.2% of human miRNA-disease interactions and classifies them with more evidence. We extracted the disease information with a MeSH ID or OMIM ID from HMDD v3.0, removed duplicate miRNA-disease interactions, and obtained 14,457 interactions between 1045 miRNAs and 627 diseases. To ensure that all miRNA similarities and all disease similarities can be calculated, we deleted the diseases and miRNAs not present in the above two datasets, and finally obtained 10,561 interactions between 574 miRNAs and 579 diseases. The details of the two benchmark datasets are shown in Additional file 1.
Construction of disease semantic similarity network
Most methods use MeSH descriptors to construct a directed acyclic graph of each disease and describe disease similarity by the information that different diseases have in common in these graphs, which leads to a sparse similarity network [16, 17]. To construct a more reasonable disease semantic similarity, we make full use of various omics data to calculate the similarity of diseases. Protein-encoding genes can affect the pathogenesis of a disease to some extent [32], so disease-gene interactions also reflect some features of the disease. Similarly, the gene ontology biological processes associated with a disease reflect some of its characteristics. In this paper, we combine the disease-gene interactions (DG), the disease-GO biological process interactions (DGO), and the MeSH descriptors of the diseases, and use the MultiSourcDSim model proposed by Lei et al. [33] to calculate the disease semantic similarity.
Based on the MeSH descriptors, a directed acyclic graph (DAG) can be used to describe the semantic relationships between diseases. Any disease d in the DAG can be expressed as DAG(d) = (d, S(d), F(d), A(d)), where S(d) and F(d) represent the sets of direct child nodes and direct parent nodes of disease d, respectively, and A(d) represents the set of all ancestor nodes of disease d.
First, combining a disease interaction dataset c (DG or DGO) with the DAG, the frequency FT_{c}(d) of any disease d in the DAG can be calculated:
\[ {FT}_c(d)={f}_c(d)+\sum_{d^{\prime}\in S(d)}{FT}_c\left({d}^{\prime}\right) \qquad (1) \]
where f_{c}(d) represents the frequency of d in the interaction dataset c. That is, the occurrence frequency of d in the DAG equals its own frequency in the interaction dataset plus the sum of the occurrence frequencies of all its direct child nodes. Then, the frequency of disease occurrence is normalized as follows:
\[ {PT}_c(d)=\frac{FT_c(d)}{FT_c(root)} \qquad (2) \]
where FT_{c}(root) represents the occurrence frequency of the root node in the DAG. According to Eqs. 1 and 2, 0 ≤ PT_{c}(d) ≤ 1. Based on the principle that the more information two diseases share, the higher their similarity, the disease similarity can be obtained:
where COM(d_{1}, d_{2}) is the set of minimum common ancestors of diseases d_{1} and d_{2}, and it is easy to see that 0 ≤ S_{c}(d_{1}, d_{2}) ≤ 1. From DG and DGO, we obtain two disease similarity networks {S_{c}, c = 1, 2}. clusDCA [34] was then used to integrate the disease similarity networks, and the integrated semantic similarity network SS_{d} was finally obtained.
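The frequency propagation and normalization steps can be illustrated on a toy example. The sketch below is a minimal illustration under stated assumptions, not the MultiSourcDSim implementation: the four-node DAG, the counts in `f_c`, and the helper names `ft` and `pt` are hypothetical. It only follows the verbal definition in the text (own frequency plus the frequencies of all direct children, then division by the root frequency) and does not attempt the similarity formula itself.

```python
from functools import lru_cache

# Hypothetical toy DAG: each disease maps to its direct child nodes S(d).
children = {
    "root": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": [],
}

# Hypothetical counts f_c(d): occurrences of each disease in one
# interaction dataset c (for example, disease-gene interactions).
f_c = {"root": 0, "A": 2, "B": 1, "C": 3}


@lru_cache(maxsize=None)
def ft(d):
    """FT_c(d): own frequency plus FT_c of every direct child node."""
    return f_c[d] + sum(ft(child) for child in children[d])


def pt(d):
    """PT_c(d): FT_c(d) normalized by the frequency of the root node."""
    return ft(d) / ft("root")


for disease in children:
    print(disease, ft(disease), round(pt(disease), 3))
```

Note that, under this recursive definition, a node with two parents (here `C`) contributes to the frequency of both parents, so it is counted once per path to the root.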
Construction of miRNA functional similarity network
To remove the dependence of miRNA functional similarity on the known miRNA-disease interaction network, so that the algorithm can also make predictions for miRNAs not associated with any disease, we calculate the miRNA functional similarity following the methods of Luo et al. [18] and Xiao et al. [19]. Specifically, we use the miRNA-target gene interaction network and the gene similarity network to calculate miRNA similarity.
First, we normalized and symmetrized the log-likelihood score data between genes downloaded from HumanNet:
where S^{g}(g_{i}, g_{j}) represents the similarity between gene g_{i} and gene g_{j}, LLS(i, j) represents the log-likelihood score between gene g_{i} and gene g_{j}, and MAX_{LLS} represents the maximum log-likelihood score. We can then define the similarity between any gene g_{i} and any gene set G:
where S^{g}(g_{i}, G) represents the similarity between g_{i} and G. Then, we can obtain the functional similarity between miRNA m_{i} and miRNA m_{j}:
where SF_{m}(m_{i}, m_{j}) represents the functional similarity between m_{i} and m_{j}, G_{i} represents the gene set associated with m_{i}, and |G_{i}| represents the number of genes in the set G_{i}.
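The three steps above (gene similarity, gene-to-set similarity, miRNA functional similarity) can be sketched as follows. Because the corresponding formulas are not reproduced in this text, the sketch assumes the commonly used forms: scores divided by the maximum log-likelihood score with self-similarity set to 1, gene-to-set similarity taken as the maximum over the set, and a best-match average normalized by |G_i| + |G_j|. All function names and the toy matrix are hypothetical.

```python
import numpy as np


def gene_similarity(lls):
    """Symmetrize and normalize a log-likelihood score matrix (assumed form)."""
    lls = np.asarray(lls, dtype=float)
    sym = np.maximum(lls, lls.T)      # symmetrize
    sim = sym / sym.max()             # divide by MAX_LLS
    np.fill_diagonal(sim, 1.0)        # a gene is fully similar to itself
    return sim


def gene_to_set(sim, g, gene_set):
    """S^g(g, G): best match of gene g within gene set G (assumed: maximum)."""
    return max(sim[g, h] for h in gene_set)


def mirna_functional_similarity(sim, genes_i, genes_j):
    """SF_m(m_i, m_j) as a best-match average over both target gene sets."""
    total = sum(gene_to_set(sim, g, genes_j) for g in genes_i)
    total += sum(gene_to_set(sim, g, genes_i) for g in genes_j)
    return total / (len(genes_i) + len(genes_j))


# Hypothetical log-likelihood scores among three genes (0 = no link).
lls = [[0, 2, 0],
       [2, 0, 4],
       [0, 4, 0]]
sim = gene_similarity(lls)

# miRNA m_1 targets gene 0; miRNA m_2 targets genes 1 and 2.
print(mirna_functional_similarity(sim, [0], [1, 2]))
```

With this construction, two miRNAs that share no target genes can still obtain a non-zero similarity through functionally linked genes, which is what allows miRNAs without known disease associations to be placed in the similarity network.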
Kernelbased neighborhood similarity
Reasonable use of known miRNA-disease interaction information can greatly improve the performance of the model [17, 18]. In this paper, based on the known miRNA-disease interactions, we use the kernel-based neighborhood similarity (KSNS) [35] to calculate the miRNA (disease) kernel neighborhood similarity. KSNS not only makes comprehensive use of the distance similarity and structural similarity of samples, but also exploits the nonlinear structural similarity information between samples, and it achieved good results in lncRNA-protein interaction prediction. In addition, to overcome the sparsity of the interaction matrix, Xiao et al. [19] proposed a weighted K-nearest neighbor profile (WKNNP) algorithm to preprocess the interaction matrix, which also achieved good results. Based on these two points, we first use WKNNP to preprocess the known interaction matrix, and then use KSNS to calculate the kernel neighborhood similarity of miRNAs (diseases).
Let the matrix X with NM rows and ND columns represent the miRNA-disease interaction matrix. Then X can be expressed as \( \mathrm{X}=\left[{M}_1^T,{M}_2^T,\cdots, {M}_{NM}^T\right]=\left[{D}_1,{D}_2,\cdots, {D}_{ND}\right] \), where M_{i}, the i-th row vector of X, can be regarded as the interaction profile feature of miRNA m_{i}, and D_{j}, the j-th column vector of X, can be regarded as the interaction profile feature of disease d_{j}.
According to the WKNNP algorithm, we use the K nearest neighbors of m_{i} to enrich the interaction profile M_{i}; the modified interaction profile \( {\hat{M}}_i \) of m_{i} is as follows:
where \( {Q}_{m_i}={\sum}_{m_{j\in N\left({m}_i\right)}}{SF}_m\left({m}_i,{m}_j\right) \) denotes the normalization weight, and N(m_{i}) represents the set of K nearest neighbors of m_{i} (for simplicity, K = 15 in this paper). w^{k} is the weight coefficient of the k-th neighbor, with decay factor α ∈ [0, 1] (for simplicity, α = 0.8 in this paper); closer miRNAs therefore receive higher weight coefficients. The modified interaction profile matrix \( {X}_M=\left[{\hat{M}}_1^T,{\hat{M}}_2^T,\cdots, {\hat{M}}_{NM}^T\right] \) can then be obtained through Eq. 7. Similarly, we can obtain the disease modified interaction profile matrix \( {X}_d=\left[{\hat{D}}_1,{\hat{D}}_2,\cdots, {\hat{D}}_{ND}\right] \). Finally, the modified interaction profile matrix \( \hat{\mathrm{X}} \) is as follows:
Now, based on \( \hat{\mathrm{X}} \), we use KSNS to calculate the miRNA (disease) kernel neighborhood similarity. First, we construct the K-neighboring discriminant matrix of miRNAs based on the miRNA functional similarity:
where N(m_{i}) represents the set of the NK nearest miRNAs of m_{i}, NK = ⌊PN × N⌋, PN denotes the neighbor proportion parameter, N is the total number of samples, and ⌊∙⌋ denotes rounding down. The weight matrix W of miRNAs is then obtained as follows:
where Φ(∙) denotes the kernel function, ‖∙‖_{F} represents the Frobenius norm, ⨀ is element-by-element multiplication, μ_{1} is the non-neighborhood control parameter, μ_{2} is the similarity regularization parameter, and e = (1, 1, …, 1)^{T}. The first constraint requires the reconstruction weights of each sample to sum to 1, the second requires all elements of W to be non-negative, and the third sets the self-similarity of each miRNA to 0. Using the Lagrange multiplier method and the Karush-Kuhn-Tucker (KKT) conditions, the iterative formula for W is as follows:
Where k(X, X) represents the kernel matrix of X. In this paper, we select Gaussian kernel function, which is represented as:
where k(x_{i}, x_{j}) is the kernel of any two samples x_{i} and x_{j}, and \( \upgamma =\frac{\sum {\left\Vert {x}_i\right\Vert}^2}{NM} \) represents the regularized bandwidth parameter. After that, we normalize the weight matrix W to obtain the miRNA kernel neighborhood similarity matrix SI_{m}; the normalization formula is as follows:
where the diagonal matrix D = diag (d_{1}, d_{2}, …, d_{NM}) and \( {d}_j=\sum \limits_{i=1}^{NM}{W}_{i,j} \). Similarly, we can obtain the disease kernel neighborhood similarity SI_{d}. clusDCA [34] was then used to integrate the miRNA functional similarity SF_{m} (disease semantic similarity SS_{d}) and the kernel neighborhood similarity SI_{m} (SI_{d}) to obtain the final miRNA similarity matrix S_{m} (disease similarity matrix S_{d}).
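The WKNNP preprocessing step described earlier in this subsection can be sketched as follows. This is a minimal illustration under several assumptions that the text does not confirm: the k-th neighbor weight is taken as α^(k−1) times its similarity, a sample is excluded from its own neighbor set, and the row-wise and column-wise profiles are merged by averaging them and taking the element-wise maximum with the original matrix. The function names `wknnp_rows` and `modified_profile` are hypothetical; the KSNS optimization itself is not reproduced here.

```python
import numpy as np


def wknnp_rows(X, S, K=15, alpha=0.8):
    """Enrich each row profile of X with its K most similar rows under S."""
    n = X.shape[0]
    X_hat = np.zeros(X.shape, dtype=float)
    for i in range(n):
        sims = np.array(S[i], dtype=float)
        sims[i] = -np.inf                        # exclude the sample itself (assumption)
        nbrs = np.argsort(-sims)[:min(K, n - 1)]  # nearest neighbors, most similar first
        q = S[i, nbrs].sum()                     # normalization weight Q
        if q == 0:
            continue                             # no usable neighbor: leave the row at zero
        w = alpha ** np.arange(len(nbrs)) * S[i, nbrs]   # decayed neighbor weights
        X_hat[i] = (w[:, None] * X[nbrs]).sum(axis=0) / q
    return X_hat


def modified_profile(X, S_m, S_d, K=15, alpha=0.8):
    """Combine miRNA-side and disease-side enriched profiles (assumed merge rule)."""
    X_m = wknnp_rows(X, S_m, K, alpha)           # enrich miRNA (row) profiles
    X_d = wknnp_rows(X.T, S_d, K, alpha).T       # enrich disease (column) profiles
    return np.maximum(X, (X_m + X_d) / 2.0)


# Hypothetical toy data: 3 miRNAs, 2 diseases.
X = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
S_m = np.array([[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]])
S_d = np.array([[1.0, 0.5], [0.5, 1.0]])
X_mod = modified_profile(X, S_m, S_d, K=1)
print(X_mod)
```

In this toy example the second miRNA has no known interaction, yet it receives a non-zero score for the first disease because its nearest neighbor interacts with that disease. This is the effect WKNNP is intended to have on sparse rows and columns.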
Bidirectional propagation algorithm
Based on the miRNA similarity, the disease similarity and the known miRNA-disease interaction information, we propose a bidirectional propagation algorithm to predict the miRNA-disease interaction scores.
Let (F)_{NM × ND} be the miRNA-disease interaction score matrix. Then F can be decomposed as \( F=\left[{FM}_1^T,{FM}_2^T,\cdots, {FM}_{NM}^T\right]=\left[{FD}_1,{FD}_2,\cdots, {FD}_{ND}\right] \), where \( {FM}_i^T \) represents the predicted interaction scores of miRNA m_{i} with all diseases, and FD_{j} denotes the predicted interaction scores of disease d_{j} with all miRNAs. Based on the hypothesis that miRNAs with higher similarity are more likely to interact with the same disease, we can get:
where \( {s}_{i,j}^m={\left({S}_m\right)}_{i,j} \) denotes the similarity of m_{i} and m_{j}, \( {d}_i^m=\sum \limits_{j=1}^{NM}{s}_{i,j}^m \), and the diagonal matrix \( {D}_m=\mathit{\operatorname{diag}}\left({d}_1^{\mathrm{m}},{d}_2^{\mathrm{m}},\cdots, {d}_{NM}^{\mathrm{m}}\right) \). Similarly, for diseases, we can get:
where \( {s}_{u,v}^d={\left({S}_d\right)}_{u,v} \) denotes the similarity of d_{u} and d_{v}, \( {d}_u^d=\sum \limits_{k=1}^{ND}{s}_{u,k}^d \), and the diagonal matrix \( {D}_d=\mathit{\operatorname{diag}}\left({d}_1^d,{d}_2^d,\cdots, {d}_{ND}^d\right) \). The bidirectional propagation algorithm can then be formulated as follows:
where \( {\left\Vert F-Y\right\Vert}_F^2 \) represents the overall prediction error, which is required to be as small as possible, and λ_{m} and λ_{d} are the Laplacian regularization parameters of miRNAs and diseases, respectively. The derivative of Eq. 16 with respect to F is as follows:
In order to speed up the optimization of the gradient algorithm, we use AdaGrad algorithm [34] to adaptively choose the gradient step size. The details of the optimization algorithm to the proposed bidirectional propagation model are described in Algorithm 1.
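The optimization described above can be sketched as gradient descent with AdaGrad step sizes. Since the objective and its derivative are not reproduced in this text, the sketch assumes the standard Laplacian-regularized form ‖F − Y‖²_F + λ_m tr(Fᵀ L_m F) + λ_d tr(F L_d Fᵀ) with symmetric normalized Laplacians; this is an assumption, not a transcription of Eq. 16 or of Algorithm 1. The function names, learning rate and iteration count are hypothetical, and the similarity matrices are assumed to be symmetric.

```python
import numpy as np


def normalized_laplacian(S):
    """L = I - D^{-1/2} S D^{-1/2} for a symmetric similarity matrix S."""
    d = S.sum(axis=1).astype(float)
    d_inv_sqrt = np.zeros_like(d)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(d[nonzero])
    return np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]


def objective(F, Y, L_m, L_d, lam_m, lam_d):
    """Prediction error plus miRNA-side and disease-side smoothness terms."""
    return (np.sum((F - Y) ** 2)
            + lam_m * np.trace(F.T @ L_m @ F)
            + lam_d * np.trace(F @ L_d @ F.T))


def bidirectional_propagation(Y, S_m, S_d, lam_m=1.0, lam_d=1.0,
                              lr=0.1, n_iter=1000, eps=1e-8):
    """Minimize the assumed objective with AdaGrad-scaled gradient steps."""
    L_m, L_d = normalized_laplacian(S_m), normalized_laplacian(S_d)
    F = Y.astype(float)                 # start from the known interactions
    G = np.zeros_like(F)                # running sum of squared gradients
    for _ in range(n_iter):
        grad = 2 * (F - Y) + 2 * lam_m * (L_m @ F) + 2 * lam_d * (F @ L_d)
        G += grad ** 2
        F = F - lr * grad / (np.sqrt(G) + eps)
    return F


# Hypothetical toy data: 3 miRNAs, 2 diseases.
Y = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
S_m = np.array([[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]])
S_d = np.array([[1.0, 0.5], [0.5, 1.0]])
F = bidirectional_propagation(Y, S_m, S_d)
print(np.round(F, 3))
```

The left multiplication by L_m propagates scores among similar miRNAs and the right multiplication by L_d propagates them among similar diseases, which is the sense in which the propagation is bidirectional.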
Results
Comparison with other methods
Experimental settings
To evaluate the performance of the KNMBP algorithm fairly, we performed 5-fold cross-validation (CV) on Dataset I and Dataset II, and compared KNMBP with the following methods: IMCMDA [17], BNPMDA [16], RLSMDA [14], KRLSM [18] and RWRMDA [13]. Specifically, for each method, we performed CV four times, each time using a different seed, and the mean of the AUC values under the different seeds was taken as the final AUC value of the method. The miRNA-disease interaction matrix Y ∈ R^{NM × ND} has NM rows for miRNAs and ND columns for diseases. We carried out three types of CV as follows [36]:
(1) CV_{a}: CV on all miRNA-disease pairs. To ensure that the known interactions are evenly distributed, we randomly divided the known and unknown interactions into five equal parts; each part was selected in turn as the test set, and the remaining data, with the test-set associations deleted, served as the training set.
(2) CV_{m}: CV on miRNAs (row vectors in Y). All miRNAs were randomly divided into five equal parts; each part was selected in turn as the test set, and the remaining data, with the associations of the test miRNAs deleted, served as the training set.
(3) CV_{d}: CV on diseases (column vectors in Y). All diseases were randomly divided into five equal parts; each part was selected in turn as the test set, and the remaining data, with the associations of the test diseases deleted, served as the training set.
In each cross-validation experiment, under CV_{a}, 80% of the elements of Y are used as the training set and the remaining 20% as the test set; under CV_{m}, 80% of the rows of Y are used as the training set and the remaining 20% as the test set; under CV_{d}, 80% of the columns of Y are used as the training set and the remaining 20% as the test set. In Dataset I, because the disease semantic similarity matrix is sparse and the miRNA functional similarity relies on known miRNA-disease interactions, most methods perform only the CV_{a} experiment. Therefore, we perform only CV_{a} on Dataset I, and perform all three types of CV on Dataset II.
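The three splitting schemes can be sketched as a single generator. This is an illustrative sketch rather than the authors' evaluation code: the function names `fold_ids` and `cv_splits`, the mode labels, and the seed handling are hypothetical. Test entries are set to 0 in the training matrix, which corresponds to "deleting" the associations of the test set.

```python
import numpy as np


def fold_ids(n, n_folds, rng):
    """Random fold label for each of n items, with folds as equal as possible."""
    return rng.permutation(n) % n_folds


def cv_splits(Y, mode, n_folds=5, seed=0):
    """Yield (Y_train, test_mask) for mode 'a' (pairs), 'm' (rows) or 'd' (columns)."""
    rng = np.random.default_rng(seed)
    nm, nd = Y.shape
    fold = np.empty((nm, nd), dtype=int)
    if mode == "a":
        known = Y != 0
        # Split known and unknown pairs separately so each fold gets both kinds.
        fold[known] = fold_ids(int(known.sum()), n_folds, rng)
        fold[~known] = fold_ids(int((~known).sum()), n_folds, rng)
    elif mode == "m":
        fold[:] = fold_ids(nm, n_folds, rng)[:, None]   # whole rows share a fold
    elif mode == "d":
        fold[:] = fold_ids(nd, n_folds, rng)[None, :]   # whole columns share a fold
    else:
        raise ValueError("mode must be 'a', 'm' or 'd'")
    for k in range(n_folds):
        test_mask = fold == k
        Y_train = np.where(test_mask, 0, Y)             # hide test associations
        yield Y_train, test_mask


# Hypothetical toy interaction matrix: 6 miRNAs, 10 diseases.
Y_toy = (np.arange(60).reshape(6, 10) % 7 == 0).astype(int)
for Y_train, test_mask in cv_splits(Y_toy, "m"):
    print(int(test_mask.all(axis=1).sum()), "miRNA row(s) held out")
```

Repeating the call with several values of `seed` and averaging the resulting AUC values mirrors the repeated-seed protocol described in the experimental settings.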
In this paper, we use grid search to find the optimal combination of parameters. For KNMBP, the parameters are as follows: the neighbor proportion parameter PN was selected from {10%, 30%, 50%, 70%, 90%}; the non-neighborhood control parameter μ_{1} and the similarity regularization parameter μ_{2} were selected from { 2^{0}, 2^{1}, 2^{2}, 2^{3}, 2^{4} }; for the Laplacian regularization parameters λ_{m} and λ_{d}, we set λ_{m} = λ_{d} and chose the value from { 2^{−2}, 2^{−1}, 2^{0}, 2^{1}, 2^{2} }. For RWRMDA, the restart probability r was selected from {0, 0.1, ⋯, 0.9} and the number of walk steps from {1, 2, 3, ⋯, 6}. For KRLSM, following the authors' recommendations, we set σ = 1 and selected the weight parameters from {0, 0.1, ⋯, 1}. For RLSMDA, the weight parameter was w = 0.5, and the regularization parameters η_{m} = η_{d} were selected from {0, 0.1, ⋯, 1}. For IMCMDA, the subspace dimension r was selected from {50, 100, ⋯, 500}.
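The grid search over the KNMBP parameters can be sketched as an exhaustive loop over the Cartesian product of the candidate values. The sketch is illustrative only: `GRID`, `grid_search` and `toy_score` are hypothetical names, and `toy_score` is a stand-in for the mean cross-validated AUC, which would require the full model. The grid for λ is assumed to be 2^−2 to 2^2, which matches the values 1/4 and 4 quoted in the parameter sensitivity analysis.

```python
from itertools import product

# Candidate values for KNMBP as listed in the text.
GRID = {
    "PN": [0.1, 0.3, 0.5, 0.7, 0.9],
    "mu1": [2.0 ** k for k in range(5)],
    "mu2": [2.0 ** k for k in range(5)],
    "lam": [2.0 ** k for k in range(-2, 3)],   # lambda_m = lambda_d (assumed range)
}


def grid_search(score_fn, grid=GRID):
    """Return (best_score, best_params) over every combination in grid."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


def toy_score(PN, mu1, mu2, lam):
    """Hypothetical stand-in for the mean AUC over the CV seeds."""
    return -abs(PN - 0.1) - abs(lam - 0.25) - 0.01 * (mu1 + mu2)


best_score, best_params = grid_search(toy_score)
print(best_params)
```

With five candidate values for each of the four parameters, the grid contains 625 combinations, each of which requires a full cross-validation run per seed.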
Cross validation
For each CV, we calculated the predicted interaction scores of the test set with the above six methods and normalized all predicted interaction scores as follows:
\[ {PS}^{\prime}\left(i,j\right)=\frac{PS\left(i,j\right)-\min PS}{\max PS-\min PS} \]
where PS(i, j) represents the predicted interaction score of miRNA m_{i} and disease d_{j}, PS′(i, j) is the normalized score, minPS represents the minimum value of PS, and maxPS represents the maximum value of PS. Then, the [0, 1] interval is divided into 1000 equal parts, each division point is selected in turn as a threshold, and the True Positive Rate (TPR, sensitivity) and False Positive Rate (FPR, 1 − specificity) are calculated at each threshold. After that, we calculate the mean TPR and FPR at each threshold across the CV folds and draw the corresponding ROC curve. Figure 2 shows the optimal AUC and the corresponding ROC curves for each model under CV. The optimal parameters of KNMBP and the corresponding AUC values are shown in Additional file 2.
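The threshold sweep described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation script: the function names are hypothetical, the area is computed with the trapezoidal rule, and an extra (0, 0) point is prepended so that the curve starts at the origin. The sketch assumes the scores are not all identical, since min-max normalization is undefined in that case.

```python
import numpy as np


def min_max(ps):
    """Rescale predicted scores to the [0, 1] interval."""
    ps = np.asarray(ps, dtype=float)
    return (ps - ps.min()) / (ps.max() - ps.min())


def roc_auc(scores, labels, n_thresholds=1000):
    """TPR/FPR at equally spaced thresholds on [0, 1], plus the area under the curve."""
    s = min_max(scores)
    labels = np.asarray(labels).astype(bool)
    thresholds = np.linspace(0.0, 1.0, n_thresholds + 1)
    tpr = np.array([(s[labels] >= t).mean() for t in thresholds])
    fpr = np.array([(s[~labels] >= t).mean() for t in thresholds])
    # Thresholds ascend, so FPR descends: reverse both and start the curve at (0, 0).
    fpr = np.concatenate(([0.0], fpr[::-1]))
    tpr = np.concatenate(([0.0], tpr[::-1]))
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc


# Hypothetical scores for four test pairs; 1 marks a known interaction.
fpr, tpr, auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)
```

Averaging the `tpr` and `fpr` arrays across folds, threshold by threshold, gives the mean ROC curve described in the text.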
In the above experiment, CV_{a} tested the predictive performance of the model for new interactions, and CV_{m} and CV_{d} tested the predictive performance for new miRNAs and new diseases, respectively. As Fig. 2 shows, our method (KNMBP) achieves the best prediction results. Specifically, on Dataset I, the AUC value of KNMBP for CV_{a} reaches 0.93126, which is 9.67, 5.69, 11.57, 3.41, and 10.31% higher than RWRMDA, RLSMDA, BNPMDA, KRLSM, and IMCMDA, respectively. On Dataset II, the AUC value of KNMBP for CV_{a} reaches 0.93795, which is 7.97, 3.58, 13.68, 5.31 and 16.49% higher than the other five methods, respectively. Because BNPMDA, which is based on a binary recommendation algorithm, needs known miRNA-disease interactions to perform resource allocation, it cannot predict new miRNAs or new diseases [16]. RWRMDA, which restarts the random walk on the miRNA similarity network, is also not suitable for predicting new diseases [13]. Therefore, RLSMDA, KRLSM and IMCMDA were selected as comparison algorithms under CV_{d}, where the AUC value of KNMBP reaches 0.86363, which is 7.66, 25.577 and 12.93% higher than the other three methods (RLSMDA, KRLSM, IMCMDA), respectively. For CV_{m}, the AUC of KNMBP reaches 0.86937, which is 0.62, 0.67, 11.09, 5.31 and 12.68% higher than the other four methods (RWRMDA, RLSMDA, KRLSM, IMCMDA), respectively.
Parametric sensitivity analysis
In machine learning, the optimal parameter combination may differ considerably between experimental scenarios, and parameter selection may have a large impact on the performance of a model, so parameter sensitivity analysis is often very important. In this section, we focus on the influence of four parameters on the prediction performance of the model: the neighbor proportion parameter PN, the Laplacian regularization parameter λ = λ_{m} = λ_{d}, the non-neighborhood control parameter μ_{1} and the similarity regularization parameter μ_{2}. Let F_{cv = c}(PN = i, λ = j, μ_{1} = s, μ_{2} = t) represent the AUC value of the KNMBP algorithm when cross-validation setting cv = c, c ∈ {1, 2, 3, 4}, is performed and the parameters are set to PN = i, λ = j, μ_{1} = s, μ_{2} = t. To facilitate visualization of the results, for each type of CV we combined the four parameters in pairs and analyzed the influence of each pair on the predictions of the model.
First, we consider the influence of neighbor proportion parameter PN and Laplace regularization parameter λ on the predictive performance of the model. When PN = i, λ = j, and the other two parameters change arbitrarily, we calculate the maximum AUC value of KNMBP (\( {\mathrm{maxAUC}}_{i,j}^c \)), the average AUC value (\( {\mathrm{meanAUC}}_{i,j}^c \)) and the minimum AUC value (\( {\mathrm{minAUC}}_{i,j}^c \)), as shown below:
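A reconstruction of these three statistics from the definitions in the text, assuming the aggregation runs over every (μ_{1}, μ_{2}) combination in the stated range; the number (20) follows the later reference to this display:

```latex
\[
\begin{aligned}
\mathrm{maxAUC}_{i,j}^{c}  &= \max\, F_{cv=c}\left(PN=i,\ \lambda=j,\ \mu_1\in\forall,\ \mu_2\in\forall\right),\\
\mathrm{meanAUC}_{i,j}^{c} &= \operatorname{mean}\, F_{cv=c}\left(PN=i,\ \lambda=j,\ \mu_1\in\forall,\ \mu_2\in\forall\right),\\
\mathrm{minAUC}_{i,j}^{c}  &= \min\, F_{cv=c}\left(PN=i,\ \lambda=j,\ \mu_1\in\forall,\ \mu_2\in\forall\right).
\end{aligned}
\tag{20}
\]
```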
Where μ_{1} ∈ ∀ and μ_{2} ∈ ∀ represent arbitrary values of the parameters μ_{1} and μ_{2} within their range (μ_{1} , μ_{2}∈ { 2^{0}, 2^{1}, 2^{2}, 2^{3}, 2^{4} }). When cv = 1, it means we perform CV_{a} on Dataset I; cv = 2 means we perform CV_{a} on Dataset II; cv = 3 means we perform CV_{d} on Dataset II; cv = 4 means we perform CV_{m} on Dataset II. In particular, under a certain CV, for every set of values of PN and λ, we first calculate the AUC values when μ_{1} and μ_{2} are arbitrarily changed within their range, then calculate the maximum, average and minimum values of this group of AUC values according to (20), and the results are shown in Fig. 3.
It can be seen from Fig. 3 that the AUC value of the model fluctuates as the neighbor proportion parameter PN and the Laplace regularization parameter λ change, but the overall fluctuation range is small. Specifically, as shown in (a) of Fig. 3, the minAUC is 0.92322 when PN = 0.1 and λ = 4, and the maxAUC is 0.93126 when PN = 0.1 and λ = 1/4, an overall relative change of 0.87%. Similarly, in (b), (c), and (d) of Fig. 3, the overall relative changes in AUC caused by PN or λ are 0.56, 0.61, and 0.29%, respectively. These results show that KNMBP is highly stable with respect to the neighbor proportion parameter PN and the Laplace regularization parameter λ.
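The aggregation and the relative change can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' MATLAB code: `summarize` and `relative_range` are hypothetical names, the grid is assumed to be a dictionary keyed by (PN, λ, μ_{1}, μ_{2}), and the relative change is assumed to be measured against the minimum AUC, which is the denominator that reproduces the 0.87% quoted for panel (a).

```python
from statistics import mean


def summarize(grid, fixed=(0, 1)):
    """Group AUC values by the parameters at positions `fixed`.

    grid maps (PN, lam, mu1, mu2) -> AUC. With fixed=(0, 1) the values are
    grouped by (PN, lam) and aggregated over every (mu1, mu2) combination;
    fixed=(2, 3) gives the complementary analysis.
    Returns {key: (maxAUC, meanAUC, minAUC)}.
    """
    groups = {}
    for params, auc in grid.items():
        key = tuple(params[p] for p in fixed)
        groups.setdefault(key, []).append(auc)
    return {key: (max(v), mean(v), min(v)) for key, v in groups.items()}


def relative_range(auc_values):
    """Spread of the AUC values relative to the smallest one (assumed denominator)."""
    lo, hi = min(auc_values), max(auc_values)
    return (hi - lo) / lo


# Panel (a) of Fig. 3: reported overall minAUC and maxAUC for CV_a on Dataset I.
print(f"{relative_range([0.92322, 0.93126]):.2%}")  # prints 0.87%
```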
Now we consider the non-neighborhood control parameter μ_{1} and the similarity regularization parameter μ_{2}. Similarly, when μ_{1} = s, μ_{2} = t, and the other two parameters change arbitrarily, we calculate the maximum AUC value of KNMBP (\( {\mathrm{maxAUC}}_{s,t}^c \)), the average AUC value (\( {\mathrm{meanAUC}}_{s,t}^c \)) and the minimum AUC value (\( {\mathrm{minAUC}}_{s,t}^c \)), as shown below:
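A reconstruction of this display by analogy with the definitions in the text, assuming the aggregation runs over every (PN, λ) combination in the stated range; the equation number (21) is an assumption:

```latex
\[
\begin{aligned}
\mathrm{maxAUC}_{s,t}^{c}  &= \max\, F_{cv=c}\left(PN\in\forall,\ \lambda\in\forall,\ \mu_1=s,\ \mu_2=t\right),\\
\mathrm{meanAUC}_{s,t}^{c} &= \operatorname{mean}\, F_{cv=c}\left(PN\in\forall,\ \lambda\in\forall,\ \mu_1=s,\ \mu_2=t\right),\\
\mathrm{minAUC}_{s,t}^{c}  &= \min\, F_{cv=c}\left(PN\in\forall,\ \lambda\in\forall,\ \mu_1=s,\ \mu_2=t\right).
\end{aligned}
\tag{21}
\]
```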
Where PN ∈ ∀ and λ ∈ ∀ represent arbitrary values of the parameters PN and λ within their range (PN ∈ {10%, 30%, 50%, 70%, 90%} , λ∈ { 2^{−2}, 2^{−1}, 2^{0}, 2^{1}, 2^{2} }). The effect of these two parameters on the prediction performance of the model is shown in Additional file 3. As can be seen from (a), (b), (c) and (d) in Additional file 3, when the parameters μ_{1} and μ_{2} change within their range, the maxAUC, meanAUC and minAUC values of the model are almost flat, indicating that these two parameters have little influence on the prediction performance of the model. According to Fig. 3 and Additional file 3, KNMBP consistently achieves good prediction performance when its parameters change within these ranges, indicating that our algorithm has strong parameter robustness.
Case study
To further demonstrate the predictive performance of the KNMBP algorithm for novel miRNA-disease interactions, experiments were performed on the older version of HMDD (v2.0, June 20, 2013), and the prediction results were validated with the newer version of HMDD (v3.0, June 28, 2018). We downloaded the miRNA-disease interactions from HMDD v2.0 and extracted the disease data with MeSH ID or OMIM ID according to the details of the disease provided by HMDD v3.0. After processing, we obtained 2157 interactions between 166 diseases and 299 miRNAs, and constructed semantic similarity scores for these diseases and functional similarity scores for these miRNAs according to (2.2.1) and (2.2.2). KNMBP was used for prediction, and the candidate miRNAs of the 166 diseases, ranked according to their predicted scores, are provided in Additional file 4. Figure 4 shows the confirmed ratio of candidate miRNAs for 11 diseases under different thresholds. For example, the 10 top-scoring candidate miRNAs for Bladder Neoplasms are all confirmed in HMDD v3.0, and twenty-seven of the top 30 are confirmed in HMDD v3.0. As can be seen from Fig. 4, most of the top candidate miRNAs for these diseases can be confirmed in the latest version.
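The confirmed ratio at a threshold is simply the fraction of the top-k candidates that appear in the validation database. A minimal Python sketch, assuming candidates are already sorted by decreasing score; `confirmed_ratio` is a hypothetical helper name:

```python
def confirmed_ratio(ranked_candidates, confirmed, k):
    """Fraction of the k top-ranked candidates that are in the `confirmed` set.

    ranked_candidates: miRNA identifiers sorted by decreasing predicted score.
    confirmed: identifiers found in the validation database for this disease.
    """
    top = ranked_candidates[:k]
    if not top:
        return 0.0
    return sum(1 for mirna in top if mirna in confirmed) / len(top)
```

With 27 confirmed candidates among the top 30, as reported for Bladder Neoplasms, the ratio at k = 30 is 27/30 = 0.9.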
In addition, to further test the validity of the predicted results, we divided the candidate miRNAs for each disease into two groups according to the predicted scores, called the Top group and the Bottom group [19], with 20 candidate miRNAs in each group, and then used Fisher's exact test to evaluate the statistical difference between the two groups. Figure 5 shows the proportion of confirmed candidate miRNAs in the Top group and Bottom group of four diseases and the significance level p given by Fisher's exact test. For example, 18 of the candidate miRNAs in the Top group of Colon Neoplasms were confirmed (proportion of 0.9), and 2 in the Bottom group were confirmed (proportion of 0.1), with a p value of 5.2959 × 10^{−7}. This suggests that the candidate miRNAs of Colon Neoplasms in the Top group are more likely to be confirmed than those in the Bottom group. Meanwhile, the p values were 1.4509 × 10^{−11} , 3.5997 × 10^{−4} , 2.4436 × 10^{−4} for Bladder Neoplasms, Glioma, and Ovarian Neoplasms, respectively. The test results verified that the number of confirmed miRNAs in the Top group was significantly higher than that in the Bottom group, which further demonstrates the effectiveness of the KNMBP algorithm in predicting new miRNA-disease interactions.
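The Colon Neoplasms figure can be checked by hand. Below is a minimal Python sketch of a two-sided Fisher's exact test on the 2 × 2 table (confirmed / not confirmed by Top / Bottom), written with the standard library only. Treating the reported p value as two-sided is an assumption, and `fisher_exact_two_sided` is a hypothetical name.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # The small tolerance keeps tables tied with the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Colon Neoplasms: Top group 18 confirmed / 2 not, Bottom group 2 confirmed / 18 not.
p = fisher_exact_two_sided(18, 2, 2, 18)
print(f"{p:.4e}")  # 5.2959e-07
```

The result agrees with the p value of 5.2959 × 10^{−7} reported for Colon Neoplasms.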
Additional file 5 lists the top 10 candidate miRNAs for these four diseases and their confirmation in HMDD v3.0 [31], miRCancer [37] and dbDEMC 2.0 [38]. Specifically, for Bladder Neoplasms and Colon Neoplasms, the top 10 candidate miRNAs were all confirmed in HMDD v3.0; for Glioma, 8 were confirmed in HMDD v3.0 and one was confirmed in miRCancer; for Ovarian Neoplasms, 9 were confirmed in HMDD v3.0 and one was confirmed in dbDEMC 2.0. Finally, all the interactions in Dataset II, extracted from the current latest database, were used as the training set, and the candidate miRNAs of 579 diseases predicted by the KNMBP algorithm were sorted according to their scores, as shown in Additional file 6.
Discussion
The KNMBP model proposed in this paper not only performs well in predicting unknown miRNA-disease interactions, but can also efficiently make predictions for a new miRNA (disease) that is not associated with any disease (miRNA). To evaluate the performance of the model fairly, we compared it with several state-of-the-art models under 5-fold cross validation (CV) on the commonly used dataset (Dataset I) and on the dataset extracted by ourselves (Dataset II). On Dataset I, the AUC value of KNMBP reaches 0.93126 when we perform CV on interactions. On Dataset II, the AUC value of KNMBP reaches 0.93795, 0.86937 and 0.86363 when we perform CV on interactions, on miRNAs and on diseases, respectively. The predicted results of our method were all better than those of the other methods. To evaluate the predictive performance of KNMBP for new miRNA-disease interactions, we extracted the data from the old version of the database and tested the predicted results against the new version. Statistical results for 11 diseases showed that most of the top candidate miRNAs could be confirmed in the new version of the dataset. We divided the candidate miRNAs of four common tumors into the Top group and the Bottom group according to the predicted scores. The results of Fisher's exact test further confirmed that the number of confirmed miRNAs in the Top group was significantly higher than that in the Bottom group. In addition, the results of the parameter sensitivity analysis show that the KNMBP algorithm is robust when its parameters vary over a wide range.
The higher performance of the KNMBP algorithm is mainly due to the following aspects. First, we constructed a more reasonable disease semantic similarity network and miRNA functional similarity network. Specifically, instead of using the Directed Acyclic Graph (DAG) alone to describe disease similarity, we combined gene-disease interactions, disease-GO biological process interactions and the MeSH descriptors to calculate disease similarity, mining the similarity information between diseases more fully to obtain a denser and more accurate disease similarity network. In addition, previous methods for constructing the miRNA functional similarity network mostly rely on known miRNA-disease interactions and therefore cannot predict new miRNAs. In this paper, the miRNA functional similarity is calculated by integrating the miRNA-target gene interaction network and the gene weight network, which avoids dependence on known miRNA-disease interactions and makes prediction for new miRNAs possible. Secondly, to overcome the sparseness of the miRNA-disease interaction network and fully exploit the miRNA (disease) feature information, we used the weighted K neighborhood profiles to apply a weighted correction to the sparse interaction network, taking advantage of neighborhood information to reduce its sparsity. Meanwhile, we used KSNS to calculate the miRNA (disease) kernel neighborhood similarity. Unlike Gaussian function similarity and linear neighborhood similarity [20], KSNS not only makes full use of non-neighborhood information, but also captures the nonlinear structural similarity between samples, considering both the distance similarity and the structural similarity of samples. Thirdly, we used diffusion component analysis to integrate the heterogeneous omics data of disease similarity and miRNA similarity, respectively. The fused miRNA (disease) similarity network can not only effectively utilize the feature information among the known interactions, but also reflect the new similarity information obtained from other omics data. Fourthly, the bidirectional propagation algorithm simultaneously spreads the known miRNA-disease interactions over the similarity networks of both diseases and miRNAs, making full use of the global network information of miRNAs and diseases.
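To make the idea of propagating in both directions concrete, here is a minimal Python sketch. It uses a generic label-propagation update, F ← αWF + (1 − α)Y, as a stand-in and is not the update rule defined in the paper's Methods; the function names, the row normalization, the value of `alpha` and the final averaging of the two sides are all assumptions made for illustration.

```python
def row_normalize(S):
    """Zero the diagonal and scale each row to sum to 1 (all-zero rows stay zero)."""
    W = []
    for i, row in enumerate(S):
        r = [0.0 if i == j else float(v) for j, v in enumerate(row)]
        total = sum(r)
        W.append([v / total for v in r] if total > 0 else r)
    return W


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def propagate(W, Y, alpha, iters):
    """Iterate F <- alpha * W F + (1 - alpha) * Y, starting from F = Y."""
    F = [row[:] for row in Y]
    for _ in range(iters):
        WF = matmul(W, F)
        F = [
            [alpha * wf + (1 - alpha) * y for wf, y in zip(wf_row, y_row)]
            for wf_row, y_row in zip(WF, Y)
        ]
    return F


def bidirectional_scores(S_m, S_d, Y, alpha=0.5, iters=100):
    """Average of propagation over the miRNA network and over the disease network.

    S_m: miRNA similarity (n x n), S_d: disease similarity (m x m),
    Y: known interactions (n x m, rows = miRNAs).
    """
    F_m = propagate(row_normalize(S_m), Y, alpha, iters)
    F_d = transpose(propagate(row_normalize(S_d), transpose(Y), alpha, iters))
    return [[(a + b) / 2 for a, b in zip(rm, rd)] for rm, rd in zip(F_m, F_d)]
```

In this sketch a miRNA with no known interactions still receives scores through the miRNA-side propagation from its similar neighbors, and a disease with no known interactions receives scores through the disease side, which is the intuition behind using both directions.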
Although KNMBP efficiently predicted unknown miRNA-disease interactions, it has some limitations. First, we built the disease semantic similarity networks and miRNA functional similarity networks from other recent data resources, and there may be noise and errors in these similarity networks. Secondly, our evaluation is based on the known miRNA-disease interactions, which may be incomplete. Although the number of known miRNA-disease interactions has grown considerably in recent years, they still account for a very small proportion of all miRNA-disease pairs, which leads to some errors in the evaluation of our prediction results.
Conclusion
Studies on potential miRNA-disease interactions can help people understand the pathogenesis of diseases and design reasonable treatment schemes. In this paper, we proposed a new computational model (KNMBP) to predict potential miRNA-disease interactions. Compared with other state-of-the-art methods, KNMBP not only has higher prediction accuracy on unknown miRNA-disease interactions, but can also effectively find potential interactions of a new disease (or miRNA) without any known related miRNA (or disease). Furthermore, the proposed model is not sensitive to its parameters. These results indicate that our algorithm can integrate multiple omics data of miRNAs and diseases, and has wide application prospects in miRNA and disease research.
Availability of data and materials
The code and datasets are available at https://github.com/Mayingjun20179/KNMBP. The software is written in MATLAB and was developed on Windows.
Abbreviations
 clusDCA: Improved Diffusion Component Analysis
 DAG: Directed Acyclic Graph
 KNMBP: Kernel neighborhood similarity and multi-network bidirectional propagation
 KSNS: Kernel-based neighborhood similarity model
 PLS: Partial least squares
 SVM: Support vector machine
 WKNNP: Weighted k-neighborhood profile
References
 1. Filipowicz W, Bhattacharyya SN, Sonenberg N. Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet. 2008;9(2):102–14.
 2. Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136(2):215–33.
 3. Shabalina S, Koonin E. Origins and evolution of eukaryotic RNA interference. Trends Ecol Evol. 2008;23(10):578–87.
 4. Guay C, Roggli E, Nesca V, Jacovetti C, Regazzi R. Diabetes mellitus, a microRNA-related disease? Transl Res. 2011;157(4):253–64.
 5. Nunez-Iglesias J, Liu CC, Morgan TE, Finch CE, Zhou XJ. Joint genome-wide profiling of miRNA and mRNA expression in Alzheimer's disease cortex reveals altered miRNA regulation. PLoS One. 2010;5(2):e8898.
 6. Catto JWF, Alcaraz A, Bjartell AS, De Vere WR, Evans CP, Fussel S, Hamdy FC, Kallioniemi O, Mengual L, Schlomm T, et al. MicroRNA in prostate, bladder, and kidney Cancer: a systematic review. Eur Urol. 2011;59(5):671–81.
 7. Poy MN, Hausser J, Trajkovski M, Braun M, Collins S, Rorsman P, Zavolan M, Stoffel M. miR-375 maintains normal pancreatic alpha- and beta-cell mass. Proc Natl Acad Sci U S A. 2009;106(14):5813–8.
 8. Asangani IA, Rasheed SAK, Nikolova DA, Leupold JH, Colburn NH, Post S, Allgayer H. MicroRNA-21 (miR-21) post-transcriptionally downregulates tumor suppressor Pdcd4 and stimulates invasion, intravasation and metastasis in colorectal cancer. Oncogene. 2008;27(15):2128–36.
 9. Minn YK, Lee DH, Hyung WJ, Kim JE, Choi J, Yang SH, Song H, Lim BJ, Kim SH. MicroRNA-200 family members and ZEB2 are associated with brain metastasis in gastric adenocarcinoma. Int J Oncol. 2014;45(6):2403–10.
 10. Li Y, Zhang Z, Mao Y, Jin M, Jing F, Ye Z, Chen K. A genetic variant in MiR-146a modifies digestive system Cancer risk: a meta-analysis. Asian Pac J Cancer Prev. 2014;15(1):145–50.
 11. Wang D, Wang J, Lu M, Song F, Cui Q. Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases. Bioinformatics. 2010;26(13):1644–50.
 12. Xu J, Li CX, Lv JY, Li YS, Xiao Y, Shao TT, Huo X, Li X, Zou Y, Han QL, et al. Prioritizing candidate disease miRNAs by topological features in the miRNA target-dysregulated network: case study of prostate Cancer. Mol Cancer Ther. 2011;10(10):1857–66.
 13. Chen X, Liu M, Yan G. RWRMDA: predicting novel human microRNA–disease associations. Mol BioSyst. 2012;8(10):2792–8.
 14. Chen X, Yan G. Semi-supervised learning for potential human microRNA-disease associations inference. Sci Rep-UK. 2015;4(5501):1–10.
 15. Chen X, Yang J, Guan N, Li J. GRMDA: graph regression for MiRNA-disease association prediction. Front Physiol. 2018;9(92):1–10.
 16. Chen X, Xie D, Wang L, Zhao Q, You Z, Liu H. BNPMDA: bipartite network projection for MiRNA–disease association prediction. Bioinformatics. 2018;34(18):3178–86.
 17. Chen X, Wang L, Qu J, et al. Predicting miRNA-disease association based on inductive matrix completion. Bioinformatics. 2018;34(24):4256–65.
 18. Luo J, Xiao Q, Liang C, Ding P. Predicting MicroRNA-disease associations using Kronecker regularized least squares based on heterogeneous Omics data. IEEE Access. 2017;5:2503–13.
 19. Xiao Q, Luo J, Liang C, Cai J, Ding P. A graph regularized non-negative matrix factorization method for identifying microRNA-disease associations. Bioinformatics. 2018;34(2):239–48.
 20. Zhang W, Qu Q, Zhang Y, Wang W. The linear neighborhood propagation method for predicting long noncoding RNA–protein interactions. Neurocomputing. 2018;273:526–34.
 21. Li Y, Qiu C, Tu J, Geng B, Yang J, Jiang T, Cui Q. HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucleic Acids Res. 2013;42(D1):D1070–4.
 22. Xuan P, Han K, Guo M, Guo Y, Li J, Ding J, Liu Y, Dai Q, Li J, Teng Z, et al. Prediction of microRNAs associated with human diseases based on weighted k most similar neighbors. PLoS One. 2013;8(8):e70204.
 23. Lu M, Zhang Q, Deng M, Miao J, Guo Y, Gao W, Cui Q. An analysis of human MicroRNA and disease associations. PLoS One. 2008;3(10):e3420.
 24. Davis AP, Grondin CJ, Johnson RJ, Sciaky D, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The comparative Toxicogenomics database: update 2019. Nucleic Acids Res. 2019;47(D1):D948–54.
 25. Karagkouni D, Paraskevopoulou MD, Chatzopoulos S, Vlachos IS, Tastsoglou S, Kanellos I, Papadimitriou D, Kavakiotis I, Maniou S, Skoufos G, et al. DIANA-TarBase v8: a decade-long collection of experimentally supported miRNA–gene interactions. Nucleic Acids Res. 2018;46(D1):D239–45.
 26. Chou C, Shrestha S, Yang C, Chang N, Lin Y, Liao K, Huang W, Sun T, Tu S, Lee W, et al. miRTarBase update 2018: a resource for experimentally validated microRNA-target interactions. Nucleic Acids Res. 2018;46(D1):D296–302.
 27. Hsu SD, Chu CH, Tsou AP, Chen SJ, Chen HC, Hsu PWC, Wong YH, Chen YH, Chen GH, Huang HD. miRNAMap 2.0: genomic maps of microRNAs in metazoan genomes. Nucleic Acids Res. 2007;36(Database):D165–9.
 28. Xiao F, Zuo Z, Cai G, Kang S, Gao X, Li T. miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res. 2009;37(Database):D105–10.
 29. Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res. 2019;47(D1):D155–62.
 30. Lee I, Blom UM, Wang PI, Shim JE, Marcotte EM. Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res. 2011;21(7):1109–21.
 31. Huang Z, Shi J, Gao Y, Cui C, Zhang S, Li J, Zhou Y, Cui Q. HMDD v3.0: a database for experimentally supported human microRNA–disease associations. Nucleic Acids Res. 2019;47(D1):D1013–7.
 32. Hu Y, Zhao T, Zhang N, Zang T, Zhang J, Cheng L. Identifying diseases-related metabolites using random walk. BMC Bioinformatics. 2018;19(S5):37–46.
 33. Deng L, Ye D, Zhao J, Zhang J. Exploring Disease Similarity by Integrating Multiple Data Sources. In: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Madrid: IEEE; 2018. p. 853–8.
 34. Wang S, Cho H, Zhai C, Berger B, Peng J. Exploiting ontology graph for predicting sparsely annotated gene function. Bioinformatics. 2015;31(12):i357–64.
 35. Ma Y, Yu L, He T, Hu X, Jiang X. Prediction of long non-coding RNA-protein interaction through kernel soft-neighborhood similarity. In: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Madrid: IEEE; 2018. p. 193–6.
 36. Liu Y, Wu M, Miao C, Zhao P, Li X. Neighborhood regularized logistic matrix factorization for drug-target interaction prediction. PLoS Comput Biol. 2016;12(2):e1004760.
 37. Xie B, Ding Q, Han H, Wu D. miRCancer: a microRNA-cancer association database constructed by text mining on literature. Bioinformatics. 2013;29(5):638–44.
 38. Yang Z, Wu L, Wang A, Tang W, Zhao Y, Zhao H, Teschendorff AE. dbDEMC 2.0: updated database of differentially expressed miRNAs in human cancers. Nucleic Acids Res. 2017;45(D1):D812–8.
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of this paper.
About this supplement
This article has been published as part of BMC Medical Genomics Volume 12 Supplement 10, 2019: Selected articles from the IEEE BIBM International Conference on Bioinformatics & Biomedicine (BIBM) 2018: medical genomics. The full contents of the supplement are available online at https://bmcmedgenomics.biomedcentral.com/articles/supplements/volume-12-supplement-10.
Funding
The research was supported by the National Key Research and Development Program of China (2017YFC0909502), the National Natural Science Foundation of China (61532008, 61872157). Specifically, the publication costs are funded by the National Key Research and Development Program of China (2017YFC0909502).
Author information
Contributions
YM and XJ designed the miRNA-disease interaction prediction method based on kernel neighborhood similarity and multi-network bidirectional propagation. YM and XJ designed the experiments and wrote the manuscript. LG provided biological background guidance. CZ and TH participated in the discussion of the model and gave suggestions. TH supervised and helped conceive the study. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1.
Details of the two benchmark data sets in the paper.
Additional file 2.
The optimal parameters and the optimal AUC values for the different experimental settings on the two benchmark data sets.
Additional file 3.
The influence of the non-neighborhood control parameter μ_{1} and the similarity regularization parameter μ_{2} on the predictive performance of the model.
Additional file 4.
The prediction scores of 199 new diseases and the candidate miRNAs sorted by score, obtained using the data set extracted from the old version of HMDD.
Additional file 5.
The top 10 candidate miRNAs of the four diseases predicted by KNMBP based on the old version.
Additional file 6.
The candidate miRNAs of 579 diseases, ranked according to the predicted score, using the data set extracted from the new version of HMDD.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Ma, Y., He, T., Ge, L. et al. MiRNA-disease interaction prediction based on kernel neighborhood similarity and multi-network bidirectional propagation. BMC Med Genomics 12, 185 (2019). https://doi.org/10.1186/s12920-019-0622-4
Keywords
 MicroRNA-disease interaction
 Heterogeneous omics data
 Kernel neighborhood similarity
 Bidirectional propagation
 Diffusion component analysis