### Materials

In this paper, we used the datasets from the study of Zhao et al. [7], namely Additional files 1, 2, 3, 4, and 5 of that study. These datasets contain 190 diseases, 111 lncRNAs and 264 miRNAs, as described below:

#### Known lncRNA-miRNA associations

The known lncRNA-miRNA associations were collected from starBase v2.0 [22] in February 2017, which provided the most comprehensive experimentally confirmed lncRNA-miRNA interactions based on large-scale CLIP-Seq data. After eliminating duplicate values and erroneous data and removing lncRNAs not included in the DS2 dataset, we obtained the DS1 dataset, which contains 1880 known lncRNA-miRNA associations.

#### Known lncRNA-disease associations

The known lncRNA-disease associations were collected from 8842 known disease-lncRNA associations in the MNDR database [23] and 2934 known disease-lncRNA associations in the LncRNADisease database [24]. Because the disease names came from two different databases, diseases without any MeSH descriptors were eliminated and diseases with the same MeSH descriptors were merged. After also removing the lncRNAs that were not included in the lncRNA-miRNA dataset (DS1), 936 known associations between diseases and lncRNAs (DS2) remained.

#### Known disease-miRNA associations

The known human miRNA-disease associations were downloaded from the HMDD V2.0 database [25]. This dataset (DS3) contains 3252 high-quality miRNA-disease associations, obtained after eliminating duplicate associations and miRNA-disease associations involving other diseases or lncRNAs that were not contained in the DS1 or DS2 datasets.

### Method overview

In this paper, we propose a new method to infer miRNA-disease associations. The flowchart of the proposed method is illustrated in Fig. 1. The method contains four main stages. In the first stage, we construct a tripartite graph G^{0} from known miRNA-disease associations, known lncRNA-disease associations, and known miRNA-lncRNA interactions. G^{0} is represented by three adjacency matrices, *A*^{0}_{MD}, *A*^{0}_{ML} and *A*^{0}_{DL}, where *A*^{0}_{MD} is the adjacency matrix between miRNAs and diseases, *A*^{0}_{ML} is the adjacency matrix between miRNAs and lncRNAs, and *A*^{0}_{DL} is the adjacency matrix between diseases and lncRNAs. In the second stage, to address the data imbalance problem, we apply a collaborative filtering algorithm to G^{0} to obtain a tripartite graph G^{u}. G^{u} is represented by three adjacency matrices, *A*^{u}_{MD}, *A*^{u}_{ML} and *A*^{0}_{DL}, where *A*^{u}_{MD} and *A*^{u}_{ML} are obtained by updating *A*^{0}_{MD} and *A*^{0}_{ML} with the collaborative filtering algorithm. In the third stage, a resource allocation algorithm is applied to G^{u} to calculate the final resource score (*Rscore_final*) of the miRNA candidates for each disease. In the final stage, we rank all miRNA candidates for each disease by *Rscore_final* in descending order, so that candidates with a greater *Rscore_final* are considered more likely to be verified in the future.

### Construction of a tripartite graph G^{0}

Inspired by previous studies [19, 20] that inferred lncRNA-disease associations by using a tripartite graph, we first construct a miRNA-disease-lncRNA tripartite graph G^{0} as follows:

#### Construction of known miRNA-disease association graph

Let *M* = {*m*_{k}; *k* = 1,…, *n*_{m}} denote the set of miRNAs and *D* = {*d*_{j}; *j* = 1,…, *n*_{d}} denote the set of diseases, where *n*_{m} and *n*_{d} are the numbers of miRNAs and diseases, respectively. We build a graph *MD*^{0} based on the known miRNA-disease associations. *MD*^{0} is represented by its adjacency matrix *A*^{0}_{MD}. The entry *A*^{0}_{MD}(*m*_{k}, *d*_{j}) is the element in the *k*th row and *j*th column of *A*^{0}_{MD}, with *A*^{0}_{MD}(*m*_{k}, *d*_{j}) = 1 if miRNA *m*_{k} is associated with disease *d*_{j} and *A*^{0}_{MD}(*m*_{k}, *d*_{j}) = 0 otherwise.

#### Construction of known miRNA-lncRNA interaction graph

In the same way, let *M* = {*m*_{k}; *k* = 1,…, *n*_{m}} denote the set of miRNAs and *L* = {*l*_{i}; *i* = 1,…, *n*_{l}} denote the set of lncRNAs, where *n*_{m} and *n*_{l} are the numbers of miRNAs and lncRNAs, respectively. The graph *ML*^{0} is built on the known miRNA-lncRNA interactions, and *A*^{0}_{ML} is its adjacency matrix. The entry *A*^{0}_{ML}(*m*_{k}, *l*_{i}) is the element in the *k*th row and *i*th column of *A*^{0}_{ML}, with *A*^{0}_{ML}(*m*_{k}, *l*_{i}) = 1 if miRNA *m*_{k} interacts with lncRNA *l*_{i} and *A*^{0}_{ML}(*m*_{k}, *l*_{i}) = 0 otherwise.

#### Construction of known disease-lncRNA association graph

Similarly, let *D* = {*d*_{j}; *j* = 1,…, *n*_{d}} denote the set of diseases and *L* = {*l*_{i}; *i* = 1,…, *n*_{l}} denote the set of lncRNAs, where *n*_{d} and *n*_{l} are the numbers of diseases and lncRNAs, respectively. The graph *DL*^{0} is built on the known disease-lncRNA associations, and *A*^{0}_{DL} is its adjacency matrix. The entry *A*^{0}_{DL}(*d*_{j}, *l*_{i}) is the element in the *j*th row and *i*th column of *A*^{0}_{DL}, with *A*^{0}_{DL}(*d*_{j}, *l*_{i}) = 1 if disease *d*_{j} is associated with lncRNA *l*_{i} and *A*^{0}_{DL}(*d*_{j}, *l*_{i}) = 0 otherwise.

#### Construction of a tripartite graph G^{0}

By integrating the three graphs *MD*^{0}, *ML*^{0} and *DL*^{0}, we obtain the tripartite graph G^{0}, which is represented by the three adjacency matrices *A*^{0}_{MD}, *A*^{0}_{ML} and *A*^{0}_{DL} defined above.
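The three adjacency matrices can be sketched in a few lines of NumPy. This is a minimal illustration only: the helper `build_adjacency`, the toy identifiers (`m1`, `d1`, `l1`, ...) and the toy association pairs are assumptions made for the example and do not come from the DS1, DS2 or DS3 datasets.

```python
import numpy as np

def build_adjacency(pairs, row_ids, col_ids):
    """Return a 0/1 matrix whose rows follow row_ids and columns follow col_ids."""
    row_index = {name: i for i, name in enumerate(row_ids)}
    col_index = {name: i for i, name in enumerate(col_ids)}
    A = np.zeros((len(row_ids), len(col_ids)), dtype=int)
    for r, c in pairs:
        A[row_index[r], col_index[c]] = 1  # known association -> 1
    return A

# Hypothetical toy identifiers (not taken from the paper's datasets).
mirnas = ["m1", "m2", "m3"]
diseases = ["d1", "d2"]
lncrnas = ["l1", "l2"]

# A0_MD: miRNA x disease, A0_ML: miRNA x lncRNA, A0_DL: disease x lncRNA.
A0_MD = build_adjacency([("m1", "d1"), ("m2", "d1"), ("m3", "d2")], mirnas, diseases)
A0_ML = build_adjacency([("m1", "l1"), ("m2", "l1"), ("m2", "l2")], mirnas, lncrnas)
A0_DL = build_adjacency([("d1", "l1"), ("d2", "l2")], diseases, lncrnas)
```

Together, the three arrays play the role of the tripartite graph G^{0}; no separate graph object is needed for the matrix operations used later.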

### Construction of a tripartite graph *G*^{u}

In the tripartite graph G^{0}, the numbers of known associations between miRNAs and diseases and between miRNAs and lncRNAs are small. Consequently, for any given lncRNA node *l*_{i} and disease node *d*_{j}, the number of miRNA nodes associated with both *l*_{i} and *d*_{j} will be very small. To mitigate this, we use a collaborative filtering algorithm to recommend suitable miRNA nodes to the corresponding lncRNA nodes and disease nodes. Considering that a recommender system may involve various input data, including users and items [18], we treat lncRNAs and diseases as users and miRNAs as items. From the two adjacency matrices *A*^{0}_{ML} and *A*^{0}_{MD} obtained above, we construct another adjacency matrix *A*^{0}_{MLD} = [*A*^{0}_{ML}, *A*^{0}_{MD}] by splicing *A*^{0}_{ML} and *A*^{0}_{MD} side by side, which is possible because both matrices have the same number of rows. Each row of *A*^{0}_{MLD} is the concatenation of the corresponding rows of *A*^{0}_{ML} and *A*^{0}_{MD}, and each column of *A*^{0}_{MLD} is a column of either *A*^{0}_{ML} or *A*^{0}_{MD}.

On the basis of *A*^{0}_{MLD} and the tripartite graph G^{0}, we can obtain a co-occurrence matrix *R*^{m×m}, in which the entry *R*(*m*_{k}, *m*_{r}) is the element in the *k*th row and *r*th column of *R*^{m×m}, with *R*(*m*_{k}, *m*_{r}) = 1 if and only if miRNA *m*_{k} and miRNA *m*_{r} have at least one common neighboring node in *G*^{0}, and *R*(*m*_{k}, *m*_{r}) = 0 otherwise. The common neighboring node can be an lncRNA or a disease in *G*^{0}. A similarity matrix *R*^{nor} can then be calculated by normalizing *R*^{m×m} as in the following equation:

$${\mathrm{R}}^{nor}\left({m}_{k}, {m}_{r}\right)=\frac{\left|N\left({m}_{k}\right)\bigcap N({m}_{r})\right|}{\sqrt{\left|N({m}_{k})\right|*\left|N({m}_{r})\right|}}$$

(1)

where *k* and *r* are miRNA indices. \(\left|N\left({m}_{k}\right)\right|\) is the number of known lncRNAs and diseases associated with *m*_{k} in *G*^{0}, that is, the number of elements equal to 1 in the *k*th row of *A*^{0}_{MLD}. Likewise, \(\left|N\left({m}_{r}\right)\right|\) is the number of elements equal to 1 in the *r*th row of *A*^{0}_{MLD}. \(\left|N\left({m}_{k}\right)\bigcap N({m}_{r})\right|\) is the number of known lncRNAs and diseases associated with both miRNA *m*_{k} and miRNA *m*_{r} in *G*^{0}.
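Equation (1) can be computed for all miRNA pairs at once with matrix products. The sketch below is one possible reading, under the assumption that the neighbor counts are taken directly from the rows of the spliced 0/1 matrix; the function name `mirna_similarity` and the toy matrices are illustrative and not part of the original method description.

```python
import numpy as np

def mirna_similarity(A0_ML, A0_MD):
    """Return (A0_MLD, R_nor) following Eq. (1) on the spliced matrix."""
    A0_MLD = np.hstack([A0_ML, A0_MD]).astype(float)  # [A0_ML, A0_MD]
    common = A0_MLD @ A0_MLD.T         # |N(m_k) ∩ N(m_r)| for every pair
    degree = A0_MLD.sum(axis=1)        # |N(m_k)| for every miRNA
    denom = np.sqrt(np.outer(degree, degree))
    # miRNAs without any neighbor get similarity 0 instead of dividing by zero.
    R_nor = np.divide(common, denom, out=np.zeros_like(common), where=denom > 0)
    return A0_MLD, R_nor

# Hypothetical toy data: 3 miRNAs, 2 lncRNAs, 2 diseases.
A0_ML = np.array([[1, 0], [1, 1], [0, 0]])
A0_MD = np.array([[1, 0], [1, 0], [0, 1]])
A0_MLD, R_nor = mirna_similarity(A0_ML, A0_MD)
```

In this toy case the first two miRNAs share two neighbors and have 2 and 3 neighbors respectively, so their similarity is 2/√6, while the third miRNA shares no neighbor with either of them.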

Based on the similarity matrix R^{nor} and the adjacency matrix *A*^{0}_{MLD}, we calculate a new recommender matrix *A*^{u}_{MLD} as follows:

$$A^{u}_{MLD} = \, R^{nor} * \, A^{0}_{MLD}$$

(2)

Specifically, for a particular lncRNA *l*_{i} or disease *d*_{j} in *G*^{0}, if there is a miRNA *m*_{k} satisfying *A*^{0}_{MLD}(*m*_{k}, *l*_{i}) = 1 or *A*^{0}_{MLD}(*m*_{k}, *d*_{j}) = 1, we first calculate the sum of all elements in the column of *A*^{u}_{MLD} that corresponds to *l*_{i} or *d*_{j}, respectively, and from it obtain the average value *P* of that column. Next, if that column contains a miRNA \({m}_{\theta }\) satisfying *A*^{u}_{MLD}(\({m}_{\theta }\), *l*_{i}) > *P* or *A*^{u}_{MLD}(\({m}_{\theta }\), *d*_{j}) > *P*, we recommend miRNA \({m}_{\theta }\) for lncRNA *l*_{i} or disease *d*_{j}, respectively, and add a new edge between \({m}_{\theta }\) and *l*_{i} or between \({m}_{\theta }\) and *d*_{j} to the tripartite graph G^{0}.
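The recommendation step can be sketched as follows. Two points are assumptions of this example rather than statements from the text: *P* is taken as the mean over all miRNAs in a column, and known edges are always kept even if their score does not exceed *P*. The function name `recommend_edges` and the toy matrix are hypothetical.

```python
import numpy as np

# Hypothetical toy spliced matrix A0_MLD = [A0_ML, A0_MD]:
# 3 miRNAs (rows), 2 lncRNA columns followed by 2 disease columns.
n_l = 2
A0_MLD = np.array([[1, 0, 1, 0],
                   [1, 1, 1, 0],
                   [0, 0, 0, 1]], dtype=float)

# Similarity matrix of Eq. (1).
common = A0_MLD @ A0_MLD.T
degree = A0_MLD.sum(axis=1)
denom = np.sqrt(np.outer(degree, degree))
R_nor = np.divide(common, denom, out=np.zeros_like(common), where=denom > 0)

def recommend_edges(A0, R):
    """Add an edge wherever the Eq. (2) score exceeds its column average P."""
    scores = R @ A0                        # Eq. (2): A^u_MLD = R^nor * A^0_MLD
    P = scores.mean(axis=0)                # assumed: average over all miRNAs
    has_known = A0.sum(axis=0) > 0         # column has at least one known edge
    recommended = (scores > P) & has_known
    return np.where(recommended, 1.0, A0)  # keep known edges, add recommended

A_updated = recommend_edges(A0_MLD, R_nor)
A_ML_u, A_MD_u = A_updated[:, :n_l], A_updated[:, n_l:]  # split back
```

In this toy case the only new edge links the first miRNA to the second lncRNA: its score of about 0.82 exceeds the column average of about 0.61, because the first miRNA is similar to the second miRNA, which already interacts with that lncRNA.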

Finally, we obtain a tripartite graph *G*^{u}. *G*^{u} contains three graphs, *MD*^{update}, *ML*^{update} and *DL*^{0}, and can be represented by three adjacency matrices: *A*^{u}_{MD}, *A*^{u}_{ML} and *A*^{0}_{DL}. *MD*^{update} is the graph obtained from *MD*^{0} by adding new edges between recommended miRNAs and diseases, and *ML*^{update} is the graph obtained from *ML*^{0} by adding new edges between recommended miRNAs and lncRNAs. *A*^{u}_{MD} is the adjacency matrix of *MD*^{update}; it contains 10,310 known and recommended associations and 39,850 remaining unknown associations. *A*^{u}_{ML} is the adjacency matrix of *ML*^{update}.

### Employing resource allocation process on the tripartite graph *G*^{u} to infer miRNA-disease associations

To infer miRNA-disease associations, we employ the resource allocation algorithm on the tripartite graph *G*^{u} as described in the following steps:

*Step 1*: Calculating resource allocation between miRNAs and diseases

For a specific miRNA m_{k}, we define the initial resources located on disease *d*_{j} as:

$$fd\left( {m_{k} } \right) = A^{u}_{MD} \left( {m_{k} , \, d_{j} } \right),\quad \, j = 1,2, \ldots ,n_{d}$$

(3)

where *n*_{d} is the number of diseases.

Then we calculate the resource moved back from *D* to *M* by using a weight matrix *W* = {*w*_{kt}} of size *n*_{m} × *n*_{m}, which describes the resource allocation process between miRNAs and diseases, as follows:

$$w_{kt} = \frac{1}{{\deg A_{MD}^{u} \left( {m_{k} } \right)}}*\mathop \sum \limits_{j = 1}^{{n_{d} }} \frac{{A_{MD }^{u} \left( {m_{k} , d_{j} } \right) * A_{MD }^{u} \left( {m_{t} , d_{j} } \right)}}{{\deg A_{MD}^{u} \left( {d_{j} } \right)}}$$

(4)

where \({w}_{kt}\) is the contribution resource moved from *t*th node to *k*th node in *M*, and it can be understood as the similarity between miRNA *m*_{k} and miRNA *m*_{t} in *MD*^{update} graph. \(\mathit{deg}{A}_{MD}^{u}\left({m}_{k}\right)\) is the degree of miRNA *m*_{k} in *MD*^{update} graph and it represents the number of associated diseases for miRNA *m*_{k}. Similarly, \(\mathit{deg}{A}_{MD}^{u}\left({d}_{j}\right)\) is the degree of disease *d*_{j} in *MD*^{update} graph and it represents the number of associated miRNAs for disease *d*_{j}.
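Equation (4) can be evaluated for all miRNA pairs with two matrix operations. The following is a minimal sketch; the function name `resource_weights`, the zero-degree handling and the toy matrix are assumptions of the example.

```python
import numpy as np

def resource_weights(A_MD_u):
    """Weight matrix W of Eq. (4) from the updated miRNA x disease matrix."""
    A = np.asarray(A_MD_u, dtype=float)
    deg_m = A.sum(axis=1)   # number of diseases linked to each miRNA
    deg_d = A.sum(axis=0)   # number of miRNAs linked to each disease
    # Divide every column j by deg(d_j); columns with degree 0 stay 0.
    A_over_d = np.divide(A, deg_d, out=np.zeros_like(A), where=deg_d > 0)
    S = A_over_d @ A.T      # S[k, t] = sum_j A[k, j] * A[t, j] / deg(d_j)
    # Divide every row k by deg(m_k); rows with degree 0 stay 0.
    W = np.divide(S, deg_m[:, None], out=np.zeros_like(S), where=deg_m[:, None] > 0)
    return W

# Hypothetical toy matrix: 3 miRNAs x 2 diseases.
A_MD_u = np.array([[1, 1],
                   [1, 0],
                   [0, 1]])
W = resource_weights(A_MD_u)
```

A useful sanity check is that every row of *W* belonging to a miRNA with at least one associated disease sums to 1, which follows directly from Eq. (4).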

Following the previous study [20], we also modify the resource allocation algorithm by considering the level of consistency between the resource contributions transferred in the two directions. This captures the impact of the co-selection (*m*_{k}, *m*_{t}), that is, the agreement between the resource contributed from *m*_{k} to *m*_{t} and the resource contributed from *m*_{t} to *m*_{k}. The consistency-based resource allocation yields a final miRNA-disease weight matrix *W’* = {*w’*_{kt}}, defined by the following equation:

$$w_{kt}^{\prime} = w_{kt} + \frac{w_{tk}}{\sum\nolimits_{s = 1}^{n_{m}} w_{sk}}$$

(5)

From the combination of the final miRNA-disease weight matrix *W’* and the adjacency matrix *A*^{u}_{MD}, we define a final resource *Rscore_ondisease_1* located on *D* as follows:

$$Rscore\_ondisease\_1 = W^{{\prime }} *A^{u}_{MD}$$

(6)
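Equations (5) and (6) can be sketched as below. This is an illustration under stated assumptions: the weight matrix `W` is a hand-made toy example in the shape produced by Eq. (4), the function name `consistency_weights` is hypothetical, and a zero column sum is treated as contributing nothing.

```python
import numpy as np

def consistency_weights(W):
    """Eq. (5): w'_kt = w_kt + w_tk / sum_s w_sk."""
    W = np.asarray(W, dtype=float)
    col_sum = W.sum(axis=0)  # col_sum[k] = sum_s w_sk
    # Row k of W.T holds w_tk for all t; divide it by col_sum[k].
    back = np.divide(W.T, col_sum[:, None], out=np.zeros_like(W),
                     where=col_sum[:, None] > 0)
    return W + back

# Hypothetical toy inputs: 3 miRNAs, 2 diseases.
W = np.array([[0.5, 0.25, 0.25],
              [0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5]])
A_MD_u = np.array([[1.0, 1.0],
                   [1.0, 0.0],
                   [0.0, 1.0]])

W_prime = consistency_weights(W)
Rscore_ondisease_1 = W_prime @ A_MD_u  # Eq. (6): n_m x n_d score matrix
```

Since the added term is normalized by the column sum, each row of the correction sums to 1 whenever that column sum is non-zero.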

*Step 2*: Calculating resource allocation between diseases and lncRNAs

Following the resource allocation between genes and diseases in TPGLDA [20], the same initial resources located on the *M* nodes are allocated from the nodes in *M* to the nodes in *D* and then moved back, and the final resource matrix *Rscore_ondisease_2* located on the *D* nodes is given by:

$$Rscore\_ondisease\_2 = \mathop \sum \limits_{s = 1}^{{n_{l} }} \frac{{A_{DL }^{0} \left( {d_{j} , l_{s} } \right) }}{{\deg A_{DL}^{0} \left( {l_{s} } \right)}}*\mathop \sum \limits_{k = 1}^{{n_{m} }} \frac{{A_{MD}^{u} \left( { m_{k} , d_{j} } \right)}}{{\deg A_{DL}^{0} \left( {d_{j} } \right)}}$$

(7)

where \(\mathrm{deg}{A}_{DL}^{0}\left({l}_{i}\right)={\sum }_{j=1}^{{n}_{d}}{A}_{DL}^{0}({d}_{j}, {l}_{i})\) is the number of related diseases for lncRNA *l*_{i}, i.e., the degree of lncRNA *l*_{i} in the *DL*^{0} graph, and \(\mathrm{deg}{A}_{DL}^{0}\left({d}_{j}\right)={\sum }_{i=1}^{{n}_{l}}{A}_{DL}^{0}({d}_{j}, {l}_{i})\) is the number of related lncRNAs for disease *d*_{j}, i.e., the degree of disease *d*_{j} in the *DL*^{0} graph.

*Step 3*: Calculating the final resource score *Rscore_final* to infer the potential disease-related miRNAs

We calculate the final resource score *Rscore_final* which is used to measure latent disease-related miRNAs as follows:

$$Rscore\_final = \gamma * Rscore\_ondisease\_1 + \, \left( {1 - \gamma } \right) \, *Rscore\_ondisease\_2$$

(8)

where *γ* is a tunable parameter with value in [0, 1]. Our model achieves the best prediction performance when *γ* = 0.9.

### Ranking all candidate miRNAs’ Rscores for each disease in descending order

Finally, we sort all candidate miRNAs for each disease by *Rscore_final* in descending order, so that candidates with higher scores are considered more likely to be verified in the future.
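The combination of Eq. (8) and the per-disease ranking can be sketched as follows. Several things here are assumptions of the example: both score matrices are taken to have the shape miRNAs × diseases, a candidate is taken to be a miRNA with no known association to the disease in *A*^{0}_{MD}, and the function name `rank_candidates` and all toy values are hypothetical.

```python
import numpy as np

def rank_candidates(rscore_1, rscore_2, A0_MD, gamma=0.9):
    """Combine the two scores (Eq. 8) and rank candidate miRNAs per disease."""
    final = gamma * rscore_1 + (1 - gamma) * rscore_2
    rankings = {}
    for j in range(final.shape[1]):
        # Assumed: candidates are miRNAs without a known link to disease j.
        candidates = np.flatnonzero(A0_MD[:, j] == 0)
        order = np.argsort(-final[candidates, j], kind="stable")  # descending
        rankings[j] = candidates[order].tolist()
    return final, rankings

# Hypothetical toy scores: 3 miRNAs x 2 diseases.
rscore_1 = np.array([[0.2, 0.9], [0.8, 0.95], [0.5, 0.4]])
rscore_2 = np.array([[0.1, 0.3], [0.6, 0.2], [0.9, 0.7]])
A0_MD = np.array([[1, 0], [0, 0], [0, 1]])

Rscore_final, rankings = rank_candidates(rscore_1, rscore_2, A0_MD)
```

With *γ* = 0.9 the first score dominates, so in this toy case the second miRNA is ranked first for both diseases.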