Gene Ontology-based function prediction of long non-coding RNAs using bi-random walk
Jingpu Zhang, Shuai Zou and Lei Deng
https://doi.org/10.1186/s12920-018-0414-2
© The Author(s) 2018
- Published: 20 November 2018
Abstract
Background
With the development of sequencing technology, more and more long non-coding RNAs (lncRNAs) have been identified. Some lncRNAs have been confirmed to play important roles in development through dosage compensation, epigenetic regulation, regulation of cell differentiation and other mechanisms. However, the majority of lncRNAs have not been functionally characterized. Exploring the functions of lncRNAs and their regulatory networks has therefore become an active research topic.
Methods
In this work, a network-based model named BiRWLGO is developed to predict the probable functions of lncRNAs at large scale. The model starts by building a global network composed of three networks: an lncRNA similarity network, an lncRNA-protein association network and a protein-protein interaction (PPI) network. It then uses the bi-random walk algorithm to explore the similarities between lncRNAs and proteins. Finally, an lncRNA is annotated with Gene Ontology (GO) terms according to its neighboring proteins.
Results
We compare the performance of BiRWLGO with state-of-the-art models on a manually annotated lncRNA benchmark with known GO terms. The experimental results show that BiRWLGO outperforms the other methods in terms of both maximum F-measure (Fmax) and coverage.
Conclusions
BiRWLGO is a relatively efficient method for predicting the functions of lncRNAs. Integrating protein interaction data markedly improves its predictive performance.
Keywords
- lncRNA
- Function annotation
- Bi-random
Background
Sequencing of the entire human genome shows that only 1.5-2.0% of the genome codes for proteins. The remainder consists of large non-protein-coding regions, which include numerous transcriptional regulatory elements and non-coding RNA genes. Generally, non-coding RNAs are not capable of encoding proteins [1]. According to their length, non-coding RNAs are divided into long non-coding RNAs (lncRNAs) and small non-coding RNAs (sncRNAs). LncRNAs are more than 200 nt in length and highly conserved in their secondary and tertiary structures [2]. With the rapid development of high-throughput deep sequencing technology, more and more lncRNAs have been discovered in eukaryotes in recent years, and especially large numbers have been found in humans and mice [3, 4]. LncRNAs take part in many important regulatory processes, such as X chromosome silencing, genomic imprinting, chromatin modification, transcriptional activation, transcriptional interference and nuclear transport [5–7]. Many recent studies have reported that lncRNAs are closely related to the occurrence, development, diagnosis and treatment of disease [8, 9].
With the development of lncRNA research, large amounts of lncRNA-related data have emerged. To make better use of this information, many bioinformatics databases have been built. These databases contain structural, expression, interaction and other relevant information about lncRNAs. They play an important role in lncRNA research, and the data they curate support computational studies of lncRNAs. Some of these databases are briefly described as follows. NONCODE provides ncRNA-related information for 17 species. It includes not only basic information about lncRNAs, such as location, strand, exon number, length and sequence, but also advanced information such as expression profiles, conservation, predicted functions and disease relations [10]. LncRNAdb curates experimentally supported functional lncRNAs [11]; its entries are manually curated from the referenced literature. ChIPBase aims to explore the transcriptional regulatory networks of ncRNAs and protein-coding genes based on ChIP-Seq data [12]. lncRNome is a comprehensive, searchable, biologically oriented knowledgebase for human lncRNAs, which provides information including chromosomal locations, types, descriptions of biological functions and disease associations of lncRNAs [13]. LncRNADisease provides experimentally supported lncRNA-disease associations and contains approximately 480 high-quality entries [14]. Besides the databases mentioned above, there are a number of other lncRNA resources, such as GeneCards [15], lncRNASNP [16], lncRNAMap [17] and LncRNA2Target [18].
Although many databases providing a wide variety of information about lncRNAs have been developed, few of them focus on the functional annotation of lncRNAs. Therefore, the functional investigation of lncRNAs has attracted the attention of many biologists and bioinformaticians [19]. However, the sophisticated molecular regulatory mechanisms of lncRNAs remain an enigma, and there are still many obstacles to determining their functions. Biological experiments are the main means of identifying the functions of lncRNAs, but they are costly and time-consuming. In recent years, researchers have developed several computational methods to infer lncRNA functions [20]. Guo et al. [21] proposed a network-based approach, lnc-GFP, to annotate lncRNAs. In lnc-GFP, a bi-colored biological network is first built from co-expression data and protein interaction data, and lncRNAs are then annotated by running a global propagation algorithm on the bi-colored network. Jiang et al. [22] developed a method named LncRNA2Function, which uses the hypergeometric test to predict lncRNA functions. Recently, Zhang et al. [23] identified the neighboring protein-coding genes of each lncRNA according to the KATZ measure and predicted the functions of lncRNAs from their neighboring genes.
This work is motivated by the promising performance of bi-random walk in predicting disease-gene associations [24, 25] and protein functions [26]. In this work, a global network-based approach, BiRWLGO, is proposed to predict potential functions of lncRNAs at large scale. In BiRWLGO, a global network is built by integrating the lncRNA similarity network, the protein-protein interaction (PPI) network and lncRNA-protein associations. Then, the probability score of each lncRNA-protein pair is obtained by applying the bi-random walk algorithm to the global network. Finally, the functions of a query lncRNA are predicted according to its neighboring proteins. To evaluate the performance of the proposed model, an independent test is performed on 55 manually annotated lncRNAs with 129 GO terms. Furthermore, we compare the new model with three state-of-the-art models: lnc-GFP [21], LncRNA2Function [22] and KATZLGO [23]. The experimental results show that BiRWLGO achieves an Fmax of 0.345 and outperforms the other three models. Moreover, case studies also demonstrate the ability of BiRWLGO to predict the potential functions of lncRNAs.
Methods
LncRNA co-expression similarity
The expression profiles of lncRNAs are downloaded from the NONCODE 2016 database [10], which includes the expression profiles of 90062 lncRNAs in 24 human tissues or cell types. The co-expression similarity between two lncRNAs is measured by Pearson's correlation coefficient, and the lncRNA similarity network is built from the resulting scores.
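The similarity computation described above can be sketched as follows. This is a minimal illustration on toy data, not the authors' pipeline: the lncRNA names, the profile values and the choice to return 0 for a flat profile are all assumptions, and the source does not state whether a correlation threshold is applied when building the network.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a flat profile carries no co-expression signal
    return cov / (norm_x * norm_y)

def similarity_network(profiles):
    """Map each unordered lncRNA pair to its Pearson correlation."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }

# Toy expression profiles: three hypothetical lncRNAs in four tissues.
profiles = {
    "lnc1": [1.0, 2.0, 3.0, 4.0],
    "lnc2": [2.0, 4.0, 6.0, 8.0],  # rises with lnc1
    "lnc3": [4.0, 3.0, 2.0, 1.0],  # falls as lnc1 rises
}
network = similarity_network(profiles)
```

For the real data set (90062 lncRNAs), a vectorised routine over the full expression matrix would be preferable to the pairwise loop used here.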
Protein-protein interaction
The PPI data are obtained from STRING v10.0 [27], a database covering more than 2000 organisms. The interactions in the database are curated from high-throughput screening, computational prediction, and information retrieval.
LncRNA-protein associations
1. Co-expression data from COXPRESdb [29]
COXPRESdb reveals the relationships between co-expressed genes in animal species, e.g. human, mouse and fly [28]. From this database, we first extract three preprocessed co-expression datasets for human (Hsa.c4-1, Hsa2.c2-0 and Hsa3.c1-0), which include pre-calculated pairwise Pearson’s correlation coefficients (PCC). The correlations are combined according to the following formula:
$$ C(l,p)=1-\prod_{k=1}^{K}\left(1-C_{k}(l,p)\right) \quad \text{if}~C_{k}(l,p)>0 $$
Here, C(l,p) represents the overall correlation between lncRNA l and protein-coding gene p, Ck(l,p) represents the correlation score between l and p in dataset k, and K is the number of datasets in which l and p are positively correlated. Gene pairs with negative correlation scores are excluded.
2. Co-expression data from ArrayExpress [30] and GEO
The co-expression data are extracted from the study of Jiang et al. [22]. The raw RNA-Seq data for 19 normal human tissues are downloaded from ArrayExpress (accession no. E-MTAB-513) and GEO (accession no. GSE30554), respectively. Then, the expression levels of all human lncRNAs and protein-coding genes are calculated with TopHat and Cufflinks using the default parameters. The co-expression of lncRNA-protein pairs is evaluated by computing Pearson’s correlation coefficients.
3. LncRNA-protein interaction data from NPinter [31]
The known interactions between lncRNAs and proteins are obtained from NPinter v3.0, which contains 491416 experimentally verified interactions between ncRNAs and other biomolecules. The lncRNA-protein interaction pairs are then filtered by restricting the target organism to “Homo sapiens”. The interactions between lncRNAs and proteins can be represented by a binary matrix, each element of which indicates whether there is an interaction between an lncRNA and a protein.
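The COXPRESdb formula for C(l,p) combines per-dataset correlations so that each additional positive correlation raises the overall score. A small sketch under stated assumptions: the function name and the example scores are hypothetical, and a pair with no positive correlation receives 0 here because the empty product equals 1.

```python
from math import prod

def combined_correlation(scores):
    """C(l,p) = 1 - prod_k (1 - C_k(l,p)), over datasets with C_k(l,p) > 0."""
    positive = [c for c in scores if c > 0]  # negative correlations are excluded
    return 1.0 - prod(1.0 - c for c in positive)

# Hypothetical PCC values for one lncRNA-gene pair in three datasets.
combined = combined_correlation([0.5, 0.4, -0.2])  # 1 - (0.5 * 0.6)
```

In this example the two positive scores 0.5 and 0.4 combine to about 0.7, which is higher than either score alone, while the negative score is ignored.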
The Gene Ontology annotation
So far, the functions of lncRNAs have not been manually annotated. Hence, in our study, lncRNAs are indirectly annotated according to the existing annotations of proteins. The proteins and their annotations are obtained from the Gene Ontology Annotation (GOA) database [32].
The BiRWLGO method
Figure: CBGs with different lengths in the lncRNA-protein association network
The proposed bi-random walk approach discovers lncRNA-protein correlations by capturing circular bigraph (CBG) patterns in the lncRNA similarity network and the protein interaction network. In the algorithm, the degree of correlation between an lncRNA and a protein is evaluated by its distance to the other associations in the lncRNA similarity network and the protein interaction network. Hence, bi-random walk is a global method for constructing the association map.
Figure: Flowchart of BiRWLGO. It includes three steps: a) build the global network; b) run the bi-random walk algorithm on the global network; c) annotate lncRNAs with GO terms according to their high-ranked neighboring protein-coding genes
Here, α refers to the decay factor. RL and RP refer to the correlations between lncRNAs and proteins based on the walk on these two networks respectively. Theoretically, the iterative process on two networks could converge to a unique solution and the probability in steady state is defined as the correlation score between an lncRNA and a protein. The algorithm is outlined as Algorithm 1.
In Algorithm 1, DL and DP are both diagonal matrices with diagonal elements \(D_{L}(i,i)=\sum_{j} L_{ij}\) and \(D_{P}(i,i)=\sum_{j} P_{ij}\), respectively. The result of sum(A) is a vector whose i-th entry is \(\sum_{j} A_{ij}\). The algorithm ends when it reaches the maximum number of iterations. Finally, the association probability score matrix Rt is obtained, which represents the relevance probabilities of all lncRNA-protein pairs.
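For readers who want to experiment, the sketch below follows the general bi-random walk scheme of Xie et al. [34] rather than reproducing Algorithm 1. Several choices are assumptions: the symmetric normalisation D^{-1/2} W D^{-1/2}, the row-wise normalisation of A by sum(A), the averaging of the two walks, and the default values of alpha, l and r. The toy matrices are hypothetical.

```python
import numpy as np

def symmetric_normalize(W):
    """Return D^{-1/2} W D^{-1/2}, where D(i, i) = sum_j W[i, j]."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]

def bi_random_walk(L, P, A, alpha=0.8, l=2, r=2):
    """Score lncRNA-protein pairs by walking on both networks.

    L: lncRNA similarity matrix, P: PPI matrix, A: known associations.
    alpha is the decay factor; l and r cap the steps on each network.
    """
    L_norm = symmetric_normalize(L)
    P_norm = symmetric_normalize(P)
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    # Assumption: each row of A is divided by its sum; empty rows stay zero.
    A0 = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)
    R = A0.copy()
    for t in range(1, max(l, r) + 1):
        walks = []
        if t <= l:  # step on the lncRNA similarity network
            walks.append(alpha * L_norm @ R + (1 - alpha) * A0)
        if t <= r:  # step on the protein interaction network
            walks.append(alpha * R @ P_norm + (1 - alpha) * A0)
        R = sum(walks) / len(walks)  # average the walks taken at this step
    return R

# Toy network: 2 lncRNAs, 3 proteins, one known association per lncRNA.
L = [[0, 1], [1, 0]]
P = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
A = [[1, 0, 0], [0, 0, 1]]
scores = bi_random_walk(L, P, A)
```

In the toy network, the first lncRNA has no known association with the middle protein, yet it receives a positive score because that protein interacts with a protein the lncRNA is already associated with.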
Results
Benchmarks
Since a gold-standard dataset of human lncRNA functions has not been established, we first manually annotate 55 lncRNAs with 129 GO terms as the independent test set (lncRNA2GO-55). In lncRNA2GO-55, the lncRNAs are functionally described based on the results of knockdown or overexpression experiments. The annotations include reference information about the lncRNAs, such as sequences, structures, genomic context, expression, subcellular localization, conservation and functional evidence. The dataset is presented in Additional file 1.
Evaluation measures
Coverage is also employed to evaluate these methods. It is defined as the ratio of the number of lncRNAs that are correctly annotated with GO terms to the total number of lncRNAs.
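The two measures can be sketched as follows. The Fmax routine uses the common threshold-sweeping definition (precision averaged over lncRNAs with at least one prediction, recall averaged over all benchmark lncRNAs), which is an assumption about the exact formula used here. Coverage counts an lncRNA as correctly annotated when at least one predicted term is in its benchmark annotation. All identifiers and scores are hypothetical toy values.

```python
def f_max(predictions, truth, thresholds):
    """predictions: {lncRNA: {term: score}}; truth: {lncRNA: non-empty set of terms}."""
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for lnc, true_terms in truth.items():
            predicted = {go for go, s in predictions.get(lnc, {}).items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:  # precision is only defined when something is predicted
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best

def coverage(predictions, truth):
    """Fraction of lncRNAs with at least one correctly predicted term."""
    covered = sum(
        1 for lnc, terms in truth.items() if terms & set(predictions.get(lnc, {}))
    )
    return covered / len(truth)

# Toy benchmark with placeholder term names.
predictions = {"lncA": {"term_a": 0.9, "term_b": 0.4}, "lncB": {"term_c": 0.7}}
truth = {"lncA": {"term_a"}, "lncB": {"term_d"}}
thresholds = [i / 10 for i in range(1, 10)]
fmax_value = f_max(predictions, truth, thresholds)
coverage_value = coverage(predictions, truth)
```

As a consistency check on the reported figures, the harmonic mean of the stated precision (0.251) and recall (0.552) is 2 × 0.251 × 0.552 / (0.251 + 0.552) ≈ 0.345, which matches the reported Fmax.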
Parameter selection
There are four parameters (α, l, r and N) to be tuned in BiRWLGO. The parameter α is the decay factor, which dampens the importance of a CBG as its path grows longer. The parameters l and r limit the number of random walk steps in the lncRNA similarity network and the protein interaction network, respectively. A specific lncRNA is annotated according to the GO terms of its top N neighboring proteins, ranked by their scores in Rt in descending order. Therefore, N may affect the functional annotations of lncRNAs.
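The role of N can be illustrated with a short sketch of the annotation step. The source does not specify how GO terms from the top N proteins are ranked, so summing the association scores of the proteins that carry each term is an assumption; the protein names, term names and scores are hypothetical.

```python
from collections import defaultdict

def annotate_lncrna(scores, protein_go, n):
    """Rank GO terms for one lncRNA from its top-n neighboring proteins.

    scores: {protein: association score} (one row of the score matrix)
    protein_go: {protein: set of GO terms}
    """
    top = sorted(scores, key=scores.get, reverse=True)[:n]
    go_scores = defaultdict(float)
    for protein in top:
        for term in protein_go.get(protein, ()):
            go_scores[term] += scores[protein]  # assumption: sum of scores
    return sorted(go_scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example with placeholder proteins and terms.
scores = {"P1": 0.9, "P2": 0.5, "P3": 0.1}
protein_go = {"P1": {"term_x"}, "P2": {"term_x", "term_y"}, "P3": {"term_z"}}
ranked = annotate_lncrna(scores, protein_go, n=2)
```

With n=2, the lowest-scoring protein P3 is excluded, so its term never reaches the annotation; a larger N admits more terms at the cost of weaker evidence for each.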
Figure: The values of Fmax when varying N from 20 to 80. The predictive performance of BiRWLGO is sensitive to the choice of N, and Fmax reaches its maximum when N equals 47
The Fmax values when α ranges over [0.2, 0.9]
α | Fmax |
---|---|
0.2 | 0.294 |
0.3 | 0.305 |
0.4 | 0.298 |
0.5 | 0.301 |
0.6 | 0.299 |
0.7 | 0.300 |
0.8 | 0.319 |
0.9 | 0.315 |
The effects of protein interaction data
Figure: The Fmax scores when BiRWLGO is tested on three different network configurations
Performances
Methods for investigating lncRNA functions are commonly based on ‘guilt-by-association’ from co-expression patterns, namely that lncRNAs share similar functions with their co-expressed protein-coding counterparts [37]. Among these methods, lnc-GFP aims to annotate the potential functions of lncRNAs at large scale. It builds a coding-non-coding bi-colored biological network from gene expression profiles and PPI data, and then runs a global propagation algorithm on the network to predict the possible functions of unannotated lncRNAs [21]. LncRNA2Function is a statistical approach that predicts functions of interest from the correlation between the expression of lncRNAs and that of protein-coding genes, using the hypergeometric test [22]. Recently, Zhang et al. developed a global method, KATZLGO, which achieves large-scale prediction of lncRNA functions by integrating multiple biological networks. In KATZLGO, a query lncRNA is annotated according to the GO terms of its neighboring proteins, and the associations between the lncRNA and proteins are calculated with the KATZ measure [23].
Figure: Performance comparison with other methods
The numbers of lncRNAs correctly annotated by different methods
Methods | Unannotated | Annotated |
---|---|---|
lnc-GFP | 22 | 23 |
lncRNA2Function | 37 | 18 |
KATZLGO | 10 | 45 |
BiRWLGO | 8 | 47 |
Case studies
To illustrate the ability of BiRWLGO to infer the potential functions of lncRNAs, we performed case studies in this section. The functions of each selected lncRNA were confirmed by the literature.
Case study 1: GHET1. GHET1, gastric carcinoma high expressed transcript 1, is located in an intergenic region on chromosome 7. Yang et al. [38] investigated the biological function of GHET1 in gastric carcinoma. Their results demonstrated that GHET1 promotes gastric carcinoma cell proliferation; specifically, it increases the stability of c-Myc mRNA and up-regulates its expression. In clinical analysis, the GHET1 gene and protein expression levels were significantly higher in gastric cancer tissues than in adjacent tissues. In cell experiments, down-regulation of GHET1 suppressed cell proliferation, invasion and migration, and enhanced cell apoptosis and G1 phase [39].
The top 20 predicted GO biological process terms for lncRNA GHET1 by BiRWLGO
ID | GO term | GO name |
---|---|---|
1 | GO:0070934 | CRD-mediated mRNA stabilization |
2 | GO:0006417 | Regulation of translation |
3 | GO:0006810 | Transport |
4 | GO:0017148 | Negative regulation of translation |
5 | GO:0010467 | Gene expression |
6 | GO:0051028 | mRNA transport |
7 | GO:0042035 | Regulation of cytokine biosynthetic process |
8 | GO:0097150 | Neuronal stem cell population maintenance |
9 | GO:0010610 | Regulation of mRNA stability involved in response to stress |
10 | GO:0006403 | RNA localization |
11 | GO:0022013 | Pallium cell proliferation in forebrain |
12 | GO:0006355 | Regulation of transcription, DNA-templated |
13 | GO:0008380 | RNA splicing |
14 | GO:0006397 | mRNA processing |
15 | GO:0000398 | mRNA splicing, via spliceosome |
16 | GO:0007165 | Signal transduction |
17 | GO:0042981 | Regulation of apoptotic process |
18 | GO:0045944 | Positive regulation of transcription from RNA polymerase II promoter |
19 | GO:0010628 | Positive regulation of gene expression |
20 | GO:0001501 | Skeletal system development |
Case study 2: HOTAIRM1. HOTAIRM1 is located between the HOXA1 and HOXA2 genes and is expressed specifically in cells of the myeloid lineage [40]. It plays a regulatory role in myeloid maturation by modulating integrin-controlled cell cycle progression at the gene expression level [41]. In the study of Wan et al. [42], HOTAIRM1 expression was drastically reduced in colorectal cancer tissues compared with matched normal tissues. Moreover, knockdown of HOTAIRM1 promoted colorectal cell proliferation, whereas over-expression of HOTAIRM1 repressed cell proliferation. This indicates that HOTAIRM1 acts as a tumour suppressor in colorectal cancer. Xin et al. [43] demonstrated that HOTAIRM1 competitively binds to miR-3960 and thereby regulates hematopoiesis, which revealed a novel regulatory mechanism of lncRNA function.
The top 20 predicted GO biological process terms for lncRNA HOTAIRM1 by BiRWLGO
ID | GO term | GO name |
---|---|---|
1 | GO:0006355 | Regulation of transcription, DNA-templated |
2 | GO:0006351 | Transcription, DNA-templated |
3 | GO:0007049 | Cell cycle |
4 | GO:0006397 | mRNA processing |
5 | GO:0008380 | RNA splicing |
6 | GO:0045892 | Negative regulation of transcription, DNA-templated |
7 | GO:0045893 | Positive regulation of transcription, DNA-templated |
8 | GO:0006810 | Transport |
9 | GO:0051260 | Protein homooligomerization |
10 | GO:0016032 | Viral process |
11 | GO:0000398 | mRNA splicing, via spliceosome |
12 | GO:0006366 | Transcription from RNA polymerase II promoter |
13 | GO:0030154 | Cell differentiation |
14 | GO:0045087 | Innate immune response |
15 | GO:0002376 | Immune system process |
16 | GO:0007165 | Signal transduction |
17 | GO:0000122 | Negative regulation of transcription from RNA polymerase II promoter |
18 | GO:0045944 | Positive regulation of transcription from RNA polymerase II promoter |
19 | GO:0006974 | Cellular response to DNA damage stimulus |
20 | GO:0001525 | Angiogenesis |
Discussion and conclusion
Although a large number of lncRNAs have been discovered over the past decades, only a few of them have been functionally described in detail. Because of the lack of conservation and understanding of lncRNAs, it is hard to predict their functions. In this paper, a global network-based strategy, BiRWLGO, is proposed to annotate the potential functions of lncRNAs at large scale. First, we build a global heterogeneous network from gene expression data, lncRNA-protein associations and protein-protein interactions. Then, to obtain the neighboring proteins of each lncRNA, we apply the bi-random walk algorithm to the global heterogeneous network. Finally, a specific lncRNA is annotated with GO terms according to its neighboring proteins. In terms of predictive performance, BiRWLGO performs well on the independent dataset lncRNA2GO-55, achieving the best Fmax score of 0.345. The values of recall and precision are 0.552 and 0.251, respectively. As for coverage, 47 of the 55 manually curated lncRNAs are correctly predicted with at least one GO term. Moreover, the experimental results show that integrating protein-protein interaction data improves the performance of function prediction for lncRNAs.
In the future, BiRWLGO can be improved in the following respects. First, the gene expression data are incomplete and their reliability needs to be improved; incorporating more reliable expression data would contribute to the functional annotation of lncRNAs. Second, besides the interactions between lncRNAs and proteins, integrating more reliable interactions between lncRNAs and other molecules (e.g. microRNAs) may improve the performance of BiRWLGO. Third, GO terms are organized as a directed acyclic graph hierarchy, so exploiting the relations among GO terms could increase the power of prediction.
Declarations
Acknowledgements
This work was supported by the National Natural Science Foundation of China under grant No. 61672541, and the Natural Science Foundation of Hunan Province under grant No. 2017JJ3287.
Funding
Publication costs were funded by National Natural Science Foundation of China grant No. 61672541.
Availability of data and materials
The datasets used in this study are included in the supplementary files.
About this supplement
This article has been published as part of BMC Medical Genomics Volume 11 Supplement 5, 2018: Selected articles from the IEEE BIBM International Conference on Bioinformatics & Biomedicine (BIBM) 2017: medical genomics. The full contents of the supplement are available online at https://bmcmedgenomics.biomedcentral.com/articles/supplements/volume-11-supplement-5.
Authors’ contributions
JPZ designed the study and conducted experiments. SZ and JPZ drafted the manuscript. LD and JPZ performed statistical analyses. LD prepared the experimental materials and benchmarks. LD conceived the study and helped to draft the manuscript. The authors declare that they have no conflict of interest. All authors have read and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
References
1. Spizzo R, Almeida MI, Colombatti A, Calin GA. Long non-coding RNAs and cancer: a new frontier of translational research? Oncogene. 2012; 31(43):4577–87.
2. Alexander RP, Fang G, Rozowsky J, Snyder M, Gerstein MB. Annotating non-coding regions of the genome. Nat Rev Genet. 2010; 11(8):559–71.
3. Yue F, Cheng Y, Breschi A, Vierstra J, Wu W, Ryba T, Sandstrom R, Ma Z, Davis C, Pope BD. A comparative encyclopedia of DNA elements in the mouse genome. Nature. 2014; 515(7527):355.
4. Nam J-W, Bartel DP. Long noncoding RNAs in C. elegans. Genome Res. 2012; 22(12):2529–40.
5. Morris KV, Mattick JS. The rise of regulatory RNA. Nat Rev Genet. 2014; 15(6):423.
6. Hirose T, Mishima Y, Tomari Y. Elements and machinery of non-coding RNAs: toward their taxonomy. EMBO Reports. 2014; 15(5):489–507.
7. Turner M, Galloway A, Vigorito E. Noncoding RNA and its associated proteins as regulatory elements of the immune system. Nat Immunol. 2014; 15(6):484–91.
8. Wapinski O, Chang HY. Long noncoding RNAs and human disease. Trends Cell Biol. 2011; 21(6):354–61.
9. Zhang J, Zhang Z, Chen Z, Deng L. Integrating multiple heterogeneous networks for novel lncRNA-disease association inference. IEEE/ACM Trans Comput Biol Bioinforma. 2017. https://doi.org/10.1109/TCBB.2017.2701379.
10. Yi Z, Hui L, Fang S, Yue K, Wei W, Hao Y, Li Z, Bu D, Sun N, Zhang MQ. NONCODE 2016: an informative and valuable data source of long non-coding RNAs. Nucleic Acids Res. 2016; 44(Database issue):203–8.
11. Xiu CQ, Thomson DW, Maag JLV, Bartonicek N, Signal B, Clark MB, Gloss BS, Dinger ME. lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs. Nucleic Acids Res. 2015; 43(Database issue):168.
12. Zhou KR, Liu S, Sun WJ, Zheng L, Zhou H, Yang JH, Qu LH. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data. Nucleic Acids Res. 2017; 45(Database issue):43–50.
13. Bhartiya D, Pal K, Ghosh S, Kapoor S, Jalali S, Panwar B, Jain S, Sati S, Sengupta S, Sachidanandan C. lncRNome: a comprehensive knowledgebase of human long noncoding RNAs. Database. 2013; 2013(14):034.
14. Chen G, Wang Z, Wang D, Qiu C, Liu M, Chen X, Zhang Q, Yan G, Cui Q. LncRNADisease: a database for long-non-coding RNA-associated diseases. Nucleic Acids Res. 2013; 41(Database issue):983–6.
15. Belinky F, Bahir I, Stelzer G, Zimmerman S, Rosen N, Nativ N, Dalah I, Iny Stein T, Rappaport N, Mituyama T. Non-redundant compendium of human ncRNA genes in GeneCards. Bioinformatics. 2013; 29(2):255–61.
16. Miao YR, Liu W, Zhang Q, Guo AY. lncRNASNP2: an updated database of functional SNPs and mutations in human and mouse lncRNAs. Nucleic Acids Res. 2018; 46(Database issue):276–80.
17. Chan WL, Huang H, Chang JG. lncRNAMap: a map of putative regulatory functions in the long non-coding transcriptome. Comput Biol Chem. 2014; 50:41.
18. Jiang Q, Wang J, Wu X, Ma R, Zhang T, Jin S, Han Z, Tan R, Peng J, Liu G. LncRNA2Target: a database for differentially expressed genes after lncRNA knockdown or overexpression. Nucleic Acids Res. 2015; 43(Database issue):193–6.
19. Yun X, Zhang J, Lei D. Prediction of lncRNA-protein interactions using HeteSim scores based on heterogeneous networks. Sci Rep. 2017; 7(1):3664.
20. Zhang J, Zhang Z, Wang Z, Liu Y, Deng L. Ontological function annotation of long non-coding RNAs through hierarchical multi-label classification. Bioinformatics. 2018; 34(10):1750–7.
21. Guo X, Gao L, Liao Q, Xiao H, Ma X, Yang X, Luo H, Zhao G, Bu D, Jiao F. Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks. Nucleic Acids Res. 2013; 41(2):35.
22. Jiang Q, Ma R, Wang J, Wu X, Jin S, Peng J, Tan R, Zhang T, Li Y, Wang Y. LncRNA2Function: a comprehensive resource for functional investigation of human lncRNAs based on RNA-seq data. BMC Genomics. 2015; 16(S3):2.
23. Zhang Z, Zhang J, Chao F, Tang Y, Lei D. KATZLGO: large-scale prediction of lncRNA functions by using the KATZ measure based on multiple networks. IEEE/ACM Trans Comput Biol Bioinforma. 2017. https://doi.org/10.1109/TCBB.2017.2704587.
24. Luo J, Qiu X. A novel approach for predicting microRNA-disease associations by unbalanced bi-random walk on heterogeneous network. J Biomed Inform. 2017; 66:194–203.
25. Xie M, Xu YJ, Zhang YG, Hwang T, Kuang R. Network-based phenome-genome association prediction by bi-random walk. PLoS ONE. 2015; 10(5):0125138.
26. Peng W, Li M, Chen L, Wang L. Predicting protein functions by using unbalanced random walk algorithm on three biological networks. IEEE/ACM Trans Comput Biol Bioinforma. 2017; 14(2):360–9.
27. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015; 43(Database issue):447.
28. Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 2012; 22(9):1775–89.
29. Okamura Y, Aoki Y, Obayashi T, Shu T, Ito S, Narise T, Kinoshita K. COXPRESdb in 2015: coexpression database for animal species by DNA-microarray and RNAseq-based expression data with multiple quality assessment systems. Nucleic Acids Res. 2015; 43(Database issue):82–6.
30. Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, Coulson R, Farne A, Lara GG, Holloway E, Kapushesky M. ArrayExpress: a public repository for microarray gene expression data at the EBI. Nucleic Acids Res. 2003; 31(1):68–71.
31. Hao Y, Wu W, Li H, Yuan J, Luo J, Zhao Y, Chen R. NPInter v3.0: an upgraded database of noncoding RNA-associated interactions. Database J Biol Databases Curation. 2016; 2016:057.
32. Huntley R, Dimmer E, Barrell D, Binns D, Apweiler R. The Gene Ontology Annotation (GOA) database. Nat Precedings. 2009; 10:429–38.
33. Wu X, Liu Q, Jiang R. Align human interactome with phenome to identify causative genes and networks underlying disease families. Bioinformatics. 2009; 25(1):98–104.
34. Xie M, Hwang T, Kuang R. Prioritizing disease genes by bi-random walk. In: Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining. Berlin Heidelberg: Springer: 2012. p. 292–303.
35. Deng L, Chen Z. An integrated framework for functional annotation of protein structural domains. IEEE/ACM Trans Comput Biol Bioinforma. 2015; 12(4):902–13.
36. Luo H, Wang J, Li M, Luo J, Peng X, Wu FX, Pan Y. Drug repositioning based on comprehensive similarity measures and bi-random walk algorithm. Bioinformatics. 2016; 32(17):2664–71.
37. Cabili MN, Trapnell C, Goff L, Koziol M, Tazonvega B, Regev A, Rinn JL. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011; 25(18):1915.
38. Yang F, Xue X, Zheng L, Bi J, Zhou Y, Zhi K, Gu Y, Fang G. Long non-coding RNA GHET1 promotes gastric carcinoma cell proliferation by increasing c-Myc mRNA stability. FEBS J. 2014; 281(3):802–13.
39. Huang H, Liao W, Zhu X, Liu H, Cai L. Knockdown of long noncoding RNA GHET1 inhibits cell activation of gastric cancer. Biomed Pharmacother. 2017; 92:562.
40. Zhang X, Lian Z, Padden C, Gerstein MB, Rozowsky J, Snyder M, Gingeras TR, Kapranov P, Weissman SM, Newburger PE. A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster. Blood. 2009; 113(11):2526–34.
41. Zhang X, Weissman SM, Newburger PE. Long intergenic non-coding RNA HOTAIRM1 regulates cell cycle progression during myeloid maturation in NB4 human promyelocytic leukemia cells. RNA Biol. 2014; 11(6):777–87.
42. Wan L, Kong J, Tang J, Wu Y, Xu E, Lai M, Zhang H. HOTAIRM1 as a potential biomarker for diagnosis of colorectal cancer functions the role in the tumour suppressor. J Cell Mol Med. 2016; 20(11):2036.
43. Xin J, Jing L, Yue F, Wang L, Yuan Z, Yang R. Downregulation of long noncoding RNA HOTAIRM1 promotes monocyte/dendritic cell differentiation through competitively binding to endogenous miR-3960. Oncotargets Ther. 2017; 10:1307–15.