Briefings in Bioinformatics, 2017, 1?12 doi: 10.1093/bib/bbx137 Paper Systematic review of computational methods for identifying miRNA-mediated RNA-RNA crosstalk Yongsheng Li,* Xiyun Jin,* Zishan Wang, Lili Li, Hong Chen, Xiaoyu Lin, Song Yi, Yunpeng Zhang and Juan Xu Corresponding authors: Juan Xu, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China. Tel.: �-86667543; Fax: �-86615922; E-mail: [email protected]; Yunpeng Zhang, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China. E-mail: [email protected]; Song Yi, Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA. E-mail: [email protected] *These authors contributed equally to this work. Abstract Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA) networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel layer of gene regulation that controls both physiological and pathological processes such as development and complex diseases. The rapidly expanding catalogue of ceRNA regulation has provided evidence for exploitation as a general model to predict the ceRNAs in silico. In this article, we first reviewed the current progress of RNA-RNA crosstalk in human complex diseases. Then, the widely used computational methods for modeling ceRNA-ceRNA interaction networks are further summarized into five types: two types of global ceRNA regulation prediction methods and three types of context-specific prediction methods, which are based on miRNA-messenger RNA regulation alone, or by integrating heterogeneous data, respectively. To provide guidance in the computational prediction of ceRNA-ceRNA interactions, we finally performed a comparative study of different combinations of miRNA?target methods as well as five types of ceRNA identification methods by using literature-curated ceRNA regulation and gene perturbation. The results revealed that integration of different miRNA?target prediction methods and context-specific miRNA/gene expression profiles increased the performance for identifying ceRNA regulation. Moreover, different computational methods were complementary in identifying ceRNA regulation and captured different functional parts of similar pathways. We believe that the application of these computational techniques provides valuable functional insights into ceRNA regulation and is a crucial step for informing subsequent functional validation studies. Yongsheng Li is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China and Department of Systems Biology, University of Texas MD Anderson Cancer Center, USA. His research interests focus on ncRNA regulation and bioinformatics methods development. Xiyun Jin is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on ncRNA regulation. Zishan Wang is a PhD student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. His research interests focus on method development. Lili Li is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on bioinformatics methods. Hong Chen is a PhD student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on ncRNA regulation. Xiaoyu Lin is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on computational biology. Song Yi is an associate professor in the Department of Systems Biology, University of Texas MD Anderson Cancer Center, USA. His research interests focus on computational system biology in human diseases. Yunpeng Zhang is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China. His research interests focus on computational system biology and ncRNA regulation in human diseases. Juan Xu is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research activity focused on ncRNA regulation in complex diseases. Submitted: 27 August 2017; Received (in revised form): 19 September 2017 C The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected] V 1 2 | Li et al. Key words: miRNA?target identification; ceRNA regulation; computational methods; ensemble method; crosstalk Introduction MicroRNAs (miRNAs) are endogenous small noncoding RNAs that regulate gene expression by binding to the RNA target transcripts [1, 2]. Emerging knowledge exploring the functions of miRNAs in the regulation of gene expression has dramatically altered our view of how target genes are regulated. Our knowledge of miRNA functions has been greatly expanded by recent advances in next-generation sequencing, including genomewide miRNA?gene regulation identification , the application of RNA sequencing , the increasing availability of paired miRNA and gene expression data in various types of complex diseases (such as TCGA  and ICGC  projects), the recognition of miRNA-miRNA synergistic regulation and competing endogenous RNAs (ceRNAs) regulation mediated by miRNAs. In general, mounting evidence has indicated that 60% of protein-coding genes are regulated by miRNAs [7, 8]. Moreover, genes can also be regulated by more than one miRNA. The functional complexity of miRNAs and cooperative regulation among miRNAs challenged our ability to comprehensively understand the functions of miRNAs in complex diseases . In addition to protein-coding genes, an increasing number of other types of RNAs were found to be regulated by miRNAs (Figure 1A), such as long noncoding RNAs (lncRNAs), expressed 30 -untranslated regions (UTRs), pseudogenes and circular RNAs (circRNAs). These complex miRNA-RNA interactions formed the miRNA-RNA regulatory networks. A comprehensive understanding of miRNA functions in complex diseases will be further aided through analysis of the structure of miRNA-RNA regulatory networks [10?12]. In this review, we first reviewed the miRNA synergistic regulation and ceRNA regulation in complex diseases, and we highlighted the ceRNA regulatory network-based methods for our understanding of miRNA functions. We also summarized recent computational methods used for identification of miRNA-mediated ceRNA regulation in complex diseases. Based on literature-curated PTEN-related ceRNA regulation, we systematically compared these methods and give some directions for choosing the suitable methods. MiRNA-miRNA crosstalk in complex diseases The majority of genes are targeted by more than one miRNA. In addition, approximately two-thirds of miRNAs are encoded in polycistronic clusters . These miRNAs were cotranscribed with the cluster partners, indicating that these miRNAs were cooperative functional units, and they function collectively. Such coexpressed miRNAs have a tendency to target the same genes or regulate different genes in the same functional pathway [14, 15]. These results reinforce the miRNA-miRNA crosstalk in complex diseases. However, the application of experimental methods to identify such miRNA cooperation must address many bottlenecks, such as lengthy experimental periods, the requirement for large amounts of equipment and a high number of miRNA combinations. Thus, emerging computational methods have been proposed to identify the global or context-specific miRNA-miRNA crosstalk (Figure 1A). Recently, we reviewed these methods and found that these global methods were mainly based on genomic sequence information, chromatin interaction, miRNA coregulation and coregulation of function modules or similarity of associated diseases (phenomic similarity) [9, 16]. Given that miRNA synergistic regulation is often reprogrammed in different tissues, or different development stages even within the same tissues , contextspecific miRNA-miRNA networks may provide a better representation. Context-specific miRNA synergistic regulation was mainly based on paired miRNA and gene expression profiles . With the increase in omics data set of complex diseases, it is envisioned that our understanding of miRNA synergistic regulation will be greatly enhanced. MiRNA-mediated ceRNA-ceRNA crosstalk in complex diseases In addition to the conventional miRNA-RNA regulation, increasing studies have shown that regulation among the miRNA seed region and messenger RNA (mRNA) is not unidirectional, but that the pool of RNAs can crosstalk with each other through competing for miRNA binding [18?20]. These ceRNAs act as molecular sponges for a miRNA through their miRNA response elements (MRE), thereby regulating other target genes of the respective miRNAs. Understanding this novel type of RNA crosstalk will lead to significant insights into regulatory networks and have implications in human cancer development and other complex diseases [21, 22]. The ceRNA hypothesis has gained substantial attention and various types of RNAs, including lncRNAs, pseudogenes and circRNAs, as well as mRNAs were demonstrated to be as ceRNA molecules (Figure 1A). Pseudogene PTENP1 had also been demonstrated to regulate the expression of its cognate gene PTEN by competing miRNAs . The lncRNA (linc-MD1) being a miRNA sponge was first demonstrated in muscle differentiation . Moreover, linc-RoR was shown to function as a miRNA sponge to prevent OCT4, SOX2 and NANOG by competing miR-145 . It was also shown that pseudogene KRAS1P can function via a miRNA sponge mechanism . CircRNAs are another type of ncRNA ceRNAs found by researchers recently, and increasing circRNAs (such as CDR1as) were validated as miRNA sponge . An increasing number of studies have attempted to explore ceRNA-ceRNA regulation in specific cancer type. However, the majority of these studies focused on the properties of individual ceRNA interaction in a specific cancer type, and lack of a global view of system-level properties of ceRNA regulation across cancer types. To address these needs, we performed a systematic analysis of 5203 cancer samples from 20 cancer types to discover mRNA-related ceRNA regulation . This study highlights the conserved features shared by pan-cancer and higher similarity within cell types of similar origins. We also found a marked rewiring in the ceRNA program between various cancers and cancer subtypes , and further revealed conserved and rewired network ceRNA hubs in cancer. Moreover, Wang et al.  systematically identified 5119 functional lncRNAassociated triplets (lncACTs) through an integrated pipeline with which a comprehensive lncACT crosstalk network was constructed. In addition, Chiu et al.  introduced a method for simultaneous prediction of miRNA?target interactions and miRNA-mediated ceRNA interactions in breast cancer. Zhou et al.  also identified breast cancer-specific ceRNA networks by integration of miRNA regulation and paired miRNA/mRNA expression data. Le et al.  also summarized several computational methods for identifying ceRNA regulation in complex Identifying RNA-RNA crosstalk | 3 Figure 1. MiRNA-mediated RNA-RNA crosstalk. (A) The ceRNA model involves different types of RNAs, including coding genes, expressed 30 -UTRs, pseudogenes, lncRNAs and circRNAs. MiRNA-miRNA crosstalk and ceRNA-ceRNA crosstalk play critical roles in human diseases. MiRNA-miRNA synergistic regulation was identified by coregulation, functional and disease similarity. CeRNA-ceRNA regulation was usually identified based on two principles: regulatory and expression similarity. (B) The flowchart commonly used to identify the ceRNA regulation. First, miRNA?target interactions were determined by integration of computational methods and AGOCLIP-Seq data sets. Second, RNA pairs that coregulated by miRNAs were identified by ratio or hypergeometric test. Third, the expression similarity was evaluated based on genome-wide expression profiles. (C) The aim of this review is to evaluate the performance of different miRNA?gene interaction prediction methods and ceRNA regulation identification methods based on literature-curated ceRNAs. disease. All these studies have suggested that research into miRNA sponges is emerging, and it is an interesting and important topic for understanding miRNA functions by identifying ceRNA regulation in complex diseases. MiRNA-mediated ceRNA-ceRNA dysregulation network in complex diseases CeRNA interaction activity might change between normal and cancer samples, such as some ceRNA interactions showed competing activities in normal but not cancer samples, and vice versa. In addition, some ceRNA interactions showed competing activities in both normal and cancer context, but with opposite expression patterns. In addition to investigating ceRNA regulation in cancer, exploring the differential ceRNA regulation that were deregulated in cancer compared with normal conditions helps us to systematically understand the mechanism of cancer. Shao et al.  have proposed a computational method to systematically identify genome-wide dysregulated ceRNAceRNA interactions in lung cancer. Paci et al.  also proposed a computational approach to explore the lncRNA-mRNA ceRNA interactions in normal and breast cancer samples. Their results highlights a marked rewiring of the ceRNA interactions between normal and cancer samples, which were documented by its ?on/ off? switch. Based on mutually exclusive activation, the lncRNA PVT1 was identified as a key lncRNA in breast cancer. Motivated by this study, Zhang et al.  systematically integrated multidimensional expression profile of >5000 samples across 12 cancers to investigate the lncRNA-related ceRNA crosstalk networks in both tumor and normal physiological states. This analysis provided a comprehensive dysregulated ceRNA landscape across cancer types. Dysregulated ceRNA networks have being used for identifying clinical-related biomarkers. For instance, Wang et al.  identified glioblastoma (GBM)-related lncRNA-miRNA-mRNA triplets by a differential ceRNA network between GBM and normal tissues. In summary, these studies offered a means of examining the difference of ceRNA interactions between normal and cancer context, and provide new tools for elucidate cancer processes as well as new targets for therapy. Computational methods for identifying miRNA-mediated RNA-RNA crosstalks Given the critical roles of miRNA-mediated RNA-RNA crosstalks, they have attracted growing attentions from researchers. Several computational methods have emerged in discovering ceRNA-ceRNA interactions . These methods were mainly categorized into pair-wise correlation approach, partial association approach and mathematical modeling approach. Central to identification of miRNA-mediated ceRNA regulation is the identification of miRNA targets. A number of methods have been proposed over the past decade to identify miRNA targets through integration of sequence-based prediction, conservation, physical association and/or correlative gene expression (Figure 1B). The commonly used methods included TargetScan , miRanda , PicTar , PITA  and RNA22 . Although much is known about the miRNA regulatory principles in target recognition and some miRNA targets can be 4 | Li et al. Table 1. Summary of computational approaches for identifying miRNA-mediated RNA-RNA interaction networks Methods Statistical methods Ratio No HyperT Hypergeometric test HyperC Hypergeometric test; correlation coefficient SC Sensitive correlation coefficient; random test CMI CMI; random test Brief description Input data Genes were ranked based on the proportion of coregulating miRNAs Extract significant gene pairs by checking whether they share the similar set of miRNAs using hypergeometric cumulative distribution test Consider RNA-RNA pairs sharing a significant overlap of common miRNAs, and combine gene expression to identify the significant positively coexpressed RNA pairs Integration of miRNA, mRNA expression profiles to compute the sensitive correlation coefficient and using the random test to estimate the significance of the average SC compared with random conditions Combined paired miRNA, mRNA expression profiles and miRNA?gene regulation to estimate statistical significance of information divergence between the mutual information and CMI to identify miRNA sponge modulators P-value Global or context-specific N Global Y Global miRNA?gene regulation; gene expression Y Context-specific miRNA?gene regulation; miRNA expression; gene expression miRNA?gene regulation; miRNA expression; gene expression Y Context-specific Y Context-specific miRNA?gene regulation miRNA?gene regulation Note: Y represents this method provided the significance level and N represents this method not provided the P-values. predicted by these computational methods, much remains to be learned. Lines of evidence have indicated that there are higher false positives in the predicted target sets. Moreover, several biochemical methods have been developed to capture miRNA? target complexes on a global scale. MiRNAs and targets in the process of being regulated can be coprecipitated with Argonaute (AGO). AGO and miRNA immunoprecipitation or pulldown methodologies, such as AGO CLIP-Seq , HITS-CLIP  and PAR-CLIP , were proposed to genome wide identification of the miRNA targets. Collectively, these strategies offer a major advantage in identifying the interactions of functional miRNA? targets. However, we are lack of knowledge which method is better for identifying ceRNA-ceRNA regulation in complex diseases. After assembling the miRNA?target regulation, two commonly used principles for identifying miRNA-mediated ceRNAceRNA regulation were used [30, 45]. The central hypothesis of most computation methods is that ceRNA crosstalk increased with the high miRNA regulatory similarity between mRNAs and their strong coexpression in specific context (Figure 1B). In this section, we reviewed the widely used computational methods for identifying ceRNA regulation or miRNA sponge interactions. Here, we considered five types of methods (Table 1), including two types of global ceRNA regulation prediction methods (Figure 2A, ratio based, we termed ratio and hypergeometric test based, termed HyperT) and three types of context-specific prediction methods [Figure 2A, hypergeometric test plus coexpression, termed HyperC, sensitivity correlation-based method (SC) and conditional mutual information (CMI)-based methods]. Ratio-based prediction Based on the hypothesis that ceRNA pairs were likely to be regulated by same miRNAs, this method ranked the candidate genes by the proportion of common miRNAs (Figure 2B) . For instance, we need to identify the ceRNA partners for gene i from all candidate gene sets S, and the ratio is calculated as: R� j� � miRNAi \ miRNAj ; j 2 S: miRNAj Where miRNAi is the miRNA set that regulated gene i and miRNAj is the miRNA set that regulated gene j. Hypergeometric test-based prediction-HyperT In addition to rank the genes by the proportion of common miRNAs, it is usually used the hypergeometric to evaluate whether two genes were coregulated by miRNAs (Figure 2C). This statistic test computed the significance of common miRNAs for each RNA pairs. The probability P was calculated according to: NX P � 1 F餘XY 1jN; NX ; NY � � 1 NX XY 1 t� t ! N NX NY t ! N ! : NY Where N is the number of all miRNAs of human genome, NX and NY represent the total number of miRNAs that regulate RNA X and Y, respectively, and NXY is the number of miRNAs shared between RNA X and Y. All P-values were subject to false discovery rate (FDR) correction and RNAs were ranked based on the FDR values. Hypergeometric test combined with coexpression-based prediction-HyperC Next, to discover the active ceRNA-ceRNA regulatory pairs in a specific context, the commonly used method is to using the coexpression principle to filter the ceRNA-ceRNA regulation identified based on the above two global methods [31, 46]. This method integrated context-specific gene expression profile data sets. The Pearson correlation coefficient (R) of each candidate ceRNA regulatory pairs identified was calculated as: Identifying RNA-RNA crosstalk | 5 Figure 2. The flowchart of methods for identifying miRNA-mediated ceRNA interactions. (A) The flowchart for identifying global and context-specific ceRNA regulation. (B) The flowchart for ration-based method. (C) The flowchart for hypergeometric test-based method. (D) The flowchart for combination of hypergeometric test and coexpression method. (E) The flowchart for SC-based method. (F) The flowchart of CMI-based methods. n P 摒yi y � 饃i x R � s?????????????????????????s????????????????????????? : n n P P � �饃i x 饄i y i� i� i� Where xi and yi are the expression levels of RNA X and RNA and y are the average expression levels of Y in sample i, and x RNA X and Y across all tumor samples. In general, genes were first ranked by the P-value of hypergeometric test and the correlation coefficient, separately. The average rank of each gene was calculated to rank the candidate genes (Figure 2D). SC-based prediction Besides the genome-wide gene expression profiles, miRNA expression profiles were also integrated into the procedure to identify the ceRNA regulation in complex diseases. The common method is to identify high correlated RNA pairs in which the correlation is because of the presence of one or more miRNAs (Figure 2E). Paola et al.  had proposed SC and identified a sponge interaction network between lncRNAs and mRNAs in human breast cancer. Zhang et al.  systematically characterized the lncRNA-related ceRNA interactions across 12 major cancer types based on this method. For a candidate pair of RNA-A and RNA-B, given a co-regulated miRNA-M, the calculated formula was as follows: RAB RAM RMB RABjM � q?????????????????q????????????????? 1 R2AB 1 R2MB Where, RAB , RAM and RMB represent the Pearson correlation coefficient between RNA-A and RNA-B, RNA-A and miRNA-M, RNA-B and miRNA-M, respectively. Then, the SC of miRNA-M, which is referred as S, for the corresponding candidate ceRNA pair is calculated as: S � RAB RABjM : To identify the significant correlation, a random background distribution of the S was generated by calculating the score S of randomly selected combinations of RNA-miRNA-RNA competing interaction. CMI-based methods Quantitatively identifying direct dependencies between RNAs is an important step in identifying ceRNA interactions. CMI is widely used to identify the RNA-RNA correlations, given the value of miRNAs. Hermes is a widely used method, which predicts ceRNA interactions from expression profiles of candidate RNAs and their common miRNA regulators using CMI (Figure 2F) . This method is similar to MINDY, which a computational method based on CMI estimator to identify modulators of transcription factor activity. Both these two methods rely on the idea that CMI implies that a modulator expression M is predictive of changes in regulatory activity of a regulator R to its targets T. Specifically, Hermes used two distinct test to identify the ceRNA-ceRNA pairs. First, the size of the common miRNAs, pmiR 餞1; T2� � pmiR 餞1� \ pmiR 餞2�, is required to be statistically significant relative to the two individual miRNA size, and this is 6 | Li et al. Figure 3. Comparison of computational methods based on literature-curated ceRNAs. (A) Literature-curated PTEN-related ceRNAs. (B?F) The proportion of validated ceRNAs of PTEN in top-rank candidate ceRNAs identified by different computational methods. (B) Ratio-based. (C) Hypergeometric test. (D) Hypergeometric test and coexpression. (E) SC based. (F) CMI. Different colored lines represent different miRNA?target interaction prediction methods. performed by Fisher?s exact test. Next, for each miRNAs k, Hermes evaluates the statistical significance of the test: I絤iRk TM > I絤iRk T: Where the variables indicate the expression of the corresponding RNA species. P-values for each ceRNA-miRNA-ceRNA triplet were computed using the random test where the candidate modulator?s expression is shuffled for 1000 times. The final significance for all miRNAs is then computed by converting all the individual P-values for each miRNA k. This is based on Fisher?s method, where v2 � 2 XN k� ln餻k � Where N is the total number of miRNAs. Comparison of computational methods Although these computational methods have been demonstrated to be useful for identification of ceRNA crosstalk in complex diseases, it is difficult to evaluate the quality of these methods. In addition, we are lack of knowledge of which miRNA?targets are useful for this process. Thus, in this section, we conduct an investigative study to compare the relative performance of five representative methods mentioned above using the ceRNA regulation of PTEN (Figure 1C). Data sets PTEN-related ceRNAs. In a diverse set of human cancer types, PTEN was found to be dysregulated. An in-depth understanding of the underlying mechanisms by which PTEN expression is modulated is crucial to achieve a comprehensive knowledge of its biological roles. The competition between PTEN mRNA and other RNAs mediated by miRNAs has emerged as one such mechanism and has brought into focus. Here, we assembled a list of PTEN-related ceRNAs by curating the available literature (Figure 3A). In addition, we obtained the gene expression profiles for wild-type and PTEN overexpressed U87 cell lines. The genes were ranked based on the fold change (FC) of expression. The genes with different FCs (FC > 1.5 and FC > 2) were regarded as ceRNAs of PTEN. MiRNA?target regulation. Assembling the miRNA-regulation is the first step to identify the ceRNA-ceRNA crosstalk in complex diseases. Here, we assembled genome-wide miRNA-gene regulation by five commonly used methods, including TargetScan, miRanda, PicTar, PITA and RNA22. Recently, several studies have demonstrated that use of cross-linking and AGO immunoprecipitation coupled with high-throughput sequencing could identify endogenous genome-wide interaction maps for miRNAs [43, 48]. Thus, we also integrated the CLIP-Seq data sets that are available in starBase V2 . In addition, we considered the Ensemble miRNA?target regulation that were predicted by at least one to five computational methods. Because there is no miRNAs target PTEN in Ensemble-five, in total, nine sets of miRNA?target regulation were considered here (Table 2). Identifying RNA-RNA crosstalk | 7 Table 2. The miRNA?gene regulation for 10 computational methods integrated with CLIP-seq data Methods TargetScan miRanda PicTar PITA RNA22 Ensemble-one Ensemble-two Ensemble-three Ensemble-four Ensemble-five Regulation mRNA miRNA PTEN-miRs Gene2 Gene3 Gene5 98 829 297 873 82 983 63 096 88 480 423 975 117 441 60 038 26 803 3004 7160 12 294 6653 5805 8777 13 801 8410 5830 3864 1100 277 249 273 249 311 386 277 249 244 168 56 88 73 60 3 104 74 56 46 0 3944 9797 3750 3330 87 11 220 5163 3128 1814 0 3611 8961 3297 2752 6 10 328 4510 2617 1386 0 2655 7717 2514 1957 0 8989 3478 1861 941 0 Note: PTEN-miRs, the number of miRNAs regulated PTEN; Gene2, the number of genes that coregulated by at least two miRNAs with PTEN; Gene3, the number of genes that coregulated by at least three miRNAs with PTEN; Gene5, the number of genes that coregulated by at least five miRNAs with PTEN; Ensemble-one to Ensemble-five, the miRNA-regulation predicted by at least one to five computational methods. Gene and miRNA expression of glioma. Genome-wide miRNA and gene expression of 541 glioma samples were obtained from the TCGA project . In total, there are 12 042 genes and 470 miRNAs in the profiles. Comparison based on literature-curated PTEN-related ceRNAs We first compared the five proposed computational methodbased 29 literature-curated ceRNA regulation of PTEN (Figure 3A). In addition, eight methods (except RNA22) for identification of miRNA?gene regulation were considered in this process. To exclude the bias of the number of predictions for different methods, all candidate genes were ranked and we calculated the proportion of literature-curated PTEN ceRNAs in top-ranked genes (Figure 3B?F). We found that for all five computational methods, the number of recalled ceRNAs was different. Globally, the prediction power was higher when the ensemble miRNA?gene regulation was used. The performance was the highest when using the miRNA?gene regulation predicted by at least four methods (Figure 3B?F). This is consistent in five ceRNA prediction methods. This best performance might be explained by the fact that ensemble method can obtain more functionally enriched targets. In addition, we found that the methods that integrated expression profiles (HyperC, SC and CMI) had higher performance over the global predication ones (ratio and HyperT) to identify the ceRNA regulation. Specifically, we found that HyperC and CMI showed higher performance. Although with similar performance, we found that the HyperC is easier to understand for biological researchers, and the computational resource used by HyperC is much less than CMI. These observations indicate that integration of the miRNA and gene expression might identify the context-specific miRNA? gene regulation, which further increase the performance for identification of functional ceRNA regulation. Comparison based on overexpression of PTEN In addition to the literature-curated PTEN-related ceRNAs, we evaluated the performance of these computational methods based on a PTEN-overexpressed microarray analysis. Compared with PTEN wild-type cell line, 84% of the literature-curated ceRNAs showed increased expression when reintroduced PTEN into the cell (Figure 4A). This observation validated the coexpression principle of ceRNA regulation and also indicated that the genes with increased expression were likely to be PTEN- related ceRNAs. Thus, we next used the genes with 1.5-fold increased expression as a set of PTEN-related ceRNAs to evaluate the performance of these computational methods. By calculating the proportion of ceRNAs in top-ranked genes for each method, we found that these methods reach the best performance when using the ensemble-four-method-retrieved miRNA? gene regulation (Figure 4B?F). However, using the ensembleone-based miRNA?gene regulation, all of these methods show the poorest performance. This might be because of the high false positive of miRNA?gene regulation when using the union set of different methods. These observations suggest that it is critical to select the suitable miRNA?gene regulation for identifying the ceRNA regulation. In addition, when using miRNA? gene regulation retrieved from the individual method, we found that PicTar showed higher performance than other methods. Similarly, the computational methods integrated with miRNA or gene expression data also increased the performance for identifying the ceRNA regulation. In addition, we also compared these methods based on genes >2-fold in PTEN overexpression data. Similar results were obtained (Figure 5A?E), and suggested that it is better to identify the ceRNA regulation based on ensemble-four-method and integration of miRNA and mRNA expression profiles. Overlap of different computational methods The number of PTEN-related ceRNAs identified by the five methods is different. Next, we compared the results of the five ceRNA prediction methods based on the ensemble miRNA?gene regulation that were predicted at least four-target prediction methods. For the ratio-based and HyperC method, we obtained the top-ranked 300 genes as candidate ceRNAs of PTEN. On the other hand, we retrieved the ceRNAs with FDR <0.05 as the candidate ceRNAs for the HyperT, SC and CMI methods. We found that only a small fraction of candidate ceRNAs were shared by at least four methods, and no candidate ceRNAs were identified by all five methods (Figure 6A). Specifically, the SC method identified few ceRNA candidates and the majority of these were not covered by other methods. The ceRNA candidates identified by ratio-based and HyperC were all covered by at least one of the other methods. These observations imply that different computational methods may have their own merits. As these computational methods might capture different aspects of ceRNA regulation, we next explore whether the identified candidate ceRNAs were involved in same or similar 8 | Li et al. Figure 4. Comparison of computational methods based on gene expression perturbation. (A) Literature-curated PTEN-related ceRNAs were highly expressed in PTEN overexpressed cell line. (B?F) The proportion of validated ceRNAs of PTEN in top-rank candidate ceRNAs identified by different computational methods. (B) Ratio based. (C) Hypergeometric test. (D) Hypergeometric test and coexpression. (E) SC based. (F) CMI. Different colored lines represent different miRNA?target interaction prediction methods. This figure is based on 1.5-fold changes. biological function. As there are few candidate ceRNAs identified by SC method, we compared the enriched functions of the other four methods. Functional enrichment analysis revealed that these ceRNA candidates play critical roles in cancer, and several pathways were shared by more than two methods, such as ?pathway in cancer?, ?endocytosis? and ?MAPK signaling pathway? (Figure 6B). In addition, we also identified several biological processes shared by different methods, such as ?regulation of protein serine/threonine kinase activity? and ?regulation of cell cycle? were shared by all four methods (Figure 6C). These results indicated that these methods were complementary with each other, and it is best to integrate the results of different methods to identify the ceRNA regulation in human complex diseases. Discussions and future directions With the past decades, miRNA-mediated ceRNA crosstalk has been found to be involved in many diseases including cancer. MiRNA-miRNA crosstalk and ceRNA-ceRNA crosstalk are changing our understanding the mechanisms of cancer . In this article, we have reviewed the recent developed computational methods for identification of miRNA-mediated ceRNA interactions. Through the increasing application of high-throughput sequencing data sets, ceRNA regulation continues to be discovered. However, the gap between identified and functionally characterized ceRNA regulation remains considerably large. One of the challenges for identification of miRNA-mediated ceRNA regulation is the accuracy of the miRNA?gene regulation. Different miRNA?target prediction methods only considered a set of possible targets for miRNAs, which also included more false-positive miRNA targets. Our analysis indicated that it is better to integrate different miRNA?target prediction methods. Specifically, the ensemble-four method gets the best performance in ceRNA prediction. In addition, integration of the context miRNA-mRNA expression profiles increased the performance. This suggested that context-specific miRNA?gene regulation is useful in identifying the miRNA-mediated ceRNA crosstalk. However, this might be challenged by the small number of samples with paired miRNA and gene expression profiles, especially when we considered multiple types of RNAs. As the research into miRNA-mediated ceRNA regulation has just emerged, there is no gold standard positive ceRNA regulation to validate these proposed computational predictions. Here, based on literature-curated PTEN-related ceRNAs and PTEN-overexpression data sets, we evaluated these methods. However, our evaluations in the case study are not sufficient for drawing definite conclusions about the performances of these methods. Our comparison results suggest that these methods may have their own merits, and they capture the ceRNA candidates involved in similar functions. We suggest that it is better to use them complementarily, e.g. by combining them to develop an ensemble method . Aside from the observations that ceRNA activities were affected by the relative abundance of miRNA and RNAs, the other mechanisms that regulated ceRNA interactions remain poorly understood. Because miRNAs mainly bind to the 30 -UTR of target RNAs to perform their functions, and 30 -UTR length is observed to be highly regulated in various types of cancer . It is conceivable that ceRNA interaction could be altered in cancer. However, there are limited studies to investigate the Identifying RNA-RNA crosstalk | 9 Figure 5. Comparison of computational methods based on gene expression perturbation. (A?E) The proportion of validated ceRNAs of PTEN in top-rank candidate ceRNAs identified by different computational methods. (A) Ratio based. (B) Hypergeometric test. (C) Hypergeometric test and coexpression. (D) SC based. (E) CMI. Different colored lines represent different miRNA?target interaction prediction methods. This figure is based on 2-fold changes. 30 -UTR-mediated ceRNA rewiring in cancer. In addition, genetic variants have been observed to widely perturb miRNA regulation, which therefore further changed the dynamic equilibrium of ceRNA regulation. Recently, one study identified a large number of genetic variants that are associated with ceRNA function , which were termed as ?cerQTL?. This study suggests that another function aspect of noncoding regulatory variants. Besides the miRNA regulation, other posttranscriptional mechanisms (such as RNA editing and RNA-binding protein regulation) have been focused by recent studies. RNA editing could influence miRNA target regulation  and thus influence ceRNA interaction. RNA-binding protein (RBP) might compete for binding sites of miRNAs , which could also affect the ceRNA interactions. However, these types of regulation were not considered in current methods for identifying ceRNA interactions. Moreover, intra-tumor heterogeneity is critical for development effective methods for therapy. So far, the majority of ceRNA studies have been performed at a cell-population level. With the development of single-cell techniques , it may be able to shed light to the contribution of ceRNA regulation in tumor heterogeneity. Furthermore, although most of the ceRNA interactions identified so far are between binary RNA partners, increasing evidence has indicated that ceRNA crosstalk are formed as large interconnected networks. In addition to direct interactions through shared miRNAs, secondary indirect interactions have been shown to contribute ceRNA regulation . In addition, ceRNA regulation and transcription regulation have been shown to be tightly coupled, adding the complexity of ceRNA crosstalk . Evidence has also demonstrated that integration of protein?protein interaction information can help to understand how miRNA sponges influence the downstream biological processes . However, how to integrate these functional information into the process for identification of ceRNA interaction remain poorly understood. In summary, analysis of ceRNA interactions and crosstalk in intertwined networks may represent a robust platform for understanding miRNA regulation in human complex diseases. Here, we proposed that the application of computational techniques provides valuable functional and mechanistic insight into miRNA-mediated ceRNA regulation. There are both opportunities and challenges for developing computational methods for identification miRNA-mediated ceRNA crosstalk in future studies. Key Points ? MiRNA-mediated ceRNA regulation plays critical roles in complex diseases. ? Computational methods for identification of RNA-RNA crosstalk were reviewed. ? Integration of different miRNA target identification methods and context-specific expression facilitates identification of ceRNA regulation. ? Different computational methods are complementary for identifying ceRNAs involved in similar biological functions. 10 | Li et al. Figure 6. Overlap of candidate ceRNAs identified by different computational methods. (A) The overlap of ceRNAs for different methods. (B) and (C) The pathways and biological processes enriched by candidate ceRNAs identified by ratio-based, hypergeometric test, hypergeometric test and coexpression and CMI methods. The size of the circles is corresponding to the proportion of candidate ceRNAs, and the colors represent different P-values. Fundings The National Natural Science Foundation of China (grant numbers 31571331 and 61502126), the China Postdoctoral Science Foundation (grant numbers 2016T90309, 2015M571436 and LBHZ14134), Natural Science Foundation of Heilongjiang Province (grant number QC2015020), Weihan Yu Youth Science Fund Project of Harbin Medical University, Harbin Special Funds of Innovative Talents on Science and Technology Research Project (grant number 2015RAQXJ091). References 1. Esquela-Kerscher A, Slack FJ. Oncomirs?microRNAs with a role in cancer. Nat Rev Cancer 2006;6:259?69. 2. Pasquinelli AE. MicroRNAs and their targets: recognition, regulation and an emerging reciprocal relationship. Nat Rev Genet 2012;13:271?82. 3. Jonas S, Izaurralde E. Towards a molecular understanding of microRNA-mediated gene silencing. Nat Rev Genet 2015;16: 421?33. 4. Ozsolak F, Milos PM. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet 2011;12:87?98. 5. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, et al. The cancer genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45:1113?20. 6. Zhang J, Baran J, Cros A, et al. International cancer genome consortium data portal?a one-stop shop for cancer genomics data. Database 2011;2011:bar026. 7. Friedman RC, Farh KK, Burge CB, et al. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 2009; 19:92?105. 8. Bajan S, Hutvagner G. Regulation of miRNA processing and miRNA mediated gene repression in cancer. Microrna 2014;3: 10?17. 9. Xu J, Li CX, Li YS, et al. MiRNA-miRNA synergistic network: construction via co-regulating functional modules and disease miRNA topological features. Nucleic Acids Res 2011;39: 825?36. 10. Bracken CP, Scott HS, Goodall GJ. A network-biology perspective of microRNA function and dysfunction in cancer. Nat Rev Genet 2016;17:719?32. 11. Li Y, Xu J, Chen H, et al. Comprehensive analysis of the functional microRNA-mRNA regulatory network identifies miRNA signatures associated with glioma malignant progression. Nucleic Acids Res 2013;41:e203. Identifying RNA-RNA crosstalk 12. Gosline SJ, Gurtan AM, JnBaptiste CK, et al. Elucidating MicroRNA regulatory networks using transcriptional, posttranscriptional, and histone modification measurements. Cell Rep 2016;14:310?19. 13. Olena AF, Patton JG. Genomic organization of microRNAs. J Cell Physiol 2010;222:540?5. 14. Wang Y, Luo J, Zhang H, et al. microRNAs in the same clusters evolve to coordinately regulate functionally related genes. Mol Biol Evol 2016;33:2232?47. 15. Li Y, Li S, Chen J, et al. Comparative epigenetic analyses reveal distinct patterns of oncogenic pathways activation in breast cancer subtypes. Hum Mol Genet 2014;23:5378?93. 16. Xu J, Shao T, Ding N, et al. miRNA-miRNA crosstalk: from genomics to phenomics. Brief Bioinform 2016, pii: bbw073. [Epub ahead of print]. 17. Meng X, Wang J, Yuan C, et al. CancerNet: a database for decoding multilevel molecular interactions across diverse cancer types. Oncogenesis 2015;4:e177. 18. Karreth FA, Pandolfi PP. ceRNA cross-talk in cancer: when cebling rivalries go awry. Cancer Discov 2013;3:1113?21. 19. Wang Y, Hou J, He D, et al. The emerging function and mechanism of ceRNAs in cancer. Trends Genet 2016;32:211?24. 20. Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell 2011;146:353?8. 21. Chen J, Xu J, Li Y, et al. Competing endogenous RNA network analysis identifies critical genes among the different breast cancer subtypes. Oncotarget 2017;8:10171?84. 22. Xu J, Feng L, Han Z, et al. Extensive ceRNA-ceRNA interaction networks mediated by miRNAs regulate development in multiple rhesus tissues. Nucleic Acids Res 2016;44:9438?51. 23. Yu G, Yao W, Gumireddy K, et al. Pseudogene PTENP1 functions as a competing endogenous RNA to suppress clear-cell renal cell carcinoma progression. Mol Cancer Ther 2014;13: 3086?97. 24. Cesana M, Cacchiarelli D, Legnini I, et al. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell 2011;147:358?69. 25. Fu Z, Li G, Li Z, et al. Endogenous miRNA Sponge LincRNAROR promotes proliferation, invasion and stem cell-like phenotype of pancreatic cancer cells. Cell Death Discov 2017;3: 17004. 26. Qi X, Zhang DH, Wu N, et al. ceRNA in cancer: possible functions and clinical implications. J Med Genet 2015;52:710?18. 27. Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of ceRNA crosstalk and competition. Nature 2014;505:344?52. 28. Xu J, Li Y, Lu J, et al. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic Acids Res 2015;43:8169?82. 29. Wang P, Ning S, Zhang Y, et al. Identification of lncRNAassociated competing triplets reveals global patterns and prognostic markers for cancer. Nucleic Acids Res 2015;43: 3478?89. 30. Chiu HS, Llobet-Navas D, Yang X, et al. Cupid: simultaneous reconstruction of microRNA-target and ceRNA networks. Genome Res 2015;25:257?67. 31. Zhou X, Liu J, Wang W. Construction and investigation of breast-cancer-specific ceRNA network based on the mRNA and miRNA expression data. IET Syst Biol 2014;8:96?103. 32. Le TD, Zhang J, Liu L, et al. Computational methods for identifying miRNA sponge interactions. Brief Bioinform 2017;18: 577?90. 33. Shao T, Wu A, Chen J, et al. Identification of module biomarkers from the dysregulated ceRNA-ceRNA interaction | 11 network in lung adenocarcinoma. Mol Biosyst 2015;11: 3048?58. 34. Paci P, Colombo T, Farina L. Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst Biol 2014;8:83. 35. Zhang Y, Xu Y, Feng L, et al. Comprehensive characterization of lncRNA-mRNA related ceRNA network across 12 major cancers. Oncotarget 2016;7:64148?67. 36. Wang JB, Liu FH, Chen JH, et al. Identifying survivalassociated modules from the dysregulated triplet network in glioblastoma multiforme. J Cancer Res Clin Oncol 2017;143: 661?71. 37. Agarwal V, Bell GW, Nam JW, et al. Predicting effective microRNA target sites in mammalian mRNAs. eLife 2015;4: e05005. 38. Betel D, Koppal A, Agius P, et al. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol 2010;11:R90. 39. Krek A, Grun D, Poy MN, et al. Combinatorial microRNA target predictions. Nat Genet 2005;37:495?500. 40. Kertesz M, Iovino N, Unnerstall U, et al. The role of site accessibility in microRNA target recognition. Nat Genet 2007;39: 1278?84. 41. Miranda KC, Huynh T, Tay Y, et al. A pattern-based method for the identification of MicroRNA binding sites and their corresponding heteroduplexes. Cell 2006;126:1203?17. 42. Clark PM, Loher P, Quann K, et al. Argonaute CLIP-seq reveals miRNA targetome diversity across tissue types. Sci Rep 2014;4: 5947. 43. Moore MJ, Zhang C, Gantman EC, et al. Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis. Nat Protoc 2014;9:263?93. 44. Friedersdorf MB, Keene JD. Advancing the functional utility of PAR-CLIP by quantifying background binding to mRNAs and lncRNAs. Genome Biol 2014;15:R2. 45. Ala U, Karreth FA, Bosia C, et al. Integrated transcriptional and competitive endogenous RNA networks are crossregulated in permissive molecular environments. Proc Natl Acad Sci USA 2013;110:7154?9. 46. Chiu YC, Hsiao TH, Chen Y, et al. Parameter optimization for constructing competing endogenous RNA regulatory network in glioblastoma multiforme and other cancers. BMC Genomics 2015;16(Suppl 4):S1. 47. Sumazin P, Yang X, Chiu HS, et al. An extensive microRNAmediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell 2011;147: 370?81. 48. Yang JH, Li JH, Shao P, et al. starBase: a database for exploring microRNA-mRNA interaction maps from Argonaute CLIPSeq and Degradome-seq data. Nucleic Acids Res 2011;39: D202?9. 49. Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNAceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res 2014; 42:D92?7. 50. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008;455:1061?8. 51. Marbach D, Costello JC, Kuffner R, et al. Wisdom of crowds for robust gene network inference. Nat Methods 2012;9:796?804. 12 | Li et al. 52. Mayr C, Bartel DP. Widespread shortening of 3?UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 2009;138:673?84. 53. Li MJ, Zhang J, Liang Q, et al. Exploring genetic associations with ceRNA regulation in the human genome. Nucleic Acids Res 2017;45:5653?65. 54. Wang Y, Xu X, Yu S, et al. Systematic characterization of A-toI RNA editing hotspots in microRNAs across human cancers. Genome Res 2017;27:1112?25. 55. Treiber T, Treiber N, Plessmann U, et al. A compendium of RNA-binding proteins that regulate MicroRNA biogenesis. Mol Cell 2017;66:270?84.e213. 56. Ramskold D, Luo S, Wang YC, et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 2012;30:777?82. 57. Zhang J, Le TD, Liu L, et al. Inferring miRNA sponge coregulation of protein-protein interactions in human breast cancer. BMC Bioinformatics 2017;18:243.