close

Вход

Забыли?

вход по аккаунту

?

bib%2Fbbx137

код для вставкиСкачать
Briefings in Bioinformatics, 2017, 1?12
doi: 10.1093/bib/bbx137
Paper
Systematic review of computational methods for
identifying miRNA-mediated RNA-RNA crosstalk
Yongsheng Li,* Xiyun Jin,* Zishan Wang, Lili Li, Hong Chen, Xiaoyu Lin,
Song Yi, Yunpeng Zhang and Juan Xu
Corresponding authors: Juan Xu, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China.
Tel.: �-86667543; Fax: �-86615922; E-mail: xujuanbiocc@ems.hrbmu.edu.cn; Yunpeng Zhang, College of Bioinformatics Science and Technology,
Harbin Medical University, Harbin, Heilongjiang 150086, China. E-mail: zyp19871208@126.com; Song Yi, Department of Systems Biology, University of
Texas MD Anderson Cancer Center, Houston, Texas 77030, USA. E-mail: SYi2@mdanderson.org
*These authors contributed equally to this work.
Abstract
Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA)
networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel
layer of gene regulation that controls both physiological and pathological processes such as development and complex
diseases. The rapidly expanding catalogue of ceRNA regulation has provided evidence for exploitation as a general model to
predict the ceRNAs in silico. In this article, we first reviewed the current progress of RNA-RNA crosstalk in human complex
diseases. Then, the widely used computational methods for modeling ceRNA-ceRNA interaction networks are further
summarized into five types: two types of global ceRNA regulation prediction methods and three types of context-specific
prediction methods, which are based on miRNA-messenger RNA regulation alone, or by integrating heterogeneous data,
respectively. To provide guidance in the computational prediction of ceRNA-ceRNA interactions, we finally performed a
comparative study of different combinations of miRNA?target methods as well as five types of ceRNA identification
methods by using literature-curated ceRNA regulation and gene perturbation. The results revealed that integration
of different miRNA?target prediction methods and context-specific miRNA/gene expression profiles increased the
performance for identifying ceRNA regulation. Moreover, different computational methods were complementary in
identifying ceRNA regulation and captured different functional parts of similar pathways. We believe that the application of
these computational techniques provides valuable functional insights into ceRNA regulation and is a crucial step for
informing subsequent functional validation studies.
Yongsheng Li is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China and Department of Systems
Biology, University of Texas MD Anderson Cancer Center, USA. His research interests focus on ncRNA regulation and bioinformatics methods development.
Xiyun Jin is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on
ncRNA regulation.
Zishan Wang is a PhD student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. His research interests focus
on method development.
Lili Li is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on bioinformatics methods.
Hong Chen is a PhD student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on
ncRNA regulation.
Xiaoyu Lin is an MS student in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research interests focus on
computational biology.
Song Yi is an associate professor in the Department of Systems Biology, University of Texas MD Anderson Cancer Center, USA. His research interests focus
on computational system biology in human diseases.
Yunpeng Zhang is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China. His research interests focus on computational system biology and ncRNA regulation in human diseases.
Juan Xu is an associate professor in the College of Bioinformatics Science and Technology at Harbin Medical University, China. Her research activity
focused on ncRNA regulation in complex diseases.
Submitted: 27 August 2017; Received (in revised form): 19 September 2017
C The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
V
1
2
| Li et al.
Key words: miRNA?target identification; ceRNA regulation; computational methods; ensemble method; crosstalk
Introduction
MicroRNAs (miRNAs) are endogenous small noncoding RNAs
that regulate gene expression by binding to the RNA target transcripts [1, 2]. Emerging knowledge exploring the functions of
miRNAs in the regulation of gene expression has dramatically
altered our view of how target genes are regulated. Our knowledge of miRNA functions has been greatly expanded by recent
advances in next-generation sequencing, including genomewide miRNA?gene regulation identification [3], the application
of RNA sequencing [4], the increasing availability of paired
miRNA and gene expression data in various types of complex
diseases (such as TCGA [5] and ICGC [6] projects), the recognition of miRNA-miRNA synergistic regulation and competing
endogenous RNAs (ceRNAs) regulation mediated by miRNAs.
In general, mounting evidence has indicated that 60% of
protein-coding genes are regulated by miRNAs [7, 8]. Moreover,
genes can also be regulated by more than one miRNA. The functional complexity of miRNAs and cooperative regulation among
miRNAs challenged our ability to comprehensively understand
the functions of miRNAs in complex diseases [9]. In addition to
protein-coding genes, an increasing number of other types of
RNAs were found to be regulated by miRNAs (Figure 1A), such
as long noncoding RNAs (lncRNAs), expressed 30 -untranslated
regions (UTRs), pseudogenes and circular RNAs (circRNAs). These
complex miRNA-RNA interactions formed the miRNA-RNA regulatory networks. A comprehensive understanding of miRNA functions in complex diseases will be further aided through analysis
of the structure of miRNA-RNA regulatory networks [10?12].
In this review, we first reviewed the miRNA synergistic regulation and ceRNA regulation in complex diseases, and we highlighted the ceRNA regulatory network-based methods for our
understanding of miRNA functions. We also summarized
recent computational methods used for identification of
miRNA-mediated ceRNA regulation in complex diseases. Based
on literature-curated PTEN-related ceRNA regulation, we systematically compared these methods and give some directions
for choosing the suitable methods.
MiRNA-miRNA crosstalk in complex diseases
The majority of genes are targeted by more than one miRNA. In
addition, approximately two-thirds of miRNAs are encoded in
polycistronic clusters [13]. These miRNAs were cotranscribed
with the cluster partners, indicating that these miRNAs were
cooperative functional units, and they function collectively.
Such coexpressed miRNAs have a tendency to target the same
genes or regulate different genes in the same functional pathway [14, 15]. These results reinforce the miRNA-miRNA crosstalk in complex diseases. However, the application of
experimental methods to identify such miRNA cooperation
must address many bottlenecks, such as lengthy experimental
periods, the requirement for large amounts of equipment and a
high number of miRNA combinations. Thus, emerging computational methods have been proposed to identify the global or
context-specific miRNA-miRNA crosstalk (Figure 1A). Recently,
we reviewed these methods and found that these global methods were mainly based on genomic sequence information, chromatin interaction, miRNA coregulation and coregulation of
function modules or similarity of associated diseases
(phenomic similarity) [9, 16]. Given that miRNA synergistic regulation is often reprogrammed in different tissues, or different
development stages even within the same tissues [11], contextspecific miRNA-miRNA networks may provide a better representation. Context-specific miRNA synergistic regulation was
mainly based on paired miRNA and gene expression profiles
[17]. With the increase in omics data set of complex diseases, it
is envisioned that our understanding of miRNA synergistic regulation will be greatly enhanced.
MiRNA-mediated ceRNA-ceRNA crosstalk in
complex diseases
In addition to the conventional miRNA-RNA regulation, increasing studies have shown that regulation among the miRNA seed
region and messenger RNA (mRNA) is not unidirectional, but
that the pool of RNAs can crosstalk with each other through
competing for miRNA binding [18?20]. These ceRNAs act as
molecular sponges for a miRNA through their miRNA response
elements (MRE), thereby regulating other target genes of the
respective miRNAs. Understanding this novel type of RNA
crosstalk will lead to significant insights into regulatory networks and have implications in human cancer development
and other complex diseases [21, 22].
The ceRNA hypothesis has gained substantial attention and
various types of RNAs, including lncRNAs, pseudogenes and
circRNAs, as well as mRNAs were demonstrated to be as ceRNA
molecules (Figure 1A). Pseudogene PTENP1 had also been demonstrated to regulate the expression of its cognate gene PTEN by
competing miRNAs [23]. The lncRNA (linc-MD1) being a miRNA
sponge was first demonstrated in muscle differentiation [24].
Moreover, linc-RoR was shown to function as a miRNA sponge to
prevent OCT4, SOX2 and NANOG by competing miR-145 [25]. It
was also shown that pseudogene KRAS1P can function via a
miRNA sponge mechanism [26]. CircRNAs are another type of
ncRNA ceRNAs found by researchers recently, and increasing
circRNAs (such as CDR1as) were validated as miRNA sponge [27].
An increasing number of studies have attempted to explore
ceRNA-ceRNA regulation in specific cancer type. However, the
majority of these studies focused on the properties of individual
ceRNA interaction in a specific cancer type, and lack of a global
view of system-level properties of ceRNA regulation across cancer types. To address these needs, we performed a systematic
analysis of 5203 cancer samples from 20 cancer types to discover mRNA-related ceRNA regulation [28]. This study highlights the conserved features shared by pan-cancer and higher
similarity within cell types of similar origins. We also found a
marked rewiring in the ceRNA program between various cancers and cancer subtypes [21], and further revealed conserved
and rewired network ceRNA hubs in cancer. Moreover, Wang
et al. [29] systematically identified 5119 functional lncRNAassociated triplets (lncACTs) through an integrated pipeline
with which a comprehensive lncACT crosstalk network was
constructed. In addition, Chiu et al. [30] introduced a method for
simultaneous prediction of miRNA?target interactions and
miRNA-mediated ceRNA interactions in breast cancer. Zhou
et al. [31] also identified breast cancer-specific ceRNA networks
by integration of miRNA regulation and paired miRNA/mRNA
expression data. Le et al. [32] also summarized several computational methods for identifying ceRNA regulation in complex
Identifying RNA-RNA crosstalk
|
3
Figure 1. MiRNA-mediated RNA-RNA crosstalk. (A) The ceRNA model involves different types of RNAs, including coding genes, expressed 30 -UTRs, pseudogenes,
lncRNAs and circRNAs. MiRNA-miRNA crosstalk and ceRNA-ceRNA crosstalk play critical roles in human diseases. MiRNA-miRNA synergistic regulation was identified
by coregulation, functional and disease similarity. CeRNA-ceRNA regulation was usually identified based on two principles: regulatory and expression similarity. (B)
The flowchart commonly used to identify the ceRNA regulation. First, miRNA?target interactions were determined by integration of computational methods and AGOCLIP-Seq data sets. Second, RNA pairs that coregulated by miRNAs were identified by ratio or hypergeometric test. Third, the expression similarity was evaluated based
on genome-wide expression profiles. (C) The aim of this review is to evaluate the performance of different miRNA?gene interaction prediction methods and ceRNA regulation identification methods based on literature-curated ceRNAs.
disease. All these studies have suggested that research into
miRNA sponges is emerging, and it is an interesting and important topic for understanding miRNA functions by identifying
ceRNA regulation in complex diseases.
MiRNA-mediated ceRNA-ceRNA dysregulation
network in complex diseases
CeRNA interaction activity might change between normal and
cancer samples, such as some ceRNA interactions showed competing activities in normal but not cancer samples, and vice
versa. In addition, some ceRNA interactions showed competing
activities in both normal and cancer context, but with opposite
expression patterns. In addition to investigating ceRNA regulation in cancer, exploring the differential ceRNA regulation that
were deregulated in cancer compared with normal conditions
helps us to systematically understand the mechanism of cancer. Shao et al. [33] have proposed a computational method to
systematically identify genome-wide dysregulated ceRNAceRNA interactions in lung cancer. Paci et al. [34] also proposed
a computational approach to explore the lncRNA-mRNA ceRNA
interactions in normal and breast cancer samples. Their results
highlights a marked rewiring of the ceRNA interactions between
normal and cancer samples, which were documented by its ?on/
off? switch. Based on mutually exclusive activation, the lncRNA
PVT1 was identified as a key lncRNA in breast cancer. Motivated
by this study, Zhang et al. [35] systematically integrated multidimensional expression profile of >5000 samples across 12 cancers to investigate the lncRNA-related ceRNA crosstalk
networks in both tumor and normal physiological states. This
analysis provided a comprehensive dysregulated ceRNA landscape across cancer types. Dysregulated ceRNA networks have
being used for identifying clinical-related biomarkers. For
instance, Wang et al. [36] identified glioblastoma (GBM)-related
lncRNA-miRNA-mRNA triplets by a differential ceRNA network
between GBM and normal tissues. In summary, these studies
offered a means of examining the difference of ceRNA interactions between normal and cancer context, and provide new tools
for elucidate cancer processes as well as new targets for therapy.
Computational methods for identifying
miRNA-mediated RNA-RNA crosstalks
Given the critical roles of miRNA-mediated RNA-RNA crosstalks, they have attracted growing attentions from researchers.
Several computational methods have emerged in discovering
ceRNA-ceRNA interactions [32]. These methods were mainly
categorized into pair-wise correlation approach, partial association approach and mathematical modeling approach. Central to
identification of miRNA-mediated ceRNA regulation is the identification of miRNA targets. A number of methods have been
proposed over the past decade to identify miRNA targets
through integration of sequence-based prediction, conservation, physical association and/or correlative gene expression
(Figure 1B). The commonly used methods included TargetScan
[37], miRanda [38], PicTar [39], PITA [40] and RNA22 [41].
Although much is known about the miRNA regulatory principles in target recognition and some miRNA targets can be
4
| Li et al.
Table 1. Summary of computational approaches for identifying miRNA-mediated RNA-RNA interaction networks
Methods
Statistical
methods
Ratio
No
HyperT
Hypergeometric
test
HyperC
Hypergeometric
test; correlation coefficient
SC
Sensitive correlation coefficient; random
test
CMI
CMI; random test
Brief
description
Input data
Genes were ranked based on the proportion of coregulating miRNAs
Extract significant gene pairs by checking whether they
share the similar set of miRNAs using hypergeometric
cumulative distribution test
Consider RNA-RNA pairs sharing a significant overlap of
common miRNAs, and combine gene expression to
identify the significant positively coexpressed RNA
pairs
Integration of miRNA, mRNA expression profiles to
compute the sensitive correlation coefficient and
using the random test to estimate the significance of
the average SC compared with random conditions
Combined paired miRNA, mRNA expression profiles and
miRNA?gene regulation to estimate statistical significance of information divergence between the mutual
information and CMI to identify miRNA sponge
modulators
P-value
Global or
context-specific
N
Global
Y
Global
miRNA?gene regulation; gene
expression
Y
Context-specific
miRNA?gene regulation;
miRNA expression; gene
expression
miRNA?gene regulation;
miRNA expression; gene
expression
Y
Context-specific
Y
Context-specific
miRNA?gene
regulation
miRNA?gene
regulation
Note: Y represents this method provided the significance level and N represents this method not provided the P-values.
predicted by these computational methods, much remains to be
learned. Lines of evidence have indicated that there are higher
false positives in the predicted target sets. Moreover, several
biochemical methods have been developed to capture miRNA?
target complexes on a global scale. MiRNAs and targets in the
process of being regulated can be coprecipitated with Argonaute
(AGO). AGO and miRNA immunoprecipitation or pulldown
methodologies, such as AGO CLIP-Seq [42], HITS-CLIP [43] and
PAR-CLIP [44], were proposed to genome wide identification of
the miRNA targets. Collectively, these strategies offer a major
advantage in identifying the interactions of functional miRNA?
targets. However, we are lack of knowledge which method is
better for identifying ceRNA-ceRNA regulation in complex
diseases.
After assembling the miRNA?target regulation, two commonly used principles for identifying miRNA-mediated ceRNAceRNA regulation were used [30, 45]. The central hypothesis of
most computation methods is that ceRNA crosstalk increased
with the high miRNA regulatory similarity between mRNAs and
their strong coexpression in specific context (Figure 1B). In this
section, we reviewed the widely used computational methods
for identifying ceRNA regulation or miRNA sponge interactions.
Here, we considered five types of methods (Table 1), including
two types of global ceRNA regulation prediction methods
(Figure 2A, ratio based, we termed ratio and hypergeometric test
based, termed HyperT) and three types of context-specific prediction methods [Figure 2A, hypergeometric test plus coexpression, termed HyperC, sensitivity correlation-based method (SC)
and conditional mutual information (CMI)-based methods].
Ratio-based prediction
Based on the hypothesis that ceRNA pairs were likely to be regulated by same miRNAs, this method ranked the candidate genes
by the proportion of common miRNAs (Figure 2B) [22]. For
instance, we need to identify the ceRNA partners for gene i from
all candidate gene sets S, and the ratio is calculated as:
R� j� �
miRNAi \ miRNAj
; j 2 S:
miRNAj
Where miRNAi is the miRNA set that regulated gene i and
miRNAj is the miRNA set that regulated gene j.
Hypergeometric test-based prediction-HyperT
In addition to rank the genes by the proportion of common
miRNAs, it is usually used the hypergeometric to evaluate
whether two genes were coregulated by miRNAs (Figure 2C).
This statistic test computed the significance of common
miRNAs for each RNA pairs. The probability P was calculated
according to:
NX
P � 1 F餘XY 1jN; NX ; NY � � 1 NX
XY 1
t�
t
!
N NX
NY t
!
N
!
:
NY
Where N is the number of all miRNAs of human genome, NX
and NY represent the total number of miRNAs that regulate RNA
X and Y, respectively, and NXY is the number of miRNAs shared
between RNA X and Y. All P-values were subject to false discovery rate (FDR) correction and RNAs were ranked based on the
FDR values.
Hypergeometric test combined with coexpression-based
prediction-HyperC
Next, to discover the active ceRNA-ceRNA regulatory pairs in a
specific context, the commonly used method is to using the
coexpression principle to filter the ceRNA-ceRNA regulation
identified based on the above two global methods [31, 46]. This
method integrated context-specific gene expression profile data
sets. The Pearson correlation coefficient (R) of each candidate
ceRNA regulatory pairs identified was calculated as:
Identifying RNA-RNA crosstalk
|
5
Figure 2. The flowchart of methods for identifying miRNA-mediated ceRNA interactions. (A) The flowchart for identifying global and context-specific ceRNA regulation.
(B) The flowchart for ration-based method. (C) The flowchart for hypergeometric test-based method. (D) The flowchart for combination of hypergeometric test and
coexpression method. (E) The flowchart for SC-based method. (F) The flowchart of CMI-based methods.
n
P
摒yi y
�
饃i x
R � s?????????????????????????s????????????????????????? :
n
n
P
P
� �饃i x
饄i y
i�
i�
i�
Where xi and yi are the expression levels of RNA X and RNA
and y
are the average expression levels of
Y in sample i, and x
RNA X and Y across all tumor samples. In general, genes were
first ranked by the P-value of hypergeometric test and the correlation coefficient, separately. The average rank of each gene
was calculated to rank the candidate genes (Figure 2D).
SC-based prediction
Besides the genome-wide gene expression profiles, miRNA
expression profiles were also integrated into the procedure to
identify the ceRNA regulation in complex diseases. The common method is to identify high correlated RNA pairs in which
the correlation is because of the presence of one or more
miRNAs (Figure 2E). Paola et al. [34] had proposed SC and identified a sponge interaction network between lncRNAs and
mRNAs in human breast cancer. Zhang et al. [35] systematically
characterized the lncRNA-related ceRNA interactions across 12
major cancer types based on this method. For a candidate pair
of RNA-A and RNA-B, given a co-regulated miRNA-M, the calculated formula was as follows:
RAB RAM RMB
RABjM � q?????????????????q?????????????????
1 R2AB 1 R2MB
Where, RAB , RAM and RMB represent the Pearson correlation
coefficient between RNA-A and RNA-B, RNA-A and miRNA-M,
RNA-B and miRNA-M, respectively. Then, the SC of miRNA-M,
which is referred as S, for the corresponding candidate ceRNA
pair is calculated as:
S � RAB RABjM :
To identify the significant correlation, a random background
distribution of the S was generated by calculating the score S of
randomly selected combinations of RNA-miRNA-RNA competing interaction.
CMI-based methods
Quantitatively identifying direct dependencies between RNAs is
an important step in identifying ceRNA interactions. CMI is
widely used to identify the RNA-RNA correlations, given the
value of miRNAs. Hermes is a widely used method, which predicts ceRNA interactions from expression profiles of candidate
RNAs and their common miRNA regulators using CMI
(Figure 2F) [47]. This method is similar to MINDY, which a computational method based on CMI estimator to identify modulators of transcription factor activity. Both these two methods rely
on the idea that CMI implies that a modulator expression M is
predictive of changes in regulatory activity of a regulator R to its
targets T. Specifically, Hermes used two distinct test to identify
the ceRNA-ceRNA pairs. First, the size of the common miRNAs,
pmiR 餞1; T2� � pmiR 餞1� \ pmiR 餞2�, is required to be statistically
significant relative to the two individual miRNA size, and this is
6
| Li et al.
Figure 3. Comparison of computational methods based on literature-curated ceRNAs. (A) Literature-curated PTEN-related ceRNAs. (B?F) The proportion of validated
ceRNAs of PTEN in top-rank candidate ceRNAs identified by different computational methods. (B) Ratio-based. (C) Hypergeometric test. (D) Hypergeometric test and
coexpression. (E) SC based. (F) CMI. Different colored lines represent different miRNA?target interaction prediction methods.
performed by Fisher?s exact test. Next, for each miRNAs k,
Hermes evaluates the statistical significance of the test:
I絤iRk TM > I絤iRk T:
Where the variables indicate the expression of the corresponding RNA species. P-values for each ceRNA-miRNA-ceRNA
triplet were computed using the random test where the candidate modulator?s expression is shuffled for 1000 times. The final
significance for all miRNAs is then computed by converting all
the individual P-values for each miRNA k. This is based on
Fisher?s method, where
v2 � 2
XN
k�
ln餻k �
Where N is the total number of miRNAs.
Comparison of computational methods
Although these computational methods have been demonstrated to be useful for identification of ceRNA crosstalk in complex diseases, it is difficult to evaluate the quality of these
methods. In addition, we are lack of knowledge of which
miRNA?targets are useful for this process. Thus, in this section,
we conduct an investigative study to compare the relative performance of five representative methods mentioned above
using the ceRNA regulation of PTEN (Figure 1C).
Data sets
PTEN-related ceRNAs. In a diverse set of human cancer types,
PTEN was found to be dysregulated. An in-depth understanding
of the underlying mechanisms by which PTEN expression is
modulated is crucial to achieve a comprehensive knowledge of
its biological roles. The competition between PTEN mRNA and
other RNAs mediated by miRNAs has emerged as one such
mechanism and has brought into focus. Here, we assembled a
list of PTEN-related ceRNAs by curating the available literature
(Figure 3A). In addition, we obtained the gene expression profiles for wild-type and PTEN overexpressed U87 cell lines. The
genes were ranked based on the fold change (FC) of expression.
The genes with different FCs (FC > 1.5 and FC > 2) were regarded
as ceRNAs of PTEN.
MiRNA?target regulation. Assembling the miRNA-regulation is
the first step to identify the ceRNA-ceRNA crosstalk in complex
diseases. Here, we assembled genome-wide miRNA-gene regulation by five commonly used methods, including TargetScan,
miRanda, PicTar, PITA and RNA22. Recently, several studies
have demonstrated that use of cross-linking and AGO immunoprecipitation coupled with high-throughput sequencing
could identify endogenous genome-wide interaction maps for
miRNAs [43, 48]. Thus, we also integrated the CLIP-Seq data sets
that are available in starBase V2 [49]. In addition, we considered
the Ensemble miRNA?target regulation that were predicted by
at least one to five computational methods. Because there is no
miRNAs target PTEN in Ensemble-five, in total, nine sets of
miRNA?target regulation were considered here (Table 2).
Identifying RNA-RNA crosstalk
|
7
Table 2. The miRNA?gene regulation for 10 computational methods integrated with CLIP-seq data
Methods
TargetScan
miRanda
PicTar
PITA
RNA22
Ensemble-one
Ensemble-two
Ensemble-three
Ensemble-four
Ensemble-five
Regulation
mRNA
miRNA
PTEN-miRs
Gene2
Gene3
Gene5
98 829
297 873
82 983
63 096
88 480
423 975
117 441
60 038
26 803
3004
7160
12 294
6653
5805
8777
13 801
8410
5830
3864
1100
277
249
273
249
311
386
277
249
244
168
56
88
73
60
3
104
74
56
46
0
3944
9797
3750
3330
87
11 220
5163
3128
1814
0
3611
8961
3297
2752
6
10 328
4510
2617
1386
0
2655
7717
2514
1957
0
8989
3478
1861
941
0
Note: PTEN-miRs, the number of miRNAs regulated PTEN; Gene2, the number of genes that coregulated by at least two miRNAs with PTEN; Gene3, the number of genes
that coregulated by at least three miRNAs with PTEN; Gene5, the number of genes that coregulated by at least five miRNAs with PTEN; Ensemble-one to Ensemble-five,
the miRNA-regulation predicted by at least one to five computational methods.
Gene and miRNA expression of glioma. Genome-wide miRNA
and gene expression of 541 glioma samples were obtained from
the TCGA project [50]. In total, there are 12 042 genes and 470
miRNAs in the profiles.
Comparison based on literature-curated PTEN-related
ceRNAs
We first compared the five proposed computational methodbased 29 literature-curated ceRNA regulation of PTEN
(Figure 3A). In addition, eight methods (except RNA22) for identification of miRNA?gene regulation were considered in this
process. To exclude the bias of the number of predictions for
different methods, all candidate genes were ranked and we calculated the proportion of literature-curated PTEN ceRNAs in
top-ranked genes (Figure 3B?F). We found that for all five computational methods, the number of recalled ceRNAs was different. Globally, the prediction power was higher when the
ensemble miRNA?gene regulation was used. The performance
was the highest when using the miRNA?gene regulation predicted by at least four methods (Figure 3B?F). This is consistent
in five ceRNA prediction methods. This best performance might
be explained by the fact that ensemble method can obtain more
functionally enriched targets. In addition, we found that the
methods that integrated expression profiles (HyperC, SC and
CMI) had higher performance over the global predication ones
(ratio and HyperT) to identify the ceRNA regulation. Specifically,
we found that HyperC and CMI showed higher performance.
Although with similar performance, we found that the HyperC
is easier to understand for biological researchers, and the computational resource used by HyperC is much less than CMI.
These observations indicate that integration of the miRNA and
gene expression might identify the context-specific miRNA?
gene regulation, which further increase the performance for
identification of functional ceRNA regulation.
Comparison based on overexpression of PTEN
In addition to the literature-curated PTEN-related ceRNAs, we
evaluated the performance of these computational methods
based on a PTEN-overexpressed microarray analysis. Compared
with PTEN wild-type cell line, 84% of the literature-curated
ceRNAs showed increased expression when reintroduced PTEN
into the cell (Figure 4A). This observation validated the coexpression principle of ceRNA regulation and also indicated that
the genes with increased expression were likely to be PTEN-
related ceRNAs. Thus, we next used the genes with 1.5-fold
increased expression as a set of PTEN-related ceRNAs to evaluate the performance of these computational methods. By calculating the proportion of ceRNAs in top-ranked genes for each
method, we found that these methods reach the best performance when using the ensemble-four-method-retrieved miRNA?
gene regulation (Figure 4B?F). However, using the ensembleone-based miRNA?gene regulation, all of these methods show
the poorest performance. This might be because of the high
false positive of miRNA?gene regulation when using the union
set of different methods. These observations suggest that it is
critical to select the suitable miRNA?gene regulation for identifying the ceRNA regulation. In addition, when using miRNA?
gene regulation retrieved from the individual method, we found
that PicTar showed higher performance than other methods.
Similarly, the computational methods integrated with miRNA
or gene expression data also increased the performance for
identifying the ceRNA regulation. In addition, we also compared
these methods based on genes >2-fold in PTEN overexpression
data. Similar results were obtained (Figure 5A?E), and suggested
that it is better to identify the ceRNA regulation based on
ensemble-four-method and integration of miRNA and mRNA
expression profiles.
Overlap of different computational methods
The number of PTEN-related ceRNAs identified by the five
methods is different. Next, we compared the results of the five
ceRNA prediction methods based on the ensemble miRNA?gene
regulation that were predicted at least four-target prediction
methods. For the ratio-based and HyperC method, we obtained
the top-ranked 300 genes as candidate ceRNAs of PTEN. On the
other hand, we retrieved the ceRNAs with FDR <0.05 as the candidate ceRNAs for the HyperT, SC and CMI methods. We found
that only a small fraction of candidate ceRNAs were shared by
at least four methods, and no candidate ceRNAs were identified
by all five methods (Figure 6A). Specifically, the SC method identified few ceRNA candidates and the majority of these were not
covered by other methods. The ceRNA candidates identified by
ratio-based and HyperC were all covered by at least one of the
other methods. These observations imply that different computational methods may have their own merits.
As these computational methods might capture different
aspects of ceRNA regulation, we next explore whether the identified candidate ceRNAs were involved in same or similar
8
| Li et al.
Figure 4. Comparison of computational methods based on gene expression perturbation. (A) Literature-curated PTEN-related ceRNAs were highly expressed in PTEN
overexpressed cell line. (B?F) The proportion of validated ceRNAs of PTEN in top-rank candidate ceRNAs identified by different computational methods. (B) Ratio based.
(C) Hypergeometric test. (D) Hypergeometric test and coexpression. (E) SC based. (F) CMI. Different colored lines represent different miRNA?target interaction prediction
methods. This figure is based on 1.5-fold changes.
biological function. As there are few candidate ceRNAs identified by SC method, we compared the enriched functions of the
other four methods. Functional enrichment analysis revealed
that these ceRNA candidates play critical roles in cancer, and
several pathways were shared by more than two methods, such
as ?pathway in cancer?, ?endocytosis? and ?MAPK signaling pathway? (Figure 6B). In addition, we also identified several biological
processes shared by different methods, such as ?regulation of
protein serine/threonine kinase activity? and ?regulation of cell
cycle? were shared by all four methods (Figure 6C). These results
indicated that these methods were complementary with each
other, and it is best to integrate the results of different methods
to identify the ceRNA regulation in human complex diseases.
Discussions and future directions
With the past decades, miRNA-mediated ceRNA crosstalk has
been found to be involved in many diseases including cancer.
MiRNA-miRNA crosstalk and ceRNA-ceRNA crosstalk are changing our understanding the mechanisms of cancer [19]. In this
article, we have reviewed the recent developed computational
methods for identification of miRNA-mediated ceRNA interactions. Through the increasing application of high-throughput
sequencing data sets, ceRNA regulation continues to be discovered. However, the gap between identified and functionally
characterized ceRNA regulation remains considerably large.
One of the challenges for identification of miRNA-mediated
ceRNA regulation is the accuracy of the miRNA?gene regulation.
Different miRNA?target prediction methods only considered a
set of possible targets for miRNAs, which also included more
false-positive miRNA targets. Our analysis indicated that it is
better to integrate different miRNA?target prediction methods.
Specifically, the ensemble-four method gets the best performance in ceRNA prediction. In addition, integration of the context
miRNA-mRNA expression profiles increased the performance.
This suggested that context-specific miRNA?gene regulation is
useful in identifying the miRNA-mediated ceRNA crosstalk.
However, this might be challenged by the small number of samples with paired miRNA and gene expression profiles, especially
when we considered multiple types of RNAs.
As the research into miRNA-mediated ceRNA regulation has
just emerged, there is no gold standard positive ceRNA regulation to validate these proposed computational predictions.
Here, based on literature-curated PTEN-related ceRNAs and
PTEN-overexpression data sets, we evaluated these methods.
However, our evaluations in the case study are not sufficient for
drawing definite conclusions about the performances of these
methods. Our comparison results suggest that these methods
may have their own merits, and they capture the ceRNA candidates involved in similar functions. We suggest that it is better
to use them complementarily, e.g. by combining them to
develop an ensemble method [51].
Aside from the observations that ceRNA activities were
affected by the relative abundance of miRNA and RNAs, the
other mechanisms that regulated ceRNA interactions remain
poorly understood. Because miRNAs mainly bind to the 30 -UTR
of target RNAs to perform their functions, and 30 -UTR length
is observed to be highly regulated in various types of cancer
[52]. It is conceivable that ceRNA interaction could be altered
in cancer. However, there are limited studies to investigate the
Identifying RNA-RNA crosstalk
|
9
Figure 5. Comparison of computational methods based on gene expression perturbation. (A?E) The proportion of validated ceRNAs of PTEN in top-rank candidate
ceRNAs identified by different computational methods. (A) Ratio based. (B) Hypergeometric test. (C) Hypergeometric test and coexpression. (D) SC based. (E) CMI.
Different colored lines represent different miRNA?target interaction prediction methods. This figure is based on 2-fold changes.
30 -UTR-mediated ceRNA rewiring in cancer. In addition, genetic
variants have been observed to widely perturb miRNA regulation, which therefore further changed the dynamic equilibrium
of ceRNA regulation. Recently, one study identified a large number of genetic variants that are associated with ceRNA function
[53], which were termed as ?cerQTL?. This study suggests that
another function aspect of noncoding regulatory variants.
Besides the miRNA regulation, other posttranscriptional mechanisms (such as RNA editing and RNA-binding protein regulation) have been focused by recent studies. RNA editing could
influence miRNA target regulation [54] and thus influence
ceRNA interaction. RNA-binding protein (RBP) might compete
for binding sites of miRNAs [55], which could also affect the
ceRNA interactions. However, these types of regulation were
not considered in current methods for identifying ceRNA interactions. Moreover, intra-tumor heterogeneity is critical for
development effective methods for therapy. So far, the majority
of ceRNA studies have been performed at a cell-population
level. With the development of single-cell techniques [56], it
may be able to shed light to the contribution of ceRNA regulation in tumor heterogeneity.
Furthermore, although most of the ceRNA interactions identified so far are between binary RNA partners, increasing evidence has indicated that ceRNA crosstalk are formed as large
interconnected networks. In addition to direct interactions
through shared miRNAs, secondary indirect interactions have
been shown to contribute ceRNA regulation [45]. In addition,
ceRNA regulation and transcription regulation have been
shown to be tightly coupled, adding the complexity of ceRNA
crosstalk [18]. Evidence has also demonstrated that integration
of protein?protein interaction information can help to understand how miRNA sponges influence the downstream biological
processes [57]. However, how to integrate these functional
information into the process for identification of ceRNA interaction remain poorly understood.
In summary, analysis of ceRNA interactions and crosstalk in
intertwined networks may represent a robust platform for
understanding miRNA regulation in human complex diseases.
Here, we proposed that the application of computational techniques provides valuable functional and mechanistic insight into
miRNA-mediated ceRNA regulation. There are both opportunities and challenges for developing computational methods for
identification miRNA-mediated ceRNA crosstalk in future
studies.
Key Points
? MiRNA-mediated ceRNA regulation plays critical roles
in complex diseases.
? Computational methods for identification of RNA-RNA
crosstalk were reviewed.
? Integration of different miRNA target identification
methods and context-specific expression facilitates
identification of ceRNA regulation.
? Different computational methods are complementary
for identifying ceRNAs involved in similar biological
functions.
10
|
Li et al.
Figure 6. Overlap of candidate ceRNAs identified by different computational methods. (A) The overlap of ceRNAs for different methods. (B) and (C) The pathways and
biological processes enriched by candidate ceRNAs identified by ratio-based, hypergeometric test, hypergeometric test and coexpression and CMI methods. The size of
the circles is corresponding to the proportion of candidate ceRNAs, and the colors represent different P-values.
Fundings
The National Natural Science Foundation of China (grant numbers 31571331 and 61502126), the China Postdoctoral Science
Foundation (grant numbers 2016T90309, 2015M571436 and LBHZ14134), Natural Science Foundation of Heilongjiang Province
(grant number QC2015020), Weihan Yu Youth Science Fund
Project of Harbin Medical University, Harbin Special Funds of
Innovative Talents on Science and Technology Research Project
(grant number 2015RAQXJ091).
References
1. Esquela-Kerscher A, Slack FJ. Oncomirs?microRNAs with a
role in cancer. Nat Rev Cancer 2006;6:259?69.
2. Pasquinelli AE. MicroRNAs and their targets: recognition, regulation and an emerging reciprocal relationship. Nat Rev
Genet 2012;13:271?82.
3. Jonas S, Izaurralde E. Towards a molecular understanding of
microRNA-mediated gene silencing. Nat Rev Genet 2015;16:
421?33.
4. Ozsolak F, Milos PM. RNA sequencing: advances, challenges
and opportunities. Nat Rev Genet 2011;12:87?98.
5. Cancer Genome Atlas Research Network, Weinstein JN,
Collisson EA, et al. The cancer genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45:1113?20.
6. Zhang J, Baran J, Cros A, et al. International cancer genome
consortium data portal?a one-stop shop for cancer genomics
data. Database 2011;2011:bar026.
7. Friedman RC, Farh KK, Burge CB, et al. Most mammalian
mRNAs are conserved targets of microRNAs. Genome Res 2009;
19:92?105.
8. Bajan S, Hutvagner G. Regulation of miRNA processing and
miRNA mediated gene repression in cancer. Microrna 2014;3:
10?17.
9. Xu J, Li CX, Li YS, et al. MiRNA-miRNA synergistic network:
construction via co-regulating functional modules and disease miRNA topological features. Nucleic Acids Res 2011;39:
825?36.
10. Bracken CP, Scott HS, Goodall GJ. A network-biology perspective of microRNA function and dysfunction in cancer. Nat Rev
Genet 2016;17:719?32.
11. Li Y, Xu J, Chen H, et al. Comprehensive analysis of the functional microRNA-mRNA regulatory network identifies miRNA
signatures associated with glioma malignant progression.
Nucleic Acids Res 2013;41:e203.
Identifying RNA-RNA crosstalk
12. Gosline SJ, Gurtan AM, JnBaptiste CK, et al. Elucidating
MicroRNA regulatory networks using transcriptional, posttranscriptional, and histone modification measurements. Cell
Rep 2016;14:310?19.
13. Olena AF, Patton JG. Genomic organization of microRNAs.
J Cell Physiol 2010;222:540?5.
14. Wang Y, Luo J, Zhang H, et al. microRNAs in the same clusters
evolve to coordinately regulate functionally related genes.
Mol Biol Evol 2016;33:2232?47.
15. Li Y, Li S, Chen J, et al. Comparative epigenetic analyses reveal
distinct patterns of oncogenic pathways activation in breast
cancer subtypes. Hum Mol Genet 2014;23:5378?93.
16. Xu J, Shao T, Ding N, et al. miRNA-miRNA crosstalk: from
genomics to phenomics. Brief Bioinform 2016, pii: bbw073.
[Epub ahead of print].
17. Meng X, Wang J, Yuan C, et al. CancerNet: a database for
decoding multilevel molecular interactions across diverse
cancer types. Oncogenesis 2015;4:e177.
18. Karreth FA, Pandolfi PP. ceRNA cross-talk in cancer: when cebling rivalries go awry. Cancer Discov 2013;3:1113?21.
19. Wang Y, Hou J, He D, et al. The emerging function and mechanism of ceRNAs in cancer. Trends Genet 2016;32:211?24.
20. Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the
Rosetta Stone of a hidden RNA language? Cell 2011;146:353?8.
21. Chen J, Xu J, Li Y, et al. Competing endogenous RNA network
analysis identifies critical genes among the different breast
cancer subtypes. Oncotarget 2017;8:10171?84.
22. Xu J, Feng L, Han Z, et al. Extensive ceRNA-ceRNA interaction
networks mediated by miRNAs regulate development in multiple rhesus tissues. Nucleic Acids Res 2016;44:9438?51.
23. Yu G, Yao W, Gumireddy K, et al. Pseudogene PTENP1 functions as a competing endogenous RNA to suppress clear-cell
renal cell carcinoma progression. Mol Cancer Ther 2014;13:
3086?97.
24. Cesana M, Cacchiarelli D, Legnini I, et al. A long noncoding
RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell 2011;147:358?69.
25. Fu Z, Li G, Li Z, et al. Endogenous miRNA Sponge LincRNAROR promotes proliferation, invasion and stem cell-like phenotype of pancreatic cancer cells. Cell Death Discov 2017;3:
17004.
26. Qi X, Zhang DH, Wu N, et al. ceRNA in cancer: possible functions and clinical implications. J Med Genet 2015;52:710?18.
27. Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of
ceRNA crosstalk and competition. Nature 2014;505:344?52.
28. Xu J, Li Y, Lu J, et al. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic
Acids Res 2015;43:8169?82.
29. Wang P, Ning S, Zhang Y, et al. Identification of lncRNAassociated competing triplets reveals global patterns and
prognostic markers for cancer. Nucleic Acids Res 2015;43:
3478?89.
30. Chiu HS, Llobet-Navas D, Yang X, et al. Cupid: simultaneous
reconstruction of microRNA-target and ceRNA networks.
Genome Res 2015;25:257?67.
31. Zhou X, Liu J, Wang W. Construction and investigation of
breast-cancer-specific ceRNA network based on the mRNA
and miRNA expression data. IET Syst Biol 2014;8:96?103.
32. Le TD, Zhang J, Liu L, et al. Computational methods for identifying miRNA sponge interactions. Brief Bioinform 2017;18:
577?90.
33. Shao T, Wu A, Chen J, et al. Identification of module biomarkers from the dysregulated ceRNA-ceRNA interaction
|
11
network in lung adenocarcinoma. Mol Biosyst 2015;11:
3048?58.
34. Paci P, Colombo T, Farina L. Computational analysis identifies
a sponge interaction network between long non-coding RNAs
and messenger RNAs in human breast cancer. BMC Syst Biol
2014;8:83.
35. Zhang Y, Xu Y, Feng L, et al. Comprehensive characterization
of lncRNA-mRNA related ceRNA network across 12 major
cancers. Oncotarget 2016;7:64148?67.
36. Wang JB, Liu FH, Chen JH, et al. Identifying survivalassociated modules from the dysregulated triplet network in
glioblastoma multiforme. J Cancer Res Clin Oncol 2017;143:
661?71.
37. Agarwal V, Bell GW, Nam JW, et al. Predicting effective
microRNA target sites in mammalian mRNAs. eLife 2015;4:
e05005.
38. Betel D, Koppal A, Agius P, et al. Comprehensive modeling of
microRNA targets predicts functional non-conserved and
non-canonical sites. Genome Biol 2010;11:R90.
39. Krek A, Grun D, Poy MN, et al. Combinatorial microRNA target
predictions. Nat Genet 2005;37:495?500.
40. Kertesz M, Iovino N, Unnerstall U, et al. The role of site accessibility in microRNA target recognition. Nat Genet 2007;39:
1278?84.
41. Miranda KC, Huynh T, Tay Y, et al. A pattern-based method
for the identification of MicroRNA binding sites and their corresponding heteroduplexes. Cell 2006;126:1203?17.
42. Clark PM, Loher P, Quann K, et al. Argonaute CLIP-seq reveals
miRNA targetome diversity across tissue types. Sci Rep 2014;4:
5947.
43. Moore MJ, Zhang C, Gantman EC, et al. Mapping Argonaute
and conventional RNA-binding protein interactions with
RNA at single-nucleotide resolution using HITS-CLIP and
CIMS analysis. Nat Protoc 2014;9:263?93.
44. Friedersdorf MB, Keene JD. Advancing the functional utility of
PAR-CLIP by quantifying background binding to mRNAs and
lncRNAs. Genome Biol 2014;15:R2.
45. Ala U, Karreth FA, Bosia C, et al. Integrated transcriptional
and competitive endogenous RNA networks are crossregulated in permissive molecular environments. Proc Natl
Acad Sci USA 2013;110:7154?9.
46. Chiu YC, Hsiao TH, Chen Y, et al. Parameter optimization for
constructing competing endogenous RNA regulatory network
in glioblastoma multiforme and other cancers. BMC Genomics
2015;16(Suppl 4):S1.
47. Sumazin P, Yang X, Chiu HS, et al. An extensive microRNAmediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell 2011;147:
370?81.
48. Yang JH, Li JH, Shao P, et al. starBase: a database for exploring
microRNA-mRNA interaction maps from Argonaute CLIPSeq and Degradome-seq data. Nucleic Acids Res 2011;39:
D202?9.
49. Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNAceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res 2014;
42:D92?7.
50. Cancer Genome Atlas Research Network. Comprehensive
genomic characterization defines human glioblastoma genes
and core pathways. Nature 2008;455:1061?8.
51. Marbach D, Costello JC, Kuffner R, et al. Wisdom of crowds for
robust gene network inference. Nat Methods 2012;9:796?804.
12
|
Li et al.
52. Mayr C, Bartel DP. Widespread shortening of 3?UTRs by alternative cleavage and polyadenylation activates oncogenes in
cancer cells. Cell 2009;138:673?84.
53. Li MJ, Zhang J, Liang Q, et al. Exploring genetic associations
with ceRNA regulation in the human genome. Nucleic Acids
Res 2017;45:5653?65.
54. Wang Y, Xu X, Yu S, et al. Systematic characterization of A-toI RNA editing hotspots in microRNAs across human cancers.
Genome Res 2017;27:1112?25.
55. Treiber T, Treiber N, Plessmann U, et al. A compendium of
RNA-binding proteins that regulate MicroRNA biogenesis.
Mol Cell 2017;66:270?84.e213.
56. Ramskold D, Luo S, Wang YC, et al. Full-length mRNA-Seq
from single-cell levels of RNA and individual circulating
tumor cells. Nat Biotechnol 2012;30:777?82.
57. Zhang J, Le TD, Liu L, et al. Inferring miRNA sponge coregulation of protein-protein interactions in human breast
cancer. BMC Bioinformatics 2017;18:243.
Документ
Категория
Без категории
Просмотров
3
Размер файла
1 397 Кб
Теги
bib, 2fbbx137
1/--страниц
Пожаловаться на содержимое документа