The Research on Algorithm of Identifying miRNA Regulatory Modules Based on Multi-source Data
【Author】 Liu Bin;
【Supervisor】 Luo Jiawei;
【Author information】 Hunan University, Computer Science and Technology, 2016, Master's degree
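【Abstract (translated from the Chinese)】 With the rapid development of biotechnology, the emergence of large volumes of omics data has provided strong support for studying the functions of biological molecules. How to mine the information people need from these data has become an unprecedented challenge for researchers. Identifying miRNA regulatory modules by combining miRNA regulatory relationships with protein-protein interactions is of great significance for understanding combinatorial molecular effects in complex biological systems and for revealing the important miRNAs and target genes that lead to complex diseases. To address the problem that most current miRNA regulatory module identification algorithms require the number of modules to be set in advance, this thesis proposes MiRMD (miRNA-mRNA regulatory modules detection), a miRNA-mRNA regulatory module identification algorithm based on multiple types of data. The algorithm first builds a miRNA regulatory network by integrating expression profile data, binding site information, and other data; it then detects tightly connected core structures in this network, expands each core into a regulatory module by adding miRNAs and mRNAs that meet the corresponding requirements, and finally filters out modules with a high overlap rate. Experiments on three cancer datasets show that, compared with two other algorithms, the modules identified by MiRMD have better MiMEC (miRNA-mRNA expression correlation) values and more significant GO enrichment. MiRMD can also identify modules closely associated with cancer. MiRMD, however, mainly starts from modules consisting of a single miRNA and multiple mRNAs, detects core structures, and expands them with overlapping neighbors to obtain the final miRNA regulatory modules; it does not analyze the collective relationships between sets of miRNAs and sets of mRNAs. This thesis therefore proposes CGR (collective group relationships), a miRNA regulatory module identification algorithm based on the collective relationships between miRNA sets and mRNA sets. The algorithm first uses a LASSO model to integrate multiple data sources and build a weighted miRNA regulatory network, then builds a weighted miRNA synergy network from it and clusters the miRNAs into miRNA clusters; next it clusters the mRNAs on the protein network to form mRNA clusters; finally, using the regulatory relationships between miRNAs and mRNAs, it merges closely connected miRNA clusters and mRNA clusters to obtain the final miRNA regulatory modules. Experiments on three datasets show that the algorithm identifies better miRNA regulatory modules.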
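The last MiRMD step, filtering out modules with a high overlap rate, can be illustrated with a minimal sketch. The abstract does not say how overlap is measured or which threshold is used, so the overlap coefficient, the `max_overlap` value, the largest-first ordering, and all identifiers below are assumptions made for illustration only, not the thesis's actual procedure.

```python
from typing import FrozenSet, List, Tuple

# A module is a pair (set of miRNA names, set of mRNA names).
# Assumption: miRNA and mRNA names never collide, so their union is safe.
Module = Tuple[FrozenSet[str], FrozenSet[str]]


def overlap_rate(a: Module, b: Module) -> float:
    """Overlap coefficient |A & B| / min(|A|, |B|) over all nodes (assumed measure)."""
    nodes_a = a[0] | a[1]
    nodes_b = b[0] | b[1]
    smaller = min(len(nodes_a), len(nodes_b))
    if smaller == 0:
        return 0.0
    return len(nodes_a & nodes_b) / smaller


def filter_overlapping(modules: List[Module], max_overlap: float = 0.5) -> List[Module]:
    """Keep a module only if it overlaps every already-kept module by at most max_overlap."""
    # Visit larger modules first so that smaller near-duplicates are the ones dropped.
    ordered = sorted(modules, key=lambda m: len(m[0]) + len(m[1]), reverse=True)
    kept: List[Module] = []
    for candidate in ordered:
        if all(overlap_rate(candidate, k) <= max_overlap for k in kept):
            kept.append(candidate)
    return kept


# Toy input with hypothetical identifiers (not data from the thesis).
toy_modules: List[Module] = [
    (frozenset({"miR-a", "miR-b"}), frozenset({"g1", "g2", "g3"})),
    (frozenset({"miR-a"}), frozenset({"g1", "g2"})),   # contained in the first module
    (frozenset({"miR-c"}), frozenset({"g4", "g5"})),   # disjoint from the others
]
filtered = filter_overlapping(toy_modules)
print(len(filtered))  # 2: the contained module is removed
```

Processing larger modules first is one possible design choice; ranking by module density or by expression correlation instead would change which near-duplicate survives.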
【Abstract】 With the rapid development of biotechnology, a large amount of biological data has emerged, providing powerful support for studying the functions of biological molecules. How to use these data to extract valuable information has become an unprecedented challenge for researchers. Integrating miRNA-mRNA regulatory interactions and protein-protein interactions to identify miRNA-mRNA regulatory modules is of great significance for understanding combinatorial molecular effects in complex biological systems and for revealing the important miRNAs and target genes that cause complex diseases. Most existing methods for identifying miRNA-mRNA modules require the number of modules to be predefined. Therefore, this study presents a new algorithm called MiRMD (miRNA-mRNA regulatory modules detection) to identify miRNA-mRNA regulatory modules. First, a miRNA-mRNA regulatory network is constructed from miRNA/mRNA expression profiles and target site information; core structures are then detected in this network by merging cohesive modules. Next, overlapping neighbor nodes are added to the cores according to density. Finally, highly overlapping modules are filtered out. Experimental results on three cancer datasets show that the miRNA-mRNA regulatory modules identified by MiRMD are more coherent and more functionally enriched than those of two other methods, as measured by MiMEC (miRNA-mRNA expression correlation) and GO enrichment. In particular, the modules identified by MiRMD are strongly implicated in cancer. MiRMD starts from a module containing a single miRNA and some of its target mRNAs to detect core structures, and then forms the final miRNA regulatory modules through overlapping neighbor expansion; it does not, however, consider the collective relationships between a group of miRNAs and a group of mRNAs. A method called CGR (collective group relationships) is therefore proposed to discover miRNA-mRNA regulatory modules and reveal miRNA-mRNA regulatory relationships from heterogeneous expression data based on these collective relationships. First, a miRNA-miRNA synergy network is constructed according to the edge weights of the miRNA-mRNA regulatory network, and miRNA clusters are identified in the synergy network; next, mRNA clusters are identified in the protein-protein interaction network; finally, the miRNA clusters and mRNA clusters are merged into miRNA-mRNA regulatory modules according to the regulatory relationships. Experiments on three datasets show that CGR identifies better miRNA regulatory modules.
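The final CGR step, merging miRNA clusters and mRNA clusters according to their regulatory relationships, might look roughly like the sketch below. The abstract does not specify the merging criterion, so the mean absolute edge weight, the `min_strength` threshold, and every name in the code are hypothetical choices for illustration, not the method used in the thesis.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (miRNA, mRNA) pair in the weighted regulatory network
Module = Tuple[FrozenSet[str], FrozenSet[str]]


def link_strength(mirnas: Set[str], mrnas: Set[str], weights: Dict[Edge, float]) -> float:
    """Mean absolute regulatory weight over all miRNA-mRNA pairs (assumed criterion).

    Pairs with no edge in the network contribute 0.
    """
    if not mirnas or not mrnas:
        return 0.0
    total = sum(abs(weights.get((mi, m), 0.0)) for mi in mirnas for m in mrnas)
    return total / (len(mirnas) * len(mrnas))


def merge_clusters(
    mirna_clusters: Iterable[Set[str]],
    mrna_clusters: Iterable[Set[str]],
    weights: Dict[Edge, float],
    min_strength: float = 0.3,  # hypothetical threshold
) -> List[Module]:
    """Pair each miRNA cluster with each mRNA cluster it is strongly linked to."""
    mrna_list = list(mrna_clusters)
    modules: List[Module] = []
    for mi_cluster in mirna_clusters:
        for m_cluster in mrna_list:
            if link_strength(mi_cluster, m_cluster, weights) >= min_strength:
                modules.append((frozenset(mi_cluster), frozenset(m_cluster)))
    return modules


# Toy weighted network with hypothetical identifiers and weights.
toy_weights: Dict[Edge, float] = {
    ("miR-a", "g1"): -0.8, ("miR-a", "g2"): -0.6,
    ("miR-b", "g1"): -0.7, ("miR-b", "g2"): -0.5,
    ("miR-c", "g3"): -0.1,
}
toy_modules = merge_clusters(
    [{"miR-a", "miR-b"}, {"miR-c"}],
    [{"g1", "g2"}, {"g3"}],
    toy_weights,
)
print(toy_modules)
```

Absolute values are used here because regression-based weights for miRNA regulation are often negative; a signed criterion would be an equally plausible alternative.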
【Key words】 Regulatory Modules; Core Structures; Collective Group Relationships; Regulatory Network; Protein-protein Interaction;