
Adaptive Sequence Length Deep Method for Predicting RNA Secondary Structure


【Authors】 WU Hong-jie; TANG Ye; LU Wei-zhong; CUI Zhi-ming; FU Bao-chuan; GAO Zhen

【Corresponding author】 LU Wei-zhong

【Affiliations】 School of Electronic and Information Engineering, Suzhou University of Science and Technology; Institute of Intelligent Information Processing and Application, Soochow University; School of Engineering Technology, McMaster University

【Abstract】 RNA secondary structure prediction is an important problem in structural bioinformatics. Predicting secondary structures that contain pseudoknots is harder still because of the complexity of pseudoknot structures. Traditional machine learning methods are constrained by the structure of the learning model, so the number of input features must be fixed. Most methods therefore truncate sequences of different lengths to a uniform length before training, which both discards useful information and destroys the integrity of the biological sequence. To address this issue, an adaptive LSTM deep recurrent neural network model that fits variable sequence lengths is proposed, together with a sequence-length adaptive module and a training algorithm, so that no truncation is needed. Because the sample proportions are imbalanced in practice, a dynamic weighting method is used to mitigate the imbalance. Comparative experiments against four established methods were then conducted on the benchmark data set RNA STRAND. The results show that the accuracy and Matthews correlation coefficient of the proposed method are 1.6% and 3.3% higher, respectively, than those of the fixed-length LSTM method, and 13.6% and 14.8% higher, respectively, than those of the other four typical methods.
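The two evaluation measures named in the abstract, accuracy and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch. It assumes binary per-nucleotide labels (1 = paired, 0 = unpaired), which the abstract does not specify. The function `inverse_frequency_weights` is a hypothetical stand-in showing one common way to weight imbalanced classes; it is not the paper's dynamic weighting method, whose details are not given here.

```python
import math
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = paired, 0 = unpaired)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of positions whose predicted label matches the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def inverse_frequency_weights(labels):
    """Hypothetical weighting: n / (k * count_c) per class, recomputed per sequence."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


# Illustrative labels only; a sequence of any length can be scored without truncation.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
acc = accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
weights = inverse_frequency_weights(y_true)
```

MCC is often reported alongside accuracy for imbalanced labels because it uses all four cells of the confusion matrix, so a model that predicts only the majority class scores near zero even when its accuracy looks high.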

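【Funding】 Supported by the National Natural Science Foundation of China (61772357, 61876217, 61672371, 61502329); the Jiangsu Province 333 Talent Project; the Jiangsu Province Six Talent Peaks Project (DZXX-010); and Suzhou Science and Technology Projects (SYG201704, SNG201610)
  • 【Source】 Journal of Chinese Computer Systems (小型微型计算机系统), 2019, Issue 08
  • 【Classification codes】 TP183; Q522; Q811.4
  • 【Times cited】 1
  • 【Downloads】 169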