GoPipe: Gene Ontology Annotation and Statistical Analysis of Batch Sequences (article in English)

GoPipe: Streamlined Gene Ontology Annotation for Batch Anonymous Sequences With Statistics

【Authors】 CHEN Zuo-Zhou 1,2), XUE Cheng-Hai 3), ZHU Sheng 1,2), ZHOU Feng-Feng 4), XUEFENG BRUCE LING 5), LIU Guo-Ping 6), CHEN Liang-Biao 2)**

【Institutions】 1) College of Life Science, Zhejiang University, Hangzhou 310029, China; 2) Institute of Genetics and Developmental Biology, The Chinese Academy of Sciences, Beijing 100080, China; 3) Institute of Automation, The Chinese Academy of Sciences, Beijing 100080, China; 4) Department of Computer Science, National High Performance Computing Center, University of Science and Technology of China, Hefei 230027, China; 5) Tularik Inc., 1120 Veterans Blvd, South San Francisco, CA 94080, USA; 6) School of Electronics, University of Glamorgan, Pontypridd CF37 1DL, UK

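【Abstract (Chinese version)】 With the arrival of the post-genomic era, batch sequencing, especially of ESTs, is becoming routine work in ordinary laboratories. These new sequences often need batch Gene Ontology (GO) annotation followed by statistical analysis. At present, however, no software other than Goblet is suited to batch GO annotation of unknown sequences, and Goblet still has many shortcomings because it limits upload size and uses only BLAST as its prediction tool. We developed a software package, GoPipe, which annotates sequences by integrating BLAST and InterProScan results and provides tools for further statistical comparison. The main program accepts any number of BLAST and InterProScan result files and performs, in order, text parsing, data integration, redundancy removal, statistical analysis, and display. Statistical tools are also provided to compare the GO distributions of different inputs and so uncover biological meaning. In addition, in intersection mode the program takes the intersection of the InterProScan and BLAST results; on the test dataset its specificity reached 99.1%, far exceeding that of InterProScan's own GO predictions, while sensitivity decreased only slightly. Its high specificity, speed, and flexibility make it an ideal tool for batch Gene Ontology annotation of unknown sequences. The package is freely available at http://gopipe.fishgenome.org/ or by contacting the authors.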

【Abstract】 The accelerating availability of new sequences, especially ESTs, calls for computational methods that link sequences with Gene Ontology (GO) terms in batch mode. There is currently no program for this purpose except Goblet, an online tool that uses BLAST to annotate query sequences with appropriate GO terms but restricts uploaded sequence files to less than 100 kilobytes. GoPipe is a standalone package that integrates BLAST and InterProScan results to obtain Gene Ontology annotation, with built-in statistical options. GoPipe takes any number of BLAST and/or InterProScan output files simultaneously and launches jobs sequentially to perform parsing, data integration, redundancy removal, calculation of GO distributions, and graphic display. A very high annotation specificity of 99.1% was achieved for a test dataset when the program was run in the "intersection" mode, which intersects the BLAST and InterProScan results, outperforming the specificity (81.1%) obtained from InterProScan alone. Statistical tools are also provided to compare GO distributions between different inputs, so that the GO distributions of different sets of batch sequences can be compared and differentially represented GO terms can be easily displayed. High specificity, speed, and flexibility make GoPipe an ideal tool for streamlined GO annotation of batch sequences. The package is freely available at http://gopipe.fishgenome.org/ or by contacting the authors.
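
The "intersection" mode and the GO distribution step described in the abstract can be illustrated with a minimal Python sketch. This is not GoPipe's code: the function names (`intersect_annotations`, `go_distribution`), the dictionary layout, and the toy predictions are all assumptions made for illustration, and the real package may also take the GO hierarchy into account when matching terms.

```python
from collections import Counter


def intersect_annotations(blast, interpro):
    """Keep, per sequence, only the GO terms predicted by both sources.

    Both arguments are hypothetical mappings: sequence ID -> set of GO IDs.
    """
    consensus = {}
    for seq_id in blast.keys() & interpro.keys():
        shared = blast[seq_id] & interpro[seq_id]
        if shared:  # drop sequences with no term supported by both tools
            consensus[seq_id] = shared
    return consensus


def go_distribution(annotations):
    """Count how many sequences carry each GO term."""
    counts = Counter()
    for terms in annotations.values():
        counts.update(terms)
    return counts


# Toy predictions, invented for this example only.
blast_hits = {
    "seq1": {"GO:0005524", "GO:0004672"},
    "seq2": {"GO:0003677"},
    "seq3": {"GO:0016020"},
}
interpro_hits = {
    "seq1": {"GO:0005524", "GO:0006468"},
    "seq2": {"GO:0003677", "GO:0006355"},
}

consensus = intersect_annotations(blast_hits, interpro_hits)
distribution = go_distribution(consensus)
```

In this sketch a term survives only if both tools agree on it. That is the intuition behind trading a little sensitivity for higher specificity: `seq3` is dropped entirely because only one source annotated it.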

【Key words】 Gene Ontology; functional genomics; EST; BLAST; InterProScan; GOA
【Funding】 Supported by the National Natural Science Foundation of China (30330080).
  • 【Source】 Progress in Biochemistry and Biophysics (生物化学与生物物理进展), 2005, Issue 02
  • 【Classification number】 Q7-3
  • 【Times cited】 36
  • 【Downloads】 656