Construction of a Neural Bioinformation Secondary Database
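【Author】 Wang Pan;
【Advisor】 Zhao Yuandi;
【Author information】 Huazhong University of Science and Technology, Biomedical Engineering, 2004, Master's thesis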
【Abstract】 The twenty-first century is the century of life science. Bioinformatics has developed at an unprecedented pace in recent years and has become a frontier and focus of life science research. In-depth research in neural molecular biology has generated massive amounts of data, making it necessary to collect and organize data on nervous-system-related proteins and their genes. To this end, a method for building secondary bioinformatics databases with a relatively high degree of automation is first proposed. Agent programs automatically retrieve information from public primary databases on the Internet, so that data in the secondary database can be collected and updated automatically. XML is adopted as the description standard for protein and nucleic acid sequence data: the retrieved Web information is stored in XML as an intermediate format, then parsed and submitted to the secondary database and converted into HTML for Web publication. This simplifies machine parsing of semantics and effectively ensures the integrity of the stored information. It helps secondary database developers quickly find the data they actually need among massive information sources, use it flexibly, and devote more effort to bioinformation processing itself. Based on this method, data on the proteins involved in the composition and functional activity of the nervous system, and on their genes, were collected, classified, and organized to build a concise, specialized Neural Bioinformation Secondary Database. The database consists of 12 tables and includes protein and nucleic acid sequence information, protein structure information, and common abbreviations in neural molecular biology. The system runs on a LangChao TS10000 high-performance cluster server, uses Oracle 9i as the back-end database management system, is developed with JSP, JavaBeans, and related technologies, and is published on the Web. The database provides a good research platform for studying the biochemical and molecular-biological characteristics of the nervous system and the pathogenesis of related diseases, and lays a foundation for further bioinformatics research such as visualization of nervous-system-related genome data.
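The XML-as-intermediate-format step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the original system used JSP, JavaBeans, and Oracle 9i, whereas this sketch uses Python with an in-memory SQLite table as a stand-in. The element names (`entry`, `accession`, `name`, `molecule_type`, `sequence`), the sample record, and the function names are all hypothetical. The sketch parses one XML record, rejects it if a required field is missing (a simple integrity check), stores it, and renders it as HTML.

```python
import html
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical intermediate record; the element names and values are
# illustrative only and are not the schema used in the thesis.
SAMPLE_XML = """<entry>
  <accession>X00001</accession>
  <name>Example neural protein</name>
  <molecule_type>protein</molecule_type>
  <sequence>MKTAYIAKQR</sequence>
</entry>"""

REQUIRED_FIELDS = ("accession", "name", "molecule_type", "sequence")


def parse_entry(xml_text):
    """Parse one XML record into a dict, rejecting incomplete records."""
    root = ET.fromstring(xml_text)
    record = {}
    for field in REQUIRED_FIELDS:
        value = root.findtext(field)
        if value is None or not value.strip():
            raise ValueError(f"missing field: {field}")
        record[field] = value.strip()
    return record


def store_entry(conn, record):
    """Store the record locally (SQLite stands in for Oracle 9i here)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "accession TEXT PRIMARY KEY, name TEXT, "
        "molecule_type TEXT, sequence TEXT)"
    )
    # INSERT OR REPLACE lets a re-fetched record overwrite the stored copy,
    # which mimics an automatic update from the primary database.
    conn.execute(
        "INSERT OR REPLACE INTO entries "
        "VALUES (:accession, :name, :molecule_type, :sequence)",
        record,
    )
    conn.commit()


def render_html(record):
    """Render the record as an HTML table fragment for Web publication."""
    rows = "".join(
        f"<tr><th>{html.escape(key)}</th><td>{html.escape(value)}</td></tr>"
        for key, value in record.items()
    )
    return f"<table>{rows}</table>"


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    parsed = parse_entry(SAMPLE_XML)
    store_entry(connection, parsed)
    print(render_html(parsed))
```

Keeping XML between retrieval and storage means the same parsed record feeds both the database insert and the HTML view, so a record that fails the field check never reaches either.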
【Key words】 bioinformatics; neural system; secondary database; agent program; XML;
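- 【Online publication contributor】 Huazhong University of Science and Technology 【Online publication year/issue】 2005, Issue 02
- 【Classification number】 TP311.13
- 【Times cited】 2
- 【Downloads】 303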