Multidimensional Index Research Based on Two Division Methods (两种划分模式下多维索引的研究)
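【Author】 Duan Xiongwen (段雄文);
【Advisor】 Wu Hengshan (吴恒山);
【Author information】 Huazhong University of Science and Technology, Computer Software and Theory, 2006, Master's thesis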
【Abstract】 With the continuing development of medicine, biotechnology, broadband networks, and geographic information systems, research on database systems that support multidimensional data management is steadily deepening. A multidimensional index is a method for indexing feature vectors in a multidimensional space. According to how the data are partitioned and organized, multidimensional index structures fall into two classes: those based on space partitioning and those based on data partitioning. As the core problem of multidimensional data processing, multidimensional indexing has long been an active research topic. The quadtree is a typical space-partitioning index: it is easy to build, fast to query, and simple to operate. However, the extent of the spatial objects must be known before the index is built, and the distribution of the objects strongly affects both query and storage efficiency. Theoretical analysis and practice show that encoding the leaf nodes reduces the number of quadtree nodes and improves storage efficiency. On this basis, a neighbor-finding algorithm for quadtrees based on leaf-node coding is designed. It combines the quadtree structure with leaf codes, which reduces the number of nodes that must be stored and improves storage efficiency; because neighbor finding at the leaf level is carried out with bit operations, query efficiency also improves to some extent. The Hilbert curve is a typical data-partitioning index with good spatial clustering and self-similarity. However, the traditional Hilbert curve supports encoding and conversion only in two and three dimensions, which severely limits its use as a multidimensional index. By studying the two- and three-dimensional Hilbert curves and their properties and summarizing how they evolve from one dimension to the next, an encoding of the N-dimensional Hilbert cell code is designed. This removes the dimensional restriction and lays a foundation for further research on N-dimensional Hilbert curves. Building on these results, an experimental platform for multidimensional indexes is designed and implemented. The platform is an open framework for testing the index structures described above; users can wrap any multidimensional index in a standard interface and add it to the framework dynamically.
【Key words】 multidimensional index; space division; data division; quadtree; Hilbert curve;
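- 【Online publication contributor】 Huazhong University of Science and Technology 【Online publication year and issue】 2008, Issue 03
- 【Classification code】 TP311.12
- 【Citation count】 2
- 【Download count】 172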
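
The abstract's idea of finding neighbors at the leaf level with bit operations can be illustrated with a small sketch. The record does not specify the thesis's leaf-coding scheme, so this example assumes a common substitute: Morton (Z-order) location codes, in which the x and y bits of a leaf cell are interleaved into one integer. The function names (`interleave`, `axis_masks`, `neighbor`) are hypothetical, and the sketch covers only same-size neighbors on a complete grid of 2^level by 2^level leaves. It is not the algorithm from the thesis.

```python
def interleave(x, y, level):
    """Assumed leaf code: Morton code with x in even bits, y in odd bits."""
    code = 0
    for i in range(level):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def axis_masks(level):
    """Return (x_mask, y_mask): masks selecting the x bits and the y bits."""
    x_mask = 0
    for i in range(level):
        x_mask |= 1 << (2 * i)
    return x_mask, x_mask << 1


def neighbor(code, level, dx, dy):
    """Code of the same-size neighbor at offset (dx, dy), each in {-1, 0, 1}.

    Returns None when the neighbor would fall outside the grid.
    The coordinates are never decoded; each axis is stepped directly
    inside the interleaved code.
    """
    x_mask, y_mask = axis_masks(level)
    xb = code & x_mask  # x bits only
    yb = code & y_mask  # y bits only

    if dx == 1:
        if xb == x_mask:  # already in the last column
            return None
        # Filling the y positions with 1s lets the carry skip over them.
        xb = ((xb | y_mask) + 1) & x_mask
    elif dx == -1:
        if xb == 0:  # already in the first column
            return None
        # The y positions are 0 here, so the borrow passes through them.
        xb = (xb - 1) & x_mask

    if dy == 1:
        if yb == y_mask:  # already in the last row
            return None
        # The lowest y bit sits at bit position 1, so the step is 2.
        yb = ((yb | x_mask) + 2) & y_mask
    elif dy == -1:
        if yb == 0:  # already in the first row
            return None
        yb = (yb - 2) & y_mask

    return xb | yb


if __name__ == "__main__":
    level = 3  # 8 x 8 grid of leaves
    here = interleave(3, 5, level)
    print(here, neighbor(here, level, 1, 0), neighbor(here, level, 0, -1))
```

Because each step is a handful of integer operations on the code itself, a neighbor lookup needs no walk up and down the tree, which is the kind of saving the abstract attributes to leaf-level bit operations. Neighbors of different sizes, which arise in an adaptively subdivided quadtree, would need additional handling that this sketch leaves out.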