自维护数据仓库的研究和设计

Research and Design on Self-Maintainable Data Warehouse

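【Author】 张本宏 (Zhang Benhong)

【Advisor】 孙家启 (Sun Jiaqi)

【Author information】 合肥工业大学 (Hefei University of Technology), Computer Software and Theory, 2002, Master's thesis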

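【Abstract (translated from the Chinese original)】 To answer queries quickly over data aggregated from one or more sources, a data warehouse usually stores materialized views. When the base data change, the views must be maintained so that the two stay consistent. Two maintenance methods are common, incremental maintenance and recomputation, and both need to query the data sources. In a data warehouse environment the sources may be distributed or heterogeneous, so such queries are usually difficult and inefficient, and sometimes the source data are not available at all; it is therefore desirable to make views self-maintainable. Because most views are not self-maintainable, we use auxiliary views to achieve self-maintenance. Before presenting the self-maintenance algorithm, this thesis first discusses the self-maintainability of basic views and then analyzes the structure of materialized views using a binary representation tree. On this basis it gives a view self-maintenance algorithm: 1. determine the auxiliary views top-down; 2. compute the update values of each node bottom-up; 3. use the update values to update every auxiliary view and the materialized view. Chapter 4 studies several key issues of a self-maintainable data warehouse, including the handling of duplicate (bag) semantics, distributed and parallel processing, data backup and recovery, and security control. Notably, for data recovery we use a method different from the traditional log file, which can substantially reduce auxiliary space and improve maintenance efficiency. In the final chapter we implement a small data warehouse system based on the self-maintenance algorithm. The conclusion summarizes the thesis and points out directions for further research.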

【Abstract】 A data warehouse stores materialized views over data from one or more sources to provide fast access to integrated data. These views need to be maintained in response to updates to the source data. This is usually done by incremental techniques or by recomputation from scratch, both of which must access data from the underlying sources. In the data warehouse scenario, however, accessing the sources can be difficult, inefficient, and expensive, and the data may sometimes be unavailable, because the sources are distributed or heterogeneous. For these reasons, the problem of materialized view self-maintenance has received increasing attention. Not all views defined by relational and grouping operations are self-maintainable, however, so auxiliary relations must be stored in the data warehouse to make the views self-maintainable. In this thesis, I first study the self-maintainability of simple views that use only one operation. I then propose an efficient self-maintenance algorithm based on an analysis of the binary representation tree of a view. The algorithm has three steps: 1. determine the auxiliary relations top-down, according to the self-maintainability of simple views; 2. compute the updates of each node bottom-up; 3. update the materialized views and the auxiliary relations. In Chapter 4, I study several key issues, including bag (duplicate) semantics, distributed and parallel computing, data backup and recovery, and security control. In particular, for data recovery I use a method different from the traditional log file approach, which can substantially reduce auxiliary space and improve maintenance efficiency. In the last chapter, I implement a small system using the self-maintenance algorithm. The thesis closes with a summary and points out directions for further research on view self-maintenance.
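
The three steps listed in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, whose details the abstract does not give; it is a minimal illustration under strong assumptions: only insertions are handled, duplicates are not treated specially, the only operators are selection and equi-join, and the marking rule (keep both inputs of every join) is a simplification. All names (`Node`, `mark`, `compute_delta`, `apply_delta`, `maintain`) and the sample relations are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One node of a binary expression tree for a view (hypothetical structure)."""
    op: str                                   # "base", "select", or "join"
    children: list = field(default_factory=list)
    name: str = ""                            # source relation name, for "base"
    pred: Optional[Callable] = None           # row predicate, for "select"
    key: str = ""                             # join attribute, for "join"
    keep: bool = False                        # True if rows are stored in the warehouse
    stored: list = field(default_factory=list)
    delta: list = field(default_factory=list)


def join(left, right, key):
    """Equi-join two lists of dict rows on a shared attribute."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def mark(node, keep=True):
    """Step 1 (top-down): the root is the materialized view; both inputs of a
    join are kept as auxiliary views, because a join delta needs their old rows."""
    node.keep = keep
    for child in node.children:
        mark(child, keep=(node.op == "join"))


def compute_delta(node, inserts):
    """Step 2 (bottom-up): derive each node's inserted rows from its children's
    deltas and from stored auxiliary rows only, never from the sources."""
    for child in node.children:
        compute_delta(child, inserts)
    if node.op == "base":
        node.delta = list(inserts.get(node.name, []))
    elif node.op == "select":
        node.delta = [r for r in node.children[0].delta if node.pred(r)]
    else:  # join: new-left x old-right, old-left x new-right, new-left x new-right
        l, r = node.children
        node.delta = (join(l.delta, r.stored, node.key)
                      + join(l.stored, r.delta, node.key)
                      + join(l.delta, r.delta, node.key))


def apply_delta(node):
    """Step 3: fold the deltas into every kept node, then clear them."""
    for child in node.children:
        apply_delta(child)
    if node.keep:
        node.stored.extend(node.delta)
    node.delta = []


def maintain(root, inserts):
    """Process one batch of source insertions, given as {relation name: rows}."""
    compute_delta(root, inserts)
    apply_delta(root)


# Hypothetical view: customers joined with their orders above 100.
orders = Node("base", name="orders")
customers = Node("base", name="customers")
big = Node("select", [orders], pred=lambda r: r["amount"] > 100)
view = Node("join", [big, customers], key="cust_id")

mark(view)
maintain(view, {
    "orders": [{"order_id": 1, "cust_id": 7, "amount": 250},
               {"order_id": 2, "cust_id": 7, "amount": 40}],
    "customers": [{"cust_id": 7, "name": "Li"}],
})
maintain(view, {"orders": [{"order_id": 3, "cust_id": 7, "amount": 500}]})
print(view.stored)
```

In this sketch the warehouse keeps the filtered orders and a copy of `customers` as auxiliary views, while the unfiltered `orders` relation is never stored. The second batch is folded into the view using only those stored rows, which is the point of self-maintenance: no query goes back to the sources.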

【Key words】 Data Warehouse; View Maintenance; Self-Maintenance
  • 【Classification number】 TP311.13
  • 【Citation count】 1
  • 【Download count】 90