Research on an HDFS-Based Storage Redundancy Mechanism for Massive Remote Sensing Imagery

【Author】 Wang Yanan (王亚楠)

【Advisor】 Ma Jun (马骏)

【Author information】 Henan University, Computer Application Technology, 2013, Master's thesis

【Abstract】 Massive remote sensing image data is almost always kept in distributed storage. In high-resolution data storage systems in particular, some form of data redundancy is needed to ensure the safety, completeness, and high availability of the data. Traditional distributed file storage systems currently use three redundancy techniques: full replication, disk arrays, and erasure coding. Full replication and disk arrays improve redundancy but also increase the demand for storage space. Erasure coding avoids this excessive storage consumption but increases the system's I/O burden. To address these shortcomings, this thesis combines full replication with erasure coding. Building on the open-source HDFS (Hadoop Distributed File System), it replaces the original HDFS redundancy mechanism with the improved one to resolve the conflict between storage space and I/O burden, so that the system keeps its I/O speed while improving redundancy and greatly reducing the demand for storage space.

This thesis focuses on data redundancy mechanisms suited to high-resolution remote sensing imagery and proposes an improved redundancy strategy. The main work and contributions are as follows.

1. Building on a study of storage management techniques and data redundancy mechanisms for massive remote sensing image data, the thesis examines the HDFS distributed file system and its redundancy mechanism, and analyzes in detail the replication and erasure coding redundancy techniques suited to storing massive remote sensing imagery.

2. On the basis of the replication and erasure coding redundancy mechanisms, the thesis proposes an improved "replication + coding" redundancy strategy for HDFS, and gives the file read and write procedures together with a scheme for managing the coded blocks that the system produces after encoding.

3. Experiments on the improved HDFS system verify the feasibility of the proposed scheme. The results show that the system greatly reduces the demand for storage space while maintaining I/O speed. The improved HDFS system has been successfully applied in the massive remote sensing image data storage system of the high-resolution major special project (ERSI-DBMS).
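The trade-off the abstract describes between storage space and I/O burden can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not state the replica count or coding parameters, so the 3-replica figure (the HDFS default) and the (6, 3) code below are assumptions, and the single XOR parity block is a deliberately simple stand-in for a real erasure code. All function names are hypothetical.

```python
from functools import reduce


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per byte of user data under full replication."""
    return float(replicas)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data and m parity blocks."""
    return (k + m) / k


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks: list[bytes]) -> list[bytes]:
    """Append one XOR parity block (assumed toy code, tolerates one lost block)."""
    return data_blocks + [xor_blocks(data_blocks)]


def recover(stripe: list[bytes], lost_index: int) -> bytes:
    """Rebuild the block at lost_index by reading every surviving block."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Assumed parameters: 3 replicas versus a hypothetical (6, 3) code.
    print(replication_overhead(3), erasure_overhead(6, 3))
    stripe = encode([b"abcd", b"efgh", b"ijkl"])
    print(recover(stripe, 1))
```

Under these assumed parameters, replication stores three raw bytes per byte of data while the coded layout stores one and a half. The cost shows up in `recover`: a lost replica is restored by copying a single surviving block, whereas a lost coded block is rebuilt by reading all surviving blocks in its stripe, which is the extra I/O the abstract refers to.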

  • 【Online publication contributor】 Henan University
  • 【Online publication issue】 2014, Issue 02
  • 【Classification codes】 TP333; TP751
  • 【Times cited】 8
  • 【Downloads】 285