Data Integrity Check and Repair in Distributed Storage Network
【Author】 Liu Gang;
【Advisor】 Liu Shengli;
【Author Information】 Shanghai Jiao Tong University, Computer Technology, 2012, Master's
【Abstract】 In a distributed storage network, clients store large files on remote, untrusted servers and want to verify that the stored files have not been tampered with. Data integrity checking serves this purpose. We consider how a trusted third-party auditor (TPA) can perform data integrity checks with the help of BLS signatures, and how the system repairs corrupted stored data when a server fails or crashes. This thesis first analyzes existing distributed storage networks that use network coding, along with the PDP, PoR, and DPDP models of provably secure data storage. Then, for the case of large-volume data storage in distributed storage networks, it proposes a new and efficient Data Integrity Check and Repair (DICR) scheme. The main contributions are:
1. User data is encrypted with the user's private key before being uploaded to the network storage servers, which provides data confidentiality.
2. (n, k) linear network coding is used to store data across untrusted network storage servers; when a limited number of servers fail, the system can recover the lost data, which provides high availability.
3. A trusted third party is introduced and public auditing is used: the TPA verifies the integrity of the stored data on the user's behalf, so the user does not need to stay online, which makes the system more flexible.
4. BLS-based aggregate signatures reduce the computation and network bandwidth required for data integrity checks.
5. A modified Merkle Hash Tree, used with a distributed algorithm, supports limited dynamic update operations on the stored data.
【Key words】 Distributed Storage Network; Linear Network Coding; Data Integrity Check; Data Repair;
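
The abstract names a modified Merkle Hash Tree as the structure behind integrity checks on updatable data. The thesis's modification is not described in this record, so the following is only a minimal sketch of a plain, unmodified Merkle tree, written to show how a verifier holding just the root hash can check one block. SHA-256, the 0x00/0x01 domain-separation prefixes, the duplicate-last-node padding rule, and the function names `build_levels`, `prove`, and `verify` are all assumptions made for illustration and are not taken from the thesis.

```python
import hashlib


def leaf_hash(block: bytes) -> bytes:
    # Prefix 0x00 marks a leaf (assumed domain separation).
    return hashlib.sha256(b"\x00" + block).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix 0x01 marks an interior node (assumed domain separation).
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(blocks):
    """Return all tree levels, leaves first; the last level holds the root."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [leaf_hash(b) for b in blocks]
    levels = []
    while True:
        # Pad odd-sized levels by repeating the last hash (assumed rule).
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            break
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        # Record whether the sibling sits to the left of the current node.
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(block, path, root):
    """Recompute the root from one block and its sibling path."""
    node = leaf_hash(block)
    for sibling, sibling_is_left in path:
        node = node_hash(sibling, node) if sibling_is_left else node_hash(node, sibling)
    return node == root


if __name__ == "__main__":
    blocks = [b"block-%d" % i for i in range(5)]
    levels = build_levels(blocks)
    root = levels[-1][0]
    print(verify(blocks[2], prove(levels, 2), root))
    print(verify(b"tampered", prove(levels, 2), root))
```

In this sketch the verifier needs only the root and a sibling path whose length grows logarithmically with the number of blocks, which is the general property that makes a Merkle tree a natural fit for auditing by a third party. Changing one block only requires recomputing the hashes on its path to the root.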