Distributed Data Repairing for Lexicographical Order Dependence
【Abstract】 Lexicographical order dependencies express order relationships among lists of attribute columns in data. Real-world data are often very large and contain errors. This paper studies distributed data repairing for lexicographical order dependencies, with the goal of modifying the data so that they satisfy the given order dependencies. We design and implement distributed repair algorithms on the Spark platform, and verify the effectiveness and running efficiency of the approach through experiments.
【Key words】 Data repairing; Lexicographical order dependency; Distributed computing
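【Funding】 National Key R&D Program of the Ministry of Science and Technology (2018YFB1402600); Science and Technology Commission of Shanghai Municipality project (19DZ2252800); State Grid Shanghai science and technology project (52094020001A)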
- 【Source】 计算机应用与软件 (Computer Applications and Software), 2023, Issue 09
- 【Classification No.】 TP311.13