一种基于最大加权频繁项目集的数据库相似性判别算法
An Algorithm Based on Maximum Weighted Frequent Itemsets for Measuring Database Similarity
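【Author】 YANG Ming (1, 2) and SUN Zhi-hui (2); (1) Department of Computer Science and Engineering, Anhui University of Technology and Science, Wuhu 241000; (2) Department of Computer Science and Engineering, Southeast University, Nanjing 210096
【Affiliations】 Department of Computer Science and Engineering, Anhui University of Technology and Science; Department of Computer Science and Engineering, Southeast University;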
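【Abstract (from the Chinese)】 After introducing maximum weighted frequent itemsets, the paper gives a new model for measuring database similarity and proposes a database similarity measurement algorithm based on maximum weighted frequent itemsets. The algorithm effectively improves on the similarity measurement method based on maximum frequent itemsets and raises the accuracy of database similarity measurement. In practical applications, the improved model provides an effective framework for data preparation for data mining in distributed multi-database environments, and is therefore of considerable practical value.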
【Abstract】 In a distributed environment, combining the individual databases into a single logical database is not appropriate for mining useful rules effectively. This paper introduces the concept of maximum weighted frequent itemsets, gives a novel model for measuring database similarity, and presents an algorithm based on maximum weighted frequent itemsets for measuring database similarity. Once similar databases are clustered, each cluster can be mined independently to generate rules appropriate to that cluster. In real applications, the new model provides an effective framework for data preparation for data mining in a distributed environment, and hence is of practical importance.
【Key words】 data mining; maximum weighted frequent itemsets; measuring database similarity;
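- 【Proceedings title】 Proceedings of the 21st China Database Academic Conference (Research Reports Volume)
- 【Conference name】 21st China Database Academic Conference
- 【Conference date】 2004-10-14
- 【Conference location】 Xiamen, Fujian, China
- 【Classification number】 TP311.13
- 【Organizer】 Database Technical Committee, China Computer Federation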