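Research and Application of Data Mining Algorithms (数据挖掘算法研究与应用)
【Author】 Li Min (黎敏);
【Advisor】 Wang Tianming (王天明);
【Author Information】 Dalian University of Technology, Computational Mathematics, 2004, Master's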
【Abstract】 Data mining is a technology that has developed in recent years. Through data mining, the research results of knowledge discovery can be applied to practical data processing to support scientific decision-making. Data mining has gradually become a multidisciplinary field involving many techniques, and it is increasingly closely combined with Computational Intelligence (CI) methods. This thesis first introduces the basic concepts, tasks, functions, applications, and development directions of data mining. It then introduces the basic concepts and classification of association analysis and the ideas behind the classical Apriori algorithm. Next, it proposes an algorithm for mining a kind of binary association rule based on bidirectional relations, describes the mathematical formulation and properties of this rule in detail, presents the mining algorithm, and verifies it on an example. Clustering is one of the core techniques in data mining and plays a very important role throughout the data mining process. The choice of a clustering algorithm depends on the data to be clustered and on the purpose and application of the clustering. The thesis compares the commonly used clustering methods in detail and analyzes them against five comprehensive evaluation criteria. On this basis, it proposes an improved K-Means algorithm that mainly addresses the original algorithm's sensitivity to isolated points (outliers). Finally, the thesis introduces the basic concepts, mathematical theory, and implementation techniques of genetic algorithms, and then proposes a hybrid clustering algorithm that combines the global optimization ability of genetic algorithms with the local search ability of clustering analysis. This algorithm improves the clustering and yields better clustering results.
【Key words】 Data Mining; Association Rules; Clustering Analysis; K-Means Algorithm; Genetic Algorithm;
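- 【Online Publication Submitter】 Dalian University of Technology 【Online Publication Year/Issue】 2004, Issue 04
- 【Classification Number】 TP311.13
- 【Citation Count】 36
- 【Download Count】 2649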