Design of the Multi-Strategy Data Mining System DBIN Miner and Research on Parallel Data Mining Technology

Original title: 多策略数据挖掘系统DBIN Miner的设计与并行数据挖掘技术的研究

【Author】 Sun Tao (孙涛)

【Advisor】 Li Xiongfei (李雄飞)

【Author Information】 Jilin University, Computer Software and Theory, 2006, Master's thesis

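【Abstract (translated from Chinese)】 With the widespread adoption and development of computer technology and database management systems, databases in every industry have accumulated ever larger volumes of data, and that data is increasingly dispersed. The question of how to use this data and extract useful information and knowledge from such a vast sea of it gave rise to data mining, a computing technology with broad applications and great practical value. As data volumes keep expanding and demand for analysis keeps growing, data analysis tools and data mining systems of many kinds continue to be developed. A data mining system is a kind of intelligent decision support system: an intelligent, interactive information system that uses information technology as its means, applies theories and methods from management science, computer science, and related disciplines, and draws on the domain knowledge and historical data of a specific industry to help clarify problems, revise and refine models, enumerate candidate solutions, and analyze and compare them, thereby providing managers with knowledge and patterns and helping them make sound decisions. Based on a study of data mining techniques and theory and of how data mining systems emerged and developed in China and abroad, this thesis summarizes the characteristics of those systems and analyzes the problems that have hindered the development of data mining systems in China. Following current trends in data mining technology and systems, it designs a multi-strategy data mining system, DBIN Miner. To address the problem of processing massive data sets that data mining systems now face, it studies parallel data mining technology, implements a parallel association rule mining algorithm, the Count Distribution algorithm, and analyzes the algorithm's performance. Drawing on trends in parallel data mining, it designs the system's parallel data mining strategy and analyzes the issues to watch for when the system's parallel data mining capabilities are developed further.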

【Abstract】 As society enters the information age and computer networks and computer technology come into widespread use, databases in every industry are accumulating ever larger volumes of data. The question of how to use this data and extract useful information and knowledge from it to guide enterprise production and distribution gave rise to data mining, a widely used technology of great practical value. As data volumes expand rapidly and demand for analysis keeps growing, many kinds of data analysis tools and data mining systems have been researched and developed. Successful examples include SAS Enterprise Miner, SPSS Clementine, and IBM Intelligent Miner. These systems have been under development for more than ten years and have been applied successfully in many business and scientific research areas. They not only guide the management and development of corporations and bring substantial economic benefit, but also contribute to data mining research at scientific institutions. A data mining system is a bridge between data mining research and its application, and it does much to spread data mining.

Based on a study of data mining theory and of the emergence and development of data mining systems in China and abroad, this thesis summarizes the problems and development directions of data mining systems and designs a multi-strategy data mining system, DBIN Miner. To address the problem of how a data mining system handles massive data sets, it studies parallel data mining technology and implements a parallel association rule algorithm, the Count Distribution algorithm. Drawing on the problems that parallel data mining technology faces and an analysis of its development direction, it designs the parallel data mining strategy of the DBIN Miner system. Finally, it raises the issues for the system's future development.

Chapters one and two introduce data mining technology and data mining systems, covering their background and current state of development, and then analyze the problems that data mining systems have encountered during their development and the directions in which they are heading.
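
The abstract names the Count Distribution algorithm but does not show the thesis's implementation. As a rough illustration only, the sketch below follows the scheme as it is commonly described: every node holds one partition of the transactions, all nodes generate the same Apriori candidates, each counts support on its own partition, and the local counts are summed across nodes before the frequent itemsets are selected. This is a single-process simulation, not the DBIN Miner code. The function names (`generate_candidates`, `local_counts`, `count_distribution`) are hypothetical, and summing `Counter` objects stands in for the all-reduce step a real cluster would perform over a message-passing layer.

```python
from collections import Counter
from itertools import combinations


def generate_candidates(prev_frequent, k):
    """Apriori join and prune: size-k candidates from frequent (k-1)-itemsets."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            # Keep the union only if it has k items and every
            # (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(sub) in prev for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def local_counts(partition, candidates):
    """Support counts of the candidates on one node's partition."""
    counts = Counter()
    for transaction in partition:
        for cand in candidates:
            if cand <= transaction:
                counts[cand] += 1
    return counts


def count_distribution(partitions, min_support):
    """Simulate Count Distribution; min_support is an absolute global count."""
    partitions = [[frozenset(t) for t in part] for part in partitions]
    # Pass 1: the candidates are all single items. Collecting them in one
    # place is a simplification of this simulation.
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        # Each simulated node counts on its own partition only.
        per_node = [local_counts(part, candidates) for part in partitions]
        # Stand-in for the all-reduce: sum the local counts into global counts.
        global_counts = sum(per_node, Counter())
        level = {c: n for c, n in global_counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)
        k += 1
        # Every node would derive the same candidates from the same level.
        candidates = generate_candidates(level.keys(), k)
    return frequent


if __name__ == "__main__":
    node_a = [{"a", "b", "c"}, {"a", "b"}]
    node_b = [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in count_distribution([node_a, node_b], 3).items():
        print(sorted(itemset), count)
```

Because only the candidate counts cross node boundaries, the transactions themselves never move. The cost is that every node must hold the full candidate set for each pass.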

  • 【Online Publication Contributor】 Jilin University
  • 【Online Publication Issue】 2006, No. 10
  • 【Classification Number】 TP311.13
  • 【Times Cited】 3
  • 【Downloads】 385