Research and Applications of GEP in Polynomial Factorization and Parallel Function Mining
(Original title: 基因表达式编程在多项式函数分解和并列函数关系挖掘中的研究和应用)
【Author】 Wang Rui (汪锐);
【Advisor】 Tang Changjie (唐常杰);
【Author information】 Sichuan University, Computer Application Technology, 2005, Master's thesis
【Abstract】 In scientific research, people want to discover the regularities hidden in data, so finding an efficient and accurate method for discovering functional relationships is a research focus in data mining. This thesis studies function mining with Gene Expression Programming (GEP) in some depth. It uses GEP to factorize polynomial functions with a method called GPF (GEP Polynomial Factorization) and, on that basis, implements parallel function expression mining based on multi-gene chromosomes. Experiments show that the approach works well. The main work of the thesis is as follows:
1. Introduces the basic concepts, general workflow, and classification of data mining; analyzes the characteristics of function discovery as a special form of knowledge discovery; and explains the significance of polynomial function factorization and of the multiple-function mining built on it.
2. Departs from traditional polynomial factorization methods by taking a data mining view of the problem, and proposes GPF, a polynomial function factorization method based on GEP.
3. Optimizes the GEP fitness function with a distinctive probability correlation factor, which improves the precision of GPF by 27%. GPF also introduces the Loose Environment Evolution (LEE) strategy, which raises the GEP success rate by up to 58 times compared with traditional techniques.
4. Extends GPF to implement PPM (Parallel Polynomial function Mining), which mines parallel function expressions from observed data sets using multi-gene chromosomes.
5. Designs and implements GEPM, a GEP-based mining experiment platform written in Visual C++ 6.0, which includes implementations of both GPF and PPM.
6. Runs a series of extensive experiments on GEPM: compares and evaluates the performance of GPF under different input parameters using purpose-designed criteria, shows the performance gains obtained from the corresponding optimization strategies, and demonstrates the practical efficiency of PPM.
The thesis is organized as follows. Chapter 1 discusses the significance of function mining. Chapter 2 introduces the concepts and workflow of data mining, especially function mining. Chapter 3 analyzes and compares traditional genetic algorithms with gene expression programming, and proposes the approach and workflow for polynomial factorization and parallel function mining. Chapter 4 designs and details the steps of GEP polynomial factorization. Chapter 5 presents the method and steps of parallel polynomial function mining, obtained by extending GPF. Chapter 6 designs the specific GPF and PPM algorithms and implements the GEPM system containing the two functions. Chapter 7 carries out the experiments and analyzes the results. Chapter 8 concludes the thesis.
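The abstract does not spell out how GPF encodes a factorization, so the following is only a minimal Python sketch of the general idea, under stated assumptions. It assumes each gene is a Karva-notation (breadth-first) expression as in standard GEP, that the genes of one chromosome are linked by multiplication so that each gene acts as one candidate factor, and that a plain sum of absolute errors is used as the error measure. It does not reproduce the probability correlation factor or the LEE strategy from the thesis, and the names `eval_gene`, `eval_chromosome`, and `abs_error` are hypothetical.

```python
import operator

# Assumed function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (arity 0).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}


def arity(symbol):
    """Number of arguments a symbol takes (0 for terminals)."""
    return FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0


def eval_gene(gene, env):
    """Evaluate one gene written in Karva notation (breadth-first order).

    `env` maps variable names to values; a terminal that is not in `env`
    is read as a single-digit integer constant (an assumption of this sketch).
    """
    # Find how many leading symbols actually form the expression tree.
    needed, used = 1, 0
    while used < needed:
        needed += arity(gene[used])
        used += 1

    # In breadth-first order, the children of node i occupy consecutive
    # positions starting right after the children of nodes 0..i-1.
    arities = [arity(s) for s in gene[:used]]
    first_child, next_free = [], 1
    for a in arities:
        first_child.append(next_free)
        next_free += a

    # Children always sit after their parent, so evaluate right to left.
    values = [None] * used
    for i in range(used - 1, -1, -1):
        symbol = gene[i]
        if arities[i] == 0:
            values[i] = env[symbol] if symbol in env else int(symbol)
        else:
            args = values[first_child[i]:first_child[i] + arities[i]]
            values[i] = FUNCTIONS[symbol][1](*args)
    return values[0]


def eval_chromosome(genes, env):
    """Multiply the gene values, so each gene is one candidate factor."""
    product = 1
    for gene in genes:
        product *= eval_gene(gene, env)
    return product


def abs_error(genes, samples):
    """Sum of absolute errors of the chromosome over (x, y) samples."""
    return sum(abs(eval_chromosome(genes, {"x": x}) - y) for x, y in samples)


# Observed data generated from the target polynomial x^2 - 1.
samples = [(x, x * x - 1) for x in range(-3, 4)]

# Two genes (head length 2, tail length 3) that decode to x + 1 and x - 1.
candidate = ["+x1x1", "-x11x"]
print(abs_error(candidate, samples))  # prints 0
```

In this sketch a chromosome with zero error on the samples corresponds to the factorization x^2 - 1 = (x + 1)(x - 1). An evolutionary loop with selection, mutation, and transposition would search for such chromosomes, but it is omitted here because the abstract gives no details of it.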
【Key words】 Data Mining; Polynomial Function Factorization; Parallel Polynomial Function Mining; GEP;
- 【Online publication contributor】 Sichuan University 【Online publication issue】 2005, No. 08
- 【Classification number】 TP311.11