A Global Discretization Method Based on Rough Sets (一种基于粗糙集的离散化算法)
【Abstract】 Rough set theory has attracted wide attention in fields that deal with uncertain information, because it exposes dependencies in data and supports data reduction. Discretization of continuous attributes is an important step in rough set methods and in other inductive learning systems. Viewing discretization as a process of information generalization, abstraction, and reduction, this paper proposes a global discretization algorithm based on rough set theory. The algorithm modifies the criterion for selecting the best cut points and defines a consistency measure that is checked during discretization to preserve the fidelity of the original data. This turns the local MDLP method into a global one and avoids the inconsistency that MDLP can introduce. Redundant cut points are then removed while the consistency level is maintained, which leads to a smaller learned model. The algorithm is evaluated on several large data sets using ID3 and the rough set classification tool ROSETTA. The experimental results show that it performs better than MDLP and is also superior to processing continuous data directly without discretization.
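The two ideas named in the abstract, a consistency measure and the removal of redundant cut points, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the exact definitions, so the consistency level is assumed here to be the fraction of objects whose discretized description determines a single decision value, and the pruning is a simple greedy pass. The function names `discretize`, `consistency`, and `prune_cuts` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize(rows, cuts):
    """Map each continuous value to the index of the interval it falls in.

    rows: list of tuples of continuous attribute values.
    cuts: one sorted list of cut points per attribute.
    """
    return [tuple(bisect_right(cuts[j], v) for j, v in enumerate(row))
            for row in rows]


def consistency(rows, labels, cuts):
    """Assumed measure: share of objects whose discretized description
    is associated with exactly one decision label."""
    groups = defaultdict(list)
    for key, label in zip(discretize(rows, cuts), labels):
        groups[key].append(label)
    consistent = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return consistent / len(labels)


def prune_cuts(rows, labels, cuts):
    """Greedily drop cut points whose removal does not lower consistency."""
    cuts = [sorted(c) for c in cuts]
    target = consistency(rows, labels, cuts)
    for j in range(len(cuts)):
        for c in list(cuts[j]):  # snapshot, since cuts is replaced below
            trial = [cs if i != j else [x for x in cs if x != c]
                     for i, cs in enumerate(cuts)]
            if consistency(rows, labels, trial) >= target:
                cuts = trial
    return cuts


# Toy decision table with two continuous attributes (made-up values).
rows = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5), (4.0, 6.5)]
labels = ["a", "a", "b", "b"]
initial_cuts = [[1.5, 2.5, 3.5], [5.75]]
pruned = prune_cuts(rows, labels, initial_cuts)
```

On this toy table only the cut at 2.5 on the first attribute is needed to separate the two classes, so the greedy pass removes the other three cut points without lowering the consistency level. The result of a greedy pass depends on the order in which cut points are tried; the paper may use a different reduction strategy.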
- 【Source】 Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2006, Issue 3
- 【Classification No.】 TP181
- 【Times cited】 8
- 【Downloads】 258