Efficient Data Stream Clustering Algorithm Based on k-Means Partitioning and Density
【Abstract】 Data stream clustering is an important topic in data stream mining. Most existing data stream clustering algorithms rely on k-median (k-means) methods, which cannot effectively cluster irregularly distributed or high-dimensional data streams. This paper proposes CLUSMD, a density-based data stream clustering algorithm built on k-means partitioning. The algorithm first divides the data stream into partitions and runs k-means clustering on each partition to produce an intermediate result, the set of mean reference points; it then applies density-based clustering to these reference points to obtain the clustering result for each period. Theoretical analysis and experimental results show that CLUSMD is effective and efficient on irregularly distributed and high-dimensional data streams.
【Key words】 data stream clustering; mean reference point; density-based clustering
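
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CLUSMD implementation: the record gives no pseudocode, so the function names (`kmeans`, `density_cluster`, `cluster_stream`), the parameters (`partition_size`, `k`, `eps`, `min_pts`), the choice of a DBSCAN-style procedure for the density stage, and the first-k centroid initialisation are all assumptions made for illustration. The sketch also clusters the reference points once after all partitions are processed, whereas the paper reports a result for each period.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns the centroids (mean reference points).

    Assumption: centroids are initialised from the first k points, a
    simplification chosen only to keep the sketch deterministic.
    """
    centroids = list(points[:k])
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # New centroid = mean of its group; keep the old one if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def density_cluster(points, eps, min_pts):
    """DBSCAN-style clustering; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

def cluster_stream(stream, partition_size, k, eps, min_pts):
    """Stage 1: k-means per partition. Stage 2: density clustering of the means."""
    reference_points = []
    for start in range(0, len(stream), partition_size):
        partition = stream[start:start + partition_size]
        reference_points.extend(kmeans(partition, k))
    return reference_points, density_cluster(reference_points, eps, min_pts)
```

Under these assumptions, the density stage only ever sees `k` points per partition rather than the raw stream, which is where the efficiency gain claimed in the abstract would come from; the density stage is what allows non-spherical groups of reference points to be merged into one cluster.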
- 【Source】 小型微型计算机系统 (Mini-Micro Systems), 2007, Issue 01
- 【Classification number】 TP311.13
- 【Times cited】 27
- 【Downloads】 614