Applied Research on a Phased Nonlinear Clustering Method for Large-Scale Distributed Data

Research on Large Scale Distribution Data Method of Nonlinear Clustering

【Author】 QIU Wei (丘威)

【Affiliation】 School of Computer Science, Jiaying University (嘉应学院计算机学院)

【Abstract】 This paper proposes a phased nonlinear clustering method that handles clustering of large-scale distributed data effectively while reducing computational complexity. The algorithm has two stages. In the first, the data are divided into several spherically distributed subclusters, and K-nearest-neighbor graph theory is used to compute a vertex energy for each original data point and to extract the high-energy vertex samples. In the second, a K-nearest-neighbor algorithm partitions these high-energy samples, which yields a coarse partition based on the high-energy samples and, at the same time, an estimate of the number of clusters. Finally, the two clustering results are combined to produce the final clustering. The main advantages of the method are that it can handle complex clustering problems, is relatively stable, and, while maintaining clustering accuracy, lowers the cost of computing similarity measures on large-scale distributed data.
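The stages described in the abstract can be illustrated with a minimal sketch. The abstract does not define vertex energy, the subclustering routine, or the rule for combining the two results, so everything below is an assumption made for illustration: subclusters come from a plain k-means with farthest-point initialization, vertex energy is taken as the inverse of the mean distance to the k nearest neighbors, the high-energy samples are grouped by connected components of their kNN graph, and each subcluster adopts the majority label of its high-energy members. All function names and parameters (`phased_cluster`, `m`, `k`, `keep`) are hypothetical, and the brute-force neighbor search is used only for readability.

```python
import math
from collections import Counter

def knn(points, k):
    """Brute-force k nearest neighbours: list of (distance, index) per point."""
    return [sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
            for i, p in enumerate(points)]

def vertex_energy(points, k):
    # ASSUMPTION: energy = inverse mean distance to the k nearest neighbours.
    return [1.0 / (sum(d for d, _ in nb) / len(nb) + 1e-12) for nb in knn(points, k)]

def kmeans(points, m, iters=20):
    # ASSUMPTION: k-means stands in for the "spherical subclusters" step.
    centers = [points[0]]
    while len(centers) < m:  # deterministic farthest-point initialisation
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    assign = []
    for _ in range(iters):
        assign = [min(range(m), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(m):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def knn_components(points, k):
    """Label the connected components of the undirected kNN graph (union-find)."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, nb in enumerate(knn(points, k)):
        for _, j in nb:
            parent[find(i)] = find(j)
    roots = sorted({find(i) for i in range(len(points))})
    return [roots.index(find(i)) for i in range(len(points))]

def phased_cluster(points, m=4, k=3, keep=0.5):
    sub = kmeans(points, m)                      # stage 1: subclusters
    energy = vertex_energy(points, k)            # stage 1: vertex energies
    cutoff = sorted(energy, reverse=True)[max(0, int(keep * len(points)) - 1)]
    high = [i for i, e in enumerate(energy) if e >= cutoff]
    coarse = knn_components([points[i] for i in high], k)  # stage 2: coarse partition
    n_clusters = len(set(coarse))                # estimated number of clusters
    votes = {}
    for i, lab in zip(high, coarse):
        votes.setdefault(sub[i], Counter())[lab] += 1
    labels = []
    for i, p in enumerate(points):               # combine the two results
        if sub[i] in votes:
            labels.append(votes[sub[i]].most_common(1)[0][0])
        else:  # subcluster without high-energy members: use nearest high-energy sample
            h = min(range(len(high)), key=lambda t: math.dist(p, points[high[t]]))
            labels.append(coarse[h])
    return labels, n_clusters

# Toy data: two 3x3 grids placed far apart.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
points = grid + [(x + 10.0, y + 10.0) for x, y in grid]
labels, n_clusters = phased_cluster(points, m=4, k=3, keep=0.5)
```

In this sketch the number of clusters is never passed in; it falls out of the component count on the high-energy samples, which mirrors the abstract's claim that the second stage estimates it. A practical version would replace the quadratic neighbor search with a spatial index.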

【Key words】 manifold data; data mining; clustering; nonlinear
【Funding】 Supported by the Natural Science Foundation of Guangdong Province (No. S2013010013307)
  • 【Source】 Computer Knowledge and Technology (电脑知识与技术), 2013, Issue 34
  • 【Classification No.】 TP311.13
  • 【Times cited】 2
  • 【Downloads】 41