面向图像检索的海量图像自动聚类方法研究

Image Retrieval Oriented Massive Images Automatic Clustering Methods Research

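【Author】 Yang Jie (杨杰)

【Advisor】 Miao Zhenjiang (苗振江)

【Author Information】 Beijing Jiaotong University, Signal and Information Processing, 2015, Master's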

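【Abstract (translated from the Chinese)】 The growth of the mobile Internet and the rise of multimedia technology have caused image data to grow explosively. Faced with massive image data, how to manage and search an image library conveniently and effectively, and how to uncover the valuable information hidden in it, has become a pressing problem. The emergence and application of cluster analysis from data mining and of content-based image retrieval offer an opportunity to solve it. Cluster analysis is an unsupervised learning method that can categorize unlabeled samples. Content-based retrieval moves away from the original text-based search and is more objective and convenient than annotation-based retrieval. To address this problem, this thesis proposes using clustering for the initial management of a massive image library: the library is clustered to mine and label categories, and it is searched by content. On this basis, a platform for image library clustering and retrieval is implemented. In addition, improved feature extraction and clustering algorithms are proposed to address the shortcomings of traditional image features and clustering algorithms. The main work and contributions are as follows: (1) Improvements are proposed to address the shortcomings of traditional image features. A block color histogram and a block LBP feature that incorporate spatial information are proposed; a Color-SIFT feature that incorporates color information is proposed, compensating for the fact that SIFT uses only grayscale information; and a joint feature combining Dense SIFT with color information is proposed, which describes local regions while retaining the global information of the image. Experiments show that it outperforms the other features. (2) Building on K-means, Mini-Batch K-means is proposed, which improves the stability of the algorithm and greatly increases its speed. A kernel function is introduced into spectral clustering, giving a Gaussian-kernel-based spectral clustering algorithm that improves the performance of spectral clustering. (3) Relevant image features are surveyed and organized, covering both global and local features. Color features include the color histogram and color moments; texture features include LBP and the gray-level co-occurrence matrix; shape features include Hu moments and the edge histogram. For local features, the classic SIFT and SURF features are explained. (4) Following a study of the related literature, classic and recent clustering algorithms are reviewed; their principles and steps are explained in detail, and their strengths and weaknesses are summarized. (5) A retrieval-oriented image clustering platform is designed and implemented. The surveyed image features and clustering algorithms are implemented and integrated into the platform, and the results of cluster analysis are used to speed up retrieval.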
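
The block color histogram with spatial information named in item (1) can be illustrated with a rough sketch. The abstract does not give the thesis's actual design, so the grid size, the number of bins per channel, the RGB quantization, and the function name `block_color_histogram` are all assumptions made for illustration. The sketch uses only the Python standard library to stay self-contained.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # rows of pixels, assumed non-empty


def block_color_histogram(image: Image, grid: int = 2, bins: int = 4) -> List[float]:
    """Concatenate one normalized color histogram per grid cell.

    `grid` and `bins` are illustrative defaults, not values from the thesis.
    """
    height, width = len(image), len(image[0])
    feature: List[float] = []
    for by in range(grid):
        for bx in range(grid):
            # Integer arithmetic keeps the blocks contiguous and covers every pixel.
            y0, y1 = by * height // grid, (by + 1) * height // grid
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            hist = [0.0] * (bins ** 3)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    r, g, b = image[y][x]
                    # Quantize each channel into `bins` levels and flatten to one index.
                    idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
                    hist[idx] += 1.0
            total = (y1 - y0) * (x1 - x0)
            # Normalize so each block contributes a distribution summing to 1.
            feature.extend(h / total if total else 0.0 for h in hist)
    return feature
```

Because the histograms are kept per block, an image that is red on the left and blue on the right yields a different vector from its mirror image, whereas a single global histogram would treat the two as identical.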

【Abstract】 With the development of the mobile Internet and multimedia technologies, image data is growing explosively. Faced with massive image data, how to manage and search image libraries conveniently and effectively, and how to discover the valuable information hidden in them, are problems that need to be solved. Clustering techniques from data mining and content-based image retrieval offer an opportunity to solve them. As an unsupervised learning method, cluster analysis can categorize unlabeled samples. Content-based retrieval dispenses with the original text search and is more objective and convenient than annotation-based retrieval.
To solve this problem, this thesis proposes a clustering method for the initial management of a massive image library. Clustering the image database yields image categories and annotations, and content-based retrieval is used to search the library. Building on these ideas, the thesis implements an image database clustering and retrieval platform. To overcome the shortcomings of traditional image features and clustering algorithms, it also proposes improved feature extraction and clustering algorithms. The main research work and contributions are as follows:
(1) Improved methods are proposed to address the shortcomings of traditional image features: a block color histogram with spatial information and block LBP features; a Color-SIFT feature that incorporates color information to compensate for SIFT using only grayscale information; and a joint feature that integrates Dense SIFT with color information, which describes local regions while preserving the global information of the image. Experimental results show that the joint feature performs better than the other features.
(2) Mini-Batch K-means is proposed on the basis of the K-means algorithm. It not only improves the stability of the algorithm but also greatly increases its speed. A kernel function is introduced into spectral clustering, yielding spectral clustering based on a Gaussian kernel, which improves the performance of spectral clustering.
(3) The relevant image features are summarized and analyzed, including global and local features. Color features include the color histogram, color moments, etc. Texture features include LBP, the gray-level co-occurrence matrix (GLCM), etc. Shape features include Hu moments, the edge histogram, etc. For local features, the classic SIFT and SURF features are explained.
(4) After a study of the relevant papers, classical and recent clustering algorithms are reviewed. The principles and steps of these algorithms are explained in detail, and their advantages and disadvantages are summarized.
(5) A retrieval-oriented image clustering platform is designed and implemented. The surveyed image features and clustering algorithms are implemented and integrated into the platform, and the results of cluster analysis are used to improve retrieval speed.
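
To make the mini-batch idea in item (2) concrete, here is a minimal sketch of the generic mini-batch update, in which each center moves toward its assigned samples with a per-center learning rate of 1/count. The abstract does not describe the thesis's specific variant, so this may differ from it. The function names, the caller-supplied `init_centers`, and the default batch size and iteration count are assumptions for illustration.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def nearest(point: Vector, centers: List[List[float]]) -> int:
    """Return the index of the center with the smallest squared Euclidean distance."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def mini_batch_kmeans(
    data: Sequence[Vector],
    init_centers: Sequence[Vector],
    batch_size: int = 32,
    iterations: int = 100,
    seed: int = 0,
) -> List[List[float]]:
    """Generic mini-batch K-means; all parameter defaults are illustrative."""
    rng = random.Random(seed)
    points = list(data)
    centers = [list(c) for c in init_centers]
    counts = [0] * len(centers)          # samples seen so far by each center
    for _ in range(iterations):
        # Draw a small random batch instead of scanning the whole data set.
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Assign using the centers as they stand before this batch's updates.
        assignments = [nearest(p, centers) for p in batch]
        for point, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]        # per-center learning rate
            centers[j] = [(1 - eta) * c + eta * p for c, p in zip(centers[j], point)]
    return centers
```

Each iteration touches only `batch_size` samples, so the cost per iteration does not depend on the size of the image library, which is where the speed gain over full-batch K-means comes from. In practice `init_centers` could be, for example, `random.sample(points, k)`, though a poor initialization can still lead to a poor local optimum.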
