Research on Technologies of High Resolution Optical Remote Sensing Image Retrieval (高分辨率光学遥感图像检索技术研究)
【Author】 Peng Yanfei (彭晏飞)
【Author Information】 Liaoning Technical University, Mine Spatial Information Engineering, 2019, PhD
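【Abstract, translated from the Chinese】 With the arrival of the remote sensing big data era, organizing, managing, and retrieving massive volumes of high-resolution remote sensing image data quickly and accurately, so that users can browse rapidly and efficiently retrieve the targets they are interested in, has become an urgent problem in remote sensing information acquisition. Feature extraction and similarity matching are the two key stages of remote sensing image retrieval and directly determine retrieval accuracy and speed. This thesis therefore studies feature extraction and similarity matching in high-resolution optical remote sensing image retrieval. The main contents are as follows.

(1) Some high-resolution optical remote sensing images are poorly illuminated, complex in content, and rich in detail, so extracting their features with an ImageNet pre-trained model gives poor retrieval results. To address this, a high-resolution remote sensing image feature extraction model that combines image enhancement with a convolutional neural network is proposed. Three image enhancement algorithms and three convolutional neural network architectures are each trained on a remote sensing image dataset, starting from ImageNet pre-trained models, which yields nine "image enhancement + convolutional neural network" feature extraction models. Comparative retrieval experiments identify the model with the better feature extraction performance, which effectively addresses the poor feature extraction that convolutional neural networks trained on ordinary image datasets achieve on high-resolution remote sensing images.

(2) The features obtained after extraction are high-dimensional, which affects retrieval speed and accuracy. To address this, a feature vector optimization method combining PCA and t-SNE is proposed. The high-dimensional features are first reduced linearly to 128 dimensions, which limits the time needed to map image features from the high-dimensional space to the low-dimensional space and so improves retrieval speed. The resulting vectors are then reduced nonlinearly to 64 dimensions. In the low-dimensional space the coordinates are modeled with a t-distribution, which enlarges the distances between clusters and gives a better solution. This preserves the accuracy of the mapping from the high-dimensional space to the low-dimensional space and improves retrieval accuracy.

(3) Similarity matching with a single distance formula gives limited retrieval accuracy. To address this, a multi-distance combined Top-k ranking method is proposed. Four types of distances are computed between the feature vector of the query image and the feature vectors of the other images in the remote sensing image library, and each distance vector is sorted in ascending order with quicksort. The first k entries of each sorted list are selected to form a similarity matrix. Within each of the four lists, the first k elements are assigned the weights 1 to k in ascending order of distance. The weights of elements that appear in more than one list are summed, the sums are sorted in ascending order, and the k elements with the smallest sums are returned as the result, which improves retrieval accuracy and speed.

(4) In high-resolution optical remote sensing images, similarity between images of the same class can be low and similarity between images of different classes can be high, so retrieval has to compare similarity distances image by image, which is time-consuming. To address this, a high-resolution remote sensing image retrieval method based on improved fuzzy C-means (FCM) clustering is proposed. Four distances are computed from the extracted and optimized feature vectors, and the four distance vectors are normalized to form a similarity matrix. This matrix is used as the FCM input for clustering the image feature vectors, which reduces the dimensionality of the clustering input. The clustering results are then matched with the multi-distance combined Top-k ranking method, which effectively reduces the time complexity of ranking and speeds up similarity matching. Experiments on two high-resolution remote sensing image datasets show that the method effectively improves retrieval accuracy and speed and achieves good retrieval results and performance.

(5) In SVM classification, a large number of training samples makes training the optimal hyperplane time-consuming, and too many negative samples in the training set distort the construction of the classification hyperplane, so that testing the SVM model yields incorrect classifications. To address this, the multi-distance combined Top-k ranking method is used to filter the SVM training set. Before SVM classification, the training set is filtered with this ranking method, the filtered set is used to train the optimal hyperplane of the SVM, and the final retrieval result is obtained by ranking the test samples by their distance to the classification hyperplane. The method reduces the number of training samples and at the same time filters out most images that are dissimilar to the query image, so that a large number of dissimilar images does not affect the classification result. Experiments on two high-resolution remote sensing image datasets show that the method effectively improves retrieval accuracy and speed and has certain advantages in retrieval results and performance.

(6) For some categories of remote sensing images, the accuracy of the initial retrieval does not meet user requirements. To address this, a relevance feedback method based on a distance evaluation criterion is proposed. Positive sample images in the previous result are not labeled, and only the least similar images are labeled as negative examples, which reduces the number of labels the user has to provide. Labeling uses a small number of samples, and images that satisfy the distance evaluation criterion are adjusted, which avoids the time cost of retraining the optimal hyperplane over multiple feedback rounds. An iterative strategy reduces the number of feedback rounds. Experimental results show that the method obtains satisfactory retrieval results with few feedback rounds. The thesis contains 67 figures, 30 tables, and 137 references.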
【Abstract】 With the advent of the era of remote sensing big data, the rapid and high-precision organization, management, and retrieval of massive high-resolution remote sensing image data, with the goal of letting users browse quickly and retrieve targets of interest efficiently, has become an urgent problem in the acquisition of remote sensing information. Feature extraction and similarity matching are two important stages of remote sensing image retrieval, and they directly determine the accuracy and speed of retrieval. To solve these problems, feature extraction and similarity matching in high-resolution optical remote sensing image retrieval are studied. The main contents are as follows.

(1) Because some high-resolution optical remote sensing images have low illumination, complex content, and rich details, extracting features with an ImageNet pre-trained model leads to poor retrieval performance. To address this problem, a feature extraction model for high-resolution remote sensing images that combines image enhancement with a convolutional neural network is proposed. Three image enhancement algorithms and three convolutional neural network structures were trained on the remote sensing image dataset, starting from ImageNet pre-trained models. After training, nine remote sensing image feature extraction models based on "image enhancement + convolutional neural network" were obtained. Through comparative analysis of retrieval experiments, the model with the better feature extraction performance is identified, which effectively solves the problem that convolutional neural networks trained on ordinary image datasets extract features poorly from high-resolution remote sensing images.

(2) The feature dimension of an image is high after feature extraction, which affects retrieval speed and accuracy. To address this problem, a feature vector optimization method combining PCA and t-SNE is proposed. The high-dimensional data obtained from feature extraction is first linearly reduced to 128 dimensions, which limits the time needed to map image features from the high-dimensional space to the low-dimensional space and improves retrieval speed. The resulting feature vectors are then reduced to 64 dimensions by a second, nonlinear dimensionality reduction, in which the coordinates in the low-dimensional data are modeled with a t-distribution so that the distance between clusters is enlarged and a better solution is obtained. This preserves the accuracy of mapping the image features from the high-dimensional space to the low-dimensional space and improves retrieval accuracy.

(3) The retrieval precision of similarity matching with a single distance formula is not high. To address this problem, a multi-distance combined Top-k ranking method is proposed. The four types of distance vectors between the query image and the other image feature vectors in the remote sensing image library are sorted in ascending order using quicksort, and the first k entries of each distance are selected from the ranking results to form a similarity matrix. Weights from 1 to k are assigned to the first k elements of each of the four distances in ascending order of distance. The weights of elements that appear among the first k elements of more than one distance are summed, and the summed weights are sorted in ascending order. The k elements with the smallest sums are returned as the result, which improves retrieval accuracy and speed.

(4) In high-resolution optical remote sensing images, the similarity between images of the same class can be low and the similarity between images of different classes can be high, so comparing image similarity distances one by one during retrieval is time-consuming. To address this problem, a high-resolution remote sensing image retrieval method based on improved fuzzy C-means (FCM) clustering is proposed. The method calculates four kinds of distances from the extracted and optimized feature vectors and normalizes the four distance vectors to obtain a similarity matrix, which is used as the input of FCM for clustering the image feature vectors. This reduces the input dimension during clustering. The clustered results are matched by the multi-distance combined Top-k ranking method, which effectively reduces the time complexity of ranking and speeds up similarity matching. Experimental results on two high-resolution remote sensing image datasets show that the proposed method effectively improves retrieval accuracy and speed and achieves good retrieval results and performance.

(5) In SVM classification, a large number of training samples makes training the optimal hyperplane time-consuming, and too many negative examples in the training set affect the construction of the classification hyperplane, so that testing the SVM model yields incorrect classification results. To address this problem, the multi-distance combined Top-k ranking method is used to filter the SVM training set. Before SVM classification, the ranking method filters the training set, and the filtered training set is used to train the optimal hyperplane of the SVM. The final retrieval result is obtained by ranking the test samples by their distance to the classification hyperplane. This method reduces the number of samples in the training set and, at the same time, filters out most images that are not similar to the query image, which avoids the influence of many dissimilar images on the classification results. Experimental results on two high-resolution remote sensing image datasets show that the proposed method effectively improves retrieval accuracy and speed and has certain advantages in retrieval results and performance.

(6) The accuracy of the initial retrieval for some categories of remote sensing images cannot meet the user's requirements. To address this problem, a relevance feedback method based on a distance evaluation criterion is proposed. The positive sample images in the previous result are not labeled, and only the least similar images are labeled as negative examples, which reduces the number of labels required from the user. Labeling uses a small number of samples, and images that meet the distance evaluation criterion are adjusted, which avoids the time cost of retraining the optimal hyperplane over multiple feedback rounds. An iterative strategy reduces the number of feedback rounds. Experimental results show that this method obtains satisfactory retrieval results with few feedback rounds.
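
The multi-distance combined Top-k ranking described in item (3) can be illustrated with a minimal Python sketch. It is not the thesis implementation, and several details are assumptions, because the abstract does not give them: the four distances used here (Euclidean, Manhattan, Chebyshev, cosine) are illustrative choices, the function name `multi_distance_top_k` is hypothetical, and an image that is missing from one of the four top-k lists is charged a weight of k + 1 for that list so that it cannot outrank an image that appears in all four. Python's built-in `sorted` stands in for the quicksort mentioned in the abstract.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


# Assumed distance set; the abstract only says "four types of distances".
DISTANCES = (euclidean, manhattan, chebyshev, cosine_distance)


def multi_distance_top_k(query, library, k, distances=DISTANCES):
    """Return indices of the k library vectors with the smallest summed rank weights."""
    k = min(k, len(library))
    weights = {}  # image index -> rank weights, one per top-k list it appears in
    for dist in distances:
        # Sort all library images by this distance, most similar first.
        order = sorted(range(len(library)), key=lambda i: dist(query, library[i]))
        # The first k images receive the weights 1..k according to their rank.
        for weight, idx in enumerate(order[:k], start=1):
            weights.setdefault(idx, []).append(weight)

    def total(idx):
        # Assumption: absence from a top-k list costs k + 1 for that list.
        missing = len(distances) - len(weights[idx])
        return sum(weights[idx]) + missing * (k + 1)

    # Smallest summed weight first; ties are broken by image index.
    return sorted(weights, key=lambda idx: (total(idx), idx))[:k]


query = [1.0, 0.0]
library = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.8, 0.3]]
print(multi_distance_top_k(query, library, k=3))  # [0, 1, 4]
```

In this toy example the three vectors closest to the query rank first under all four distances, so their summed weights are 4, 8, and 12. With real features the four lists usually disagree, and the summed weights act as a simple rank-fusion vote across the distances.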