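Research on Intelligent Recognition of Typical Ground Objects in Remote Sensing Images by Fusing Knowledge-Driven and Data-Driven Approaches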
Knowledge and Data Driven Intelligent Recognition Model for Typical Ground Objects from Remote Sensing Images
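【Author】 Wang Ying (王莹);
【Advisor】 Peng Yuexing (彭岳星);
【Author information】 Beijing University of Posts and Telecommunications, Information and Communication Engineering, 2022, Master's degree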
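【Abstract (translated from the Chinese)】 With the rapid development of remote sensing technology, ground object recognition from high-resolution remote sensing images has important application value in fields such as urban planning and environmental monitoring. Ground object recognition in remote sensing images mainly comprises two aspects: ground object segmentation and change detection. Because ground object features are complex and diverse, traditional approaches such as manual interpretation, image processing, and machine learning cannot meet the demands of high-accuracy recognition; moreover, they are designed from human experience and depend heavily on expert knowledge and data characteristics, so they struggle to deliver the level of automation that large-scale engineering applications require. In recent years, advances in artificial intelligence have provided new research tools for remote sensing imagery. Deep learning, represented by convolutional neural networks, has achieved outstanding results in computer vision, and its strong feature learning capability has led to its wide use in ground object recognition from high-resolution remote sensing images. This thesis focuses on the intelligent recognition of typical ground objects, concentrating on convolutional-neural-network-based road extraction and semantic change detection for typical ground objects. It combines ground object knowledge with remote sensing data in model design, aiming to improve recognition accuracy and reliability. The main work and contributions are as follows: 1. To improve the reliability of small-road extraction in complex environments, where roads of multiple sizes coexist, background color and texture resemble roads, and vegetation occludes roads, a semantic segmentation network with a dual-decoder structure that fuses dilated convolution and attention mechanisms is proposed on the basis of U-Net, targeting the multi-scale and morphological characteristics of roads. The model captures multi-scale context through dilated convolutions arranged in a cascaded-parallel fashion, uses a global average pooling branch to fuse global features, introduces a channel-spatial dual attention mechanism to strengthen feature representation, and uses the dual-decoder structure to improve the recovery of detail. Performance was tested on the public Massachusetts road dataset. The results show that under non-ideal conditions, such as roads whose color resembles the background or that are occluded by vegetation shadow, the proposed model effectively improves road extraction accuracy and connectivity compared with mainstream road extraction models. Ablation and feature visualization experiments confirm that the dilated convolution-attention module and the dual-decoder structure effectively improve the model's ability to capture global context and extract detailed features. 2. To locate changed regions in remote sensing images quickly and accurately and to determine the ground object types before and after the change, a semantic change detection model with a Siamese network structure, based on multi-task learning and attention mechanisms, is proposed. The model treats semantic change detection as the combination of a binary change detection task and a multi-class ground object segmentation task. A Siamese network extracts features from the bi-temporal images, and these features are shared between the two tasks to save training time and resources. The prediction of the change detection task serves as an auxiliary input to the segmentation task and is fused with the bi-temporal image features at the feature level, which retains more context. On this basis, an attention mechanism is introduced to improve the model's generalization and robustness. Ablation and comparison experiments on the public Land-CD dataset show that the model improves the accuracy of semantic change detection and effectively raises the accuracy of change detection and segmentation for small-scale ground objects.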
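The abstract credits cascaded-parallel dilated convolutions with capturing multi-scale context. The short calculation below is only a sketch of why such an arrangement covers several scales at once: it computes the receptive field of each branch when branch i stacks the first i dilated 3x3 convolutions (stride 1). The dilation rates 1, 2, 4, 8 and the function names are assumptions chosen for illustration; the abstract does not state the rates or branch layout used in the thesis.

```python
# Receptive-field arithmetic for cascaded dilated convolutions (stride 1).
# ASSUMPTION: dilation rates and branch layout are illustrative only;
# the thesis abstract does not specify them.

def effective_kernel(kernel_size, dilation):
    """Spatial extent covered by a single dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def cascade_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 convolutions applied one after another."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

def branch_receptive_fields(kernel_size, dilations):
    """Branch i cascades the first i convolutions; branches run in parallel."""
    return [
        cascade_receptive_field(kernel_size, dilations[:i])
        for i in range(1, len(dilations) + 1)
    ]

ASSUMED_DILATIONS = [1, 2, 4, 8]  # hypothetical rates, not from the thesis

if __name__ == "__main__":
    print(branch_receptive_fields(3, ASSUMED_DILATIONS))  # [3, 7, 15, 31]
```

Under these assumed rates the parallel branches see windows of 3, 7, 15, and 31 pixels, so summing or concatenating their outputs mixes narrow and wide context, which is the intuition behind using such a block for roads of different widths.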
【Abstract】 With the rapid development of remote sensing technology, extracting features and recognizing ground objects from high-resolution remote sensing images (HRSIs) has become important in many areas such as urban planning and environmental monitoring. In this thesis, intelligent ground object recognition focuses on object segmentation and change detection. Because ground object features are complex and diverse, traditional methods such as manual visual interpretation and machine learning aided by feature engineering cannot meet the high requirements for intelligent, accurate, and reliable recognition. In addition, the traditional methods depend heavily on professional knowledge, which results in weak generalization capability. Recently, the rapid development of deep learning has provided new intelligent methods for remote sensing image processing. Convolutional neural networks (CNNs) have achieved state-of-the-art (SOTA) results in computer vision and have therefore been widely employed in ground object recognition from HRSIs. In this thesis, hybrid knowledge- and data-driven deep learning methods based on semantic segmentation are studied to enable accurate ground object recognition, especially for road extraction and semantic change detection. The main contributions of this thesis are: 1. To enhance the recognition of small roads in complex environments, where vegetation shadow and many kinds of environmental interference with similar texture and shape exist, a semantic segmentation model termed DDU-Net is developed, which has a dual-decoder structure and integrates dilated convolution and an attention mechanism. The proposed DDU-Net model captures multi-scale context information through dilated convolutions arranged in a cascaded-parallel fashion. A global average pooling branch is added to fuse global information, and a channel-spatial dual attention mechanism is introduced to improve the representational ability of the features. Furthermore, the dual-decoder structure enhances the restoration of detail information. Extensive experiments are carried out on the public Massachusetts road dataset, and the results show that the DDU-Net model can greatly improve the robustness of road extraction when roads either have a color similar to the background or are covered by vegetation shadow. Compared with mainstream baseline models, DDU-Net effectively improves the accuracy of road extraction and the connectivity of the segmentation. Moreover, ablation experiments and feature visualization show that the introduced dilated convolution-attention module and the dual-decoder structure capture global context information and extract detailed features at the same time. 2. To locate the changed areas of a remote sensing image accurately and to identify the categories of ground objects reliably before and after the change, a semantic change detection model with a Siamese network structure and an attention mechanism is developed under the principle of multi-task learning. In this model, the semantic change detection problem is decomposed into a binary change detection task and a multi-class ground object segmentation task. A Siamese network structure is employed to extract the features of the bi-temporal images, which are shared by the two tasks to reduce training complexity. The prediction output of the change detection task is fed to the ground object segmentation task, where it is fused with the bi-temporal image features at the feature level so as to retain more context information. An attention mechanism is introduced to improve the generalization and robustness of the model. Both ablation and comparative experimental results on the Land-CD dataset show that the proposed model improves the accuracy of semantic change detection, especially for small-scale objects.
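The second contribution decomposes semantic change detection into a binary change mask plus per-date ground object segmentation. The sketch below only illustrates what the combined output of that decomposition looks like: for each pixel flagged as changed, it reports a (class before, class after) pair. It is not the thesis model, which fuses the change prediction with bi-temporal features inside the network; here the combination is done on final label maps, and the function name, the `None` marker for unchanged pixels, and the toy class ids are all hypothetical.

```python
# Post-hoc combination of a binary change mask with two segmentation maps.
# ASSUMPTION: this is an illustration of the task decomposition only; the
# thesis fuses the change prediction at the feature level instead.

def semantic_change_map(change_mask, classes_t1, classes_t2):
    """Return, per pixel, (class_before, class_after) where the mask is set,
    and None where no change was detected."""
    result = []
    for mask_row, row_t1, row_t2 in zip(change_mask, classes_t1, classes_t2):
        result.append([
            (c1, c2) if changed else None
            for changed, c1, c2 in zip(mask_row, row_t1, row_t2)
        ])
    return result

if __name__ == "__main__":
    # Toy 2x2 example with hypothetical class ids
    # (0 = background, 1 = farmland, 2 = water, 3 = building).
    mask = [[0, 1],
            [1, 0]]
    before = [[0, 1],
              [2, 0]]
    after = [[0, 3],
             [1, 0]]
    for row in semantic_change_map(mask, before, after):
        print(row)
```

In this toy input, the top-right pixel changes from class 1 to class 3 and the bottom-left pixel from class 2 to class 1, while the unflagged pixels carry no transition label.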
【Key words】 high-resolution remote sensing image; road extraction; semantic change detection; semantic segmentation; multi-task learning;
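- 【Online publication contributor】 Beijing University of Posts and Telecommunications 【Online publication year and issue】 2024, Issue 01
- 【Classification number】 TP751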