基于深度学习与张量分解的医学图像分割算法研究

Research on Medical Image Segmentation Based on Deep Learning and Tensor Decomposition

【Author】 王静

【Supervisors】 刘琚; 吴强

【Author Information】 Shandong University, Advanced Manufacturing (Electronic Information) (professional degree), 2023, Doctoral

【Abstract】 Intelligent healthcare draws on big data, artificial intelligence, high-performance computing, and related technologies, applies clinical trial data, and pursues interdisciplinary integration oriented toward clinical applications. Medical image segmentation is a foundation of intelligent healthcare and is of great clinical significance for locating and identifying lesion regions and for planning surgery. In clinical practice, manually annotating target regions in medical images is time-consuming, labor-intensive, and highly subjective, placing a heavy burden on clinicians. How to obtain accurate and robust medical image segmentation results automatically is therefore an urgent research problem. Medical images, however, differ from natural images: the available data are limited, and images from the same domain are highly similar in structure yet differ greatly in detail. The target regions in segmentation tasks are variable, the processing is more complex than for natural images and is easily affected by lesions, and the images suffer from artifacts, noise, low contrast, and blurred edges; non-image factors such as equipment precision and patient cooperation are also involved. All of these challenge automatic, accurate medical image segmentation and its clinical application, which has consequently become a research hotspot. With the development of deep learning, fully automatic segmentation algorithms based on convolutional neural networks currently perform very well on medical images, but they lack the ability to express global information about the target. In addition, class imbalance between tumors and organs or tissues, together with differences in spatial and temporal resolution across multimodal images, increases the complexity and variability of the target objects. Moreover, most deep learning models suffer from excessive parameter counts and high complexity. These issues pose a severe test for automatic, accurate medical image segmentation. To address them, this dissertation combines the advantages of tensor decomposition and deep learning to study accurate, robust, efficient, and lightweight medical image segmentation algorithms. The research work and main contributions cover the following four aspects:

1) To overcome the limited receptive field of convolutional neural networks, which makes global information hard to capture, and exploiting the Transformer's ability to model long-range dependencies, a Cross-convolution Transformer (C2former) algorithm is proposed. The framework combines convolutional neural networks with the Transformer structure and is adapted to medical image segmentation. For short-range dependencies, an attention mechanism with convolutional characteristics captures local features at both the spatial and channel levels; for long-range dependencies, sampled self-attention captures global features across pixels. A window self-attention mechanism then integrates the long-range and short-range dependencies before further feature extraction, strengthening the model's local-global understanding of the image. Experiments on three public medical image datasets show that the method significantly improves segmentation accuracy.

2) To address the loss of depth information when C2former, a 2D segmentation method, processes 3D images, a 3D medical image segmentation algorithm with a convolutional neural network cascaded with a Transformer structure is proposed. The algorithm uses the multi-head attention of the standard Transformer but increases the correlation between heads through cross-sharing, obtaining more feature information and improving generalization and accuracy. It also introduces multi-layer fusion modules that fuse features after each layer to adapt to the variability of segmentation targets. To reduce the parameter count of the attention mechanism, a new attention model is built with Tucker-decomposition-based model compression. The algorithm is evaluated on four medical image segmentation datasets, including public datasets and a clinical dataset, covering multiple imaging modalities. The results show good segmentation of cardiac structures, brain tumors, abdominal organs, and the uterus and uterine tumors, and the new attention model maintains comparable segmentation performance while compressing the parameters.

3) To address the loss of detail in the cascaded CNN-Transformer 3D segmentation model, an algorithm that combines a convolutional neural network and a Transformer in parallel is proposed. The algorithm uses shared convolutions to reduce unnecessary parameters and allows local and global features to interact. Noting that conventional channel attention only uses global average pooling, i.e., the product with the lowest-frequency basis, a 3D DCT is used in place of global average pooling to build a new global attention mechanism that serves as a skip-connection module for global feature extraction. To offset the extra parameters introduced by the parallel structure, tensor ring decomposition is applied to compress the convolutional neural network and the Transformer separately, with the compression hyperparameters computed automatically. Verification shows that the proposed compression modules reduce the parameter count while achieving results comparable to or better than the uncompressed models. The algorithm is evaluated on four datasets, three benchmarks and one clinical dataset, and extensive experiments show good results on all of them. Quantitative analysis shows that the statistical results of the proposed method are consistent with those of clinical experts.

4) The clinical application requirements of a multimodal medical image segmentation system are analyzed, the lightweight parallel method proposed in this dissertation is implemented, and a clinical application system for multimodal medical image segmentation is developed. The system consists of preprocessing, training, segmentation, and post-processing modules. The preprocessing module takes raw medical data as input and outputs multimodal image formats that the segmentation module can recognize. The segmentation module takes the preprocessed medical images as input and outputs the corresponding organ or lesion segmentation results. The post-processing module takes the segmented organ or tumor images as input and refines the segmentation results. In addition, practical clinical applications may require custom medical image datasets to train segmentation models for specific diseases or tasks. The system therefore provides an interactive user interface that lets users conveniently upload and manage their own datasets, specify segmentation tasks, and train task-specific segmentation models on their own medical image data.
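The Tucker-decomposition-based compression of the attention mechanism mentioned in contribution 2) can be pictured, in its simplest matrix form, as replacing a dense projection weight with two factor matrices and a small core. The following PyTorch sketch is only a minimal, hypothetical illustration; the module name TuckerLinear, the ranks, and the 512-dimensional example are assumptions for exposition, not the dissertation's actual design.

import torch
import torch.nn as nn

class TuckerLinear(nn.Module):
    """Low-rank (Tucker-style) stand-in for a dense attention projection:
    a d_in x d_out weight is factored as V @ G @ U, reducing parameters
    from d_in*d_out to d_in*r1 + r1*r2 + r2*d_out."""
    def __init__(self, d_in, d_out, r1, r2):
        super().__init__()
        self.V = nn.Parameter(torch.randn(d_in, r1) * 0.02)   # input-side factor
        self.G = nn.Parameter(torch.randn(r1, r2) * 0.02)     # small core
        self.U = nn.Parameter(torch.randn(r2, d_out) * 0.02)  # output-side factor

    def forward(self, x):              # x: (batch, tokens, d_in)
        return x @ self.V @ self.G @ self.U

# Example: a 512x512 query projection (about 262k parameters) replaced by
# ranks (64, 64), i.e. 512*64 + 64*64 + 64*512, about 70k parameters.
proj = TuckerLinear(512, 512, 64, 64)
x = torch.randn(2, 196, 512)
print(proj(x).shape)                   # torch.Size([2, 196, 512])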

【Abstract】 In recent years, intelligent healthcare has risen globally, built on big data, artificial intelligence, and high-performance computing. Clinical trial data are applied to interdisciplinary, deeply integrated development driven by clinical demands. Medical image segmentation is the basis of intelligent healthcare, playing a crucial clinical role in lesion localization and recognition as well as in developing surgical plans. However, manual annotation of target regions in medical images is a time-consuming and subjective task in clinical practice, posing significant challenges to clinicians performing manual segmentation. It is therefore of great significance to obtain accurate and robust medical image segmentation results automatically. Medical images differ from natural images: data volumes are small, structural similarity within the same field is high, and details vary widely. The target area in segmentation tasks is variable, and the processing is more complex than for natural images, easily influenced by lesions, and accompanied by artifacts, noise, low contrast, edge blurring, and other characteristics. Medical image segmentation also involves uncontrollable factors outside the image, such as equipment accuracy and patient cooperation. With the development of deep learning technology, automatic segmentation algorithms based on convolutional neural networks have shown excellent performance in medical image segmentation. However, they lack expression of global information about the target. In addition, imbalanced tumor and organ categories and differences in spatial and temporal resolution across multi-modal images increase the complexity and variability of the target objects. Furthermore, most deep learning models suffer from excessive parameterization and high complexity. All of these pose challenges to the automatic and accurate segmentation of medical images. To address these issues, this dissertation combines the advantages of tensor decomposition and deep learning to propose precise, robust, and efficient lightweight segmentation algorithms. The main contributions and innovations focus on four aspects.

1) A new encoder-decoder framework, the cross-convolution Transformer (C2former), is put forward to solve the problem that the receptive field of convolutional neural networks is limited and global information is difficult to capture. The framework is designed specifically for medical image segmentation tasks. It uses an attention mechanism with convolutional characteristics to capture local features in both the spatial and channel dimensions, and a sampled self-attention mechanism to capture global features reflecting long-range dependencies between pixels. Window-based self-attention integrates the short- and long-range dependencies, enhancing the model's ability to understand local-global image features. Experiments on three publicly available medical image datasets show that the proposed method significantly improves the accuracy of medical image segmentation.

2) A 3D medical image segmentation model is proposed to address the limitations of 2D segmentation methods in processing 3D images. The model combines convolutional neural networks with a cascaded Transformer structure. It uses the standard Transformer's multi-head attention but increases the relationship between heads through cross-sharing, obtaining more feature information and improving the model's generalization ability. In addition, multi-layer fusion modules perform feature fusion after each layer to adapt to the diverse characteristics of segmentation targets. To reduce the parameter volume of the attention mechanism, a new attention expression is constructed with Tucker decomposition. The model is evaluated on four medical image segmentation datasets, including publicly available datasets and a clinical dataset, covering multiple imaging modalities. The results show that the proposed method achieves good segmentation of heart structures, brain tumors, abdominal organs, and the uterus and uterine tumors, and that the compression module maintains comparable segmentation performance while compressing parameters.

3) A parallel algorithm combining convolutional neural networks and Transformers is proposed to address the loss of detail in 3D medical image segmentation models based on the cascaded CNN-Transformer structure. The algorithm uses shared convolutions to reduce unnecessary parameters and enables interaction between local and global features. To overcome the limitation of traditional channel attention, which uses only global average pooling, i.e., the product with the lowest-frequency basis, a new global attention mechanism is proposed that uses a 3D DCT instead of global average pooling as a skip-connection module for global feature extraction. To offset the additional parameters introduced by the parallel structure, tensor ring decomposition is used to compress the CNN and Transformer structures separately, with the compression hyperparameters computed automatically. The results show that the proposed compression module achieves comparable or better performance while reducing the parameter count. The algorithm is evaluated on four datasets, three benchmarks and one clinical dataset, and achieves good segmentation results on all of them. Quantitative analysis indicates that the statistical results of the proposed method are consistent with those of clinical experts; the algorithm therefore has broad application prospects and provides an effective solution for medical image segmentation tasks.

4) The clinical application requirements of multi-modal medical image segmentation systems are analyzed, and the proposed lightweight parallel method is implemented in a clinical application system for multi-modal medical image segmentation. The system is divided into preprocessing, training, segmentation, and post-processing modules. The preprocessing module takes the original medical data as input and outputs multi-modal image formats that the segmentation module can recognize. The segmentation module takes the preprocessed medical image data as input and outputs the corresponding organ or lesion segmentation results. The post-processing module takes the segmented organ or tumor images as input and refines the segmentation results. In practical clinical applications, customized medical image datasets may be needed to train segmentation models for specific diseases or medical tasks. The system therefore provides an interactive user interface that allows users to easily upload and manage their own datasets, specify segmentation tasks for the data, and train task-specific segmentation models on their own medical image data.
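As a rough illustration of the 3D-DCT channel attention idea in contribution 3), the sketch below replaces the global-average-pooling squeeze of a squeeze-and-excitation style block with a single separable 3D DCT-II component; the frequency-(0, 0, 0) component would reduce to ordinary average pooling up to a scale factor. The chosen frequencies, reduction ratio, and module structure here are illustrative assumptions rather than the dissertation's actual configuration.

import math
import torch
import torch.nn as nn

def dct_basis(length, freq):
    """1-D DCT-II basis vector of a given frequency; freq=0 is the constant
    basis, i.e. plain average pooling up to a scale factor."""
    n = torch.arange(length, dtype=torch.float32)
    return torch.cos(math.pi * freq * (n + 0.5) / length)

class DCTChannelAttention3D(nn.Module):
    """Channel attention whose squeeze step uses a separable 3D DCT
    component instead of global average pooling (illustrative only)."""
    def __init__(self, channels, depth, height, width, freq=(1, 1, 1), reduction=4):
        super().__init__()
        basis = (dct_basis(depth, freq[0]).view(-1, 1, 1)
                 * dct_basis(height, freq[1]).view(1, -1, 1)
                 * dct_basis(width, freq[2]).view(1, 1, -1))
        self.register_buffer("basis", basis)                    # (D, H, W)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                                       # x: (B, C, D, H, W)
        squeezed = (x * self.basis).sum(dim=(2, 3, 4))          # DCT "pooling" -> (B, C)
        weights = self.fc(squeezed).view(x.size(0), x.size(1), 1, 1, 1)
        return x * weights

att = DCTChannelAttention3D(channels=32, depth=8, height=16, width=16)
print(att(torch.randn(1, 32, 8, 16, 16)).shape)                 # torch.Size([1, 32, 8, 16, 16])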

  • 【Online Publication Submitter】 Shandong University
  • 【Online Publication Year/Issue】 2024, No. 03
  • 【CLC Classification Numbers】 TP18; TP391.41; R319