Research on 3D Occluded Object Detection Method Based on Point Cloud Completion

【Author】 Yu Chao

【Supervisor】 Zhou Jing

【Author Information】 Jianghan University, Computer Technology, 2023, Master's

【Abstract】 3D object detection based on LiDAR point clouds is one of the core technologies of autonomous driving perception systems. However, occlusion and signal loss in LiDAR can leave point cloud shapes incomplete, which degrades detection accuracy. To address this, this thesis draws on existing point cloud completion algorithms and, because the categories of existing completion datasets do not correspond to those of the KITTI dataset, builds an occluded object completion subnetwork based on residual prediction and integrates it into a 3D object detection network. The detection network is then optimized to further improve detection accuracy. The research covers three aspects:

(1) Existing point cloud completion networks are limited in the categories they can complete and are time-consuming when completing occluded objects. This thesis therefore proposes an occluded object completion subnetwork based on residual prediction. Unlike existing completion networks, the subnetwork does not directly predict point coordinates; it predicts the residual between voxel features and voxel centers. It also requires no additional completion dataset for training: object point clouds are extracted from the KITTI dataset, then matched and fused by a dedicated algorithm to obtain completion labels. Experiments show that the subnetwork effectively recovers the missing parts of occluded object shapes without excessively increasing the network's inference time.

(2) To address the occlusion problem in 3D object detection, this thesis proposes an occluded object detection network based on point cloud completion. Voxel RCNN serves as the baseline model, and the occluded object completion subnetwork is integrated into it to restore the complete shape of occluded objects. In experiments on the KITTI dataset, detection accuracy for Car objects of moderate difficulty improves by 0.84%. Objects at different occlusion levels are also evaluated: for Car objects at occlusion levels 1 and 2, detection accuracy improves by 0.46% and 0.18%, respectively. These results show that the proposed algorithm effectively improves the detection accuracy of occluded objects.

(3) Because completion changes the foreground-background ratio of the voxels, this thesis further optimizes the network above. First, dynamic voxelization replaces the voxelization method of the baseline model, and an attention module is introduced into the backbone network to strengthen the network's focus on target objects. Second, anchor boxes are abandoned in favor of a center-based proposal generation strategy, and the refinement module makes further use of the output of the completion subnetwork. In experiments on the KITTI dataset, the average precision for the easy and moderate difficulty levels of the Car category improves by 0.11% and 0.99%, respectively, and the average precision for the three difficulty levels is 89.52, 85.51, and 78.87. For the three occlusion levels of the Car category, the average precision improves by 0.19%, 0.85%, and 0.23%, respectively, reaching 87.50, 84.37, and 78.73.

LiDAR occlusion becomes more severe in bad weather such as rain, snow, and fog. By completing the shapes of occluded objects, the proposed algorithm increases the robustness of the detection network under such conditions and improves the stability of the autonomous driving perception system.
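The abstract states that the completion subnetwork predicts residuals relative to voxel centers instead of raw point coordinates, and that the optimized network uses dynamic voxelization. The following is a minimal NumPy sketch of those two ideas, not the thesis implementation. It assumes that the "voxel feature" is the mean of the points inside a voxel, and every function name, the voxel size, and the range origin `pc_min` are hypothetical choices made for illustration only.

```python
import numpy as np


def dynamic_voxelize(points, voxel_size, pc_min):
    """Assign every point to its voxel without capping points per voxel.

    Returns the occupied voxel indices (V, 3) and, as an assumed voxel
    feature, the mean of the points falling inside each voxel (V, 3).
    """
    coords = np.floor((points - pc_min) / voxel_size).astype(np.int64)
    # Unique occupied voxels, plus the voxel index of every input point.
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; the shape varies across NumPy versions
    sums = np.zeros((len(voxels), 3))
    np.add.at(sums, inverse, points)  # unbuffered scatter-add per voxel
    counts = np.bincount(inverse, minlength=len(voxels))
    return voxels, sums / counts[:, None]


def voxel_centers(voxels, voxel_size, pc_min):
    """Geometric center of each voxel in metric coordinates."""
    return (voxels + 0.5) * voxel_size + pc_min


def encode_residual(features, centers):
    """Regression target: offset of the voxel feature from the voxel center."""
    return features - centers


def decode_residual(residuals, centers):
    """Recover a point position from a (predicted) residual."""
    return centers + residuals


# Hypothetical toy input: three points, 0.5 m cubic voxels, origin at zero.
points = np.array([[0.1, 0.1, 0.1], [0.3, 0.1, 0.1], [1.2, 0.0, 0.4]])
voxel_size = np.array([0.5, 0.5, 0.5])
pc_min = np.zeros(3)

voxels, features = dynamic_voxelize(points, voxel_size, pc_min)
centers = voxel_centers(voxels, voxel_size, pc_min)
residuals = encode_residual(features, centers)
recovered = decode_residual(residuals, centers)
```

Under this assumption each residual component is bounded by half the voxel size, so the regression target has a small, fixed range regardless of where the object sits in the scene. That is one plausible reason to prefer residuals over absolute coordinates; the abstract itself does not state the motivation.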

  • 【Online Publication Contributor】 Jianghan University
  • 【Online Publication Issue】 2024, Issue 04
  • 【Classification Number】 TP391.41