SFW-YOLOv8 Complex Scene Video Vehicle Detection Model


【Authors】 Zhu Qin; Han Shenyang; Zeng Mingru; Lai Pinghong; Wu Chuimao; Hu Weiyi

【Corresponding author】 Zeng Mingru

【Affiliations】 School of Public Policy and Management, Nanchang University; School of Information Engineering, Nanchang University; Jiangxi Provincial People's Hospital

【Abstract】 Video vehicle detection models struggle to extract rich target features in complex traffic surveillance scenes. To make full use of the spatio-temporal feature information in video images, this paper proposes a spatio-temporal feature fusion module, SF-Module, which uses the multi-head self-attention mechanism of the Transformer to extract and fuse spatio-temporal features from the current frame and historical frames of the vehicle video, enriching the feature information of the target. Building on the YOLOv8 network, the SF-Module is integrated into the neck network to mine the spatio-temporal feature information of video image sequences. In addition, the WIoU loss function is adopted as the bounding-box regression loss to reduce the harmful gradients produced by low-quality annotated boxes, yielding the SFW-YOLOv8 video vehicle detection model. The model is evaluated on the UA-DETRAC dataset, with part of the images augmented with simulated rain and fog to improve the generalization of the detector. Experimental results show that SFW-YOLOv8 achieves an mAP50 of 79.1% and an mAP50:5:95 of 63.6%, which are 1.7% and 3.3% higher than those of YOLOv8, respectively, at an inference speed of 11 ms/frame, indicating strong detection performance.
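The abstract does not state which WIoU variant the model uses, so the following is only a minimal sketch of the simplest published form (commonly called WIoU v1), written in plain Python for a single box pair. The function names `iou` and `wiou_v1`, the `(x1, y1, x2, y2)` box layout, and the `eps` guard are illustrative assumptions, not the authors' implementation. The dynamic focusing mechanism of later WIoU versions, which is what down-weights low-quality annotated boxes, is omitted here.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target, eps=1e-9):
    """WIoU v1 sketch: distance-based attention factor times the IoU loss.

    Assumption: boxes are (x1, y1, x2, y2). In an autograd framework the
    enclosing-box term in the denominator is normally detached from the
    gradient; plain Python has no gradient, so that step does not appear.
    """
    # Centre points of the predicted and target boxes.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    # Attention factor grows with the normalised centre distance.
    dist2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    r_wiou = math.exp(dist2 / (wg ** 2 + hg ** 2 + eps))
    return r_wiou * (1.0 - iou(pred, target))
```

For identical boxes the loss is zero, and for a fixed IoU it increases as the box centres move apart, so the factor never reduces the plain IoU loss.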

【Funding】 Supported by the National Natural Science Foundation of China (72164027)
  • 【Source】 Automotive Engineering (汽车工程), 2024, Issue 12
  • 【Classification codes】 TP391.41; TP183; U495
  • 【Downloads】 295