Student Classroom Behavior Detection Based on Video Spatio-Temporal Features

【Authors】 LIU Hong; ZHANG Shiwen; SUN Cheng; NIE Xuan; ZHANG Jin

【Corresponding author】 ZHANG Jin

【Affiliations】 College of Information Science and Engineering, Hunan Normal University; School of Computer and Communication Engineering, Changsha University of Science & Technology

【Abstract】 Classroom behavior effectively reflects students' learning status, and detecting it with deep learning is valuable for improving teaching methods and teaching quality. Current classroom behavior detection mostly relies on static images, which tends to ignore the dynamic characteristics of behavior and performs poorly on continuous behaviors. To address this, a YOLOv7-SlowFast classroom behavior detection method based on video spatio-temporal features is proposed: YOLOv7 first locates the students, and SlowFast then detects their classroom behavior. First, to improve the detection accuracy of YOLOv7 in crowded scenes, an adaptive spatial feature fusion module is introduced to resolve the inconsistency between features at different scales. Second, the lightweight RepGhost module is used to improve the YOLOv7 network structure, raising detection speed through structural re-parameterization. Finally, to address the low accuracy of SlowFast in spatio-temporal behavior detection, a normalization-based temporal attention module is designed to strengthen the model's perception of temporal features. Experimental results show that the improved YOLOv7 achieves a mean average precision (mAP) of 86.96% on the CrowdHuman dataset, and the improved SlowFast achieves an mAP of 87.28% on a self-built classroom behavior detection dataset, enabling effective detection of classroom behavior.
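The two-stage flow described in the abstract (locate students first, then classify each student's behavior over a video clip) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `person_detector` and `action_classifier` are hypothetical callables standing in for the improved YOLOv7 and SlowFast models, and taking the middle frame of the clip as the key frame is an assumption that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Frame = object  # placeholder for an image array; the actual type is not given in the source


@dataclass
class BehaviorDetection:
    box: Box      # where the student is
    label: str    # predicted behavior for that student
    score: float  # classifier confidence


def detect_behaviors(
    clip: Sequence[Frame],
    person_detector: Callable[[Frame], List[Box]],
    action_classifier: Callable[[Sequence[Frame], Box], Tuple[str, float]],
) -> List[BehaviorDetection]:
    """Stage 1: find people on a key frame. Stage 2: classify each person over the clip."""
    if not clip:
        return []
    # Assumption: use the middle frame of the clip as the key frame for localization.
    key_frame = clip[len(clip) // 2]
    results = []
    for box in person_detector(key_frame):
        # The classifier sees the whole clip, so it can use temporal cues
        # that a single static image would miss.
        label, score = action_classifier(clip, box)
        results.append(BehaviorDetection(box, label, score))
    return results
```

Passing the models in as callables keeps the sketch independent of any particular detection or video library; in practice each callable would wrap a trained model and its pre-processing.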

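【Funding】 Pre-research Project on Military-wide Common Information System Equipment, Equipment Development Department of the Central Military Commission (31511010402)
  • 【Source】 Software Guide (软件导刊), 2024, Issue 08
  • 【Classification No.】 TP391.41
  • 【Downloads】 73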