基于GPU的人脸定位算法研发与优化

Development of Face Detection and Alignment Algorithm Based on GPU

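【Author】 Zhang Miao

【Advisor】 Tian Xiang

【Author information】 Zhejiang University, Instrument and Meter Engineering (professional degree), 2016, Master's degree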

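【Summary】 With the development of face recognition technology in recent years, more and more computer vision techniques have been applied to face research. Face localization is the foundation of face recognition and comprises two tasks: face detection and facial key-point localization. This thesis develops a joint face localization algorithm that uses shape-indexed features and an improved decision tree to perform face detection and facial key-point localization simultaneously, and it optimizes the algorithm for GPU parallel computing with CUDA. The main work is the development of the joint face localization algorithm: classification and regression trees are built on shape-indexed features, and the classifiers are trained with Adaboost and a cascade strategy. In the implementation, to strengthen the representational power of the shape-indexed features, the thesis extends the original grayscale image channel with HSV and gradient images and extracts multi-channel shape-indexed features. For training, the computational load and the feasibility of parallelization are analyzed, and the construction of the classification and regression trees is parallelized on the GPU with CUDA. For detection, the algorithm flow is analyzed further for parallelization feasibility, and parallel computation over detection sub-windows is implemented. The algorithm was trained on the Helen dataset and tested on the AFW (The Annotated Faces in the Wild) dataset, achieving a detection rate of 90% and a mean localization error of about 3.92 with 20 key points. After CUDA optimization, training a classifier with 2,000 training samples and 1,000 features takes 100 s instead of 258 s. For a VGA image, the detection frame rate rises from 2 fps to 34 fps, which meets the requirement for real-time applications.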

【Abstract】 With the development of face recognition technology in recent years, computer vision techniques have been applied increasingly widely in this field. Face localization is the basis of face recognition and consists of face detection and face alignment. This thesis develops a joint face localization algorithm that uses shape-indexed features and a modified decision tree to perform face detection and face alignment at the same time, and it accelerates the algorithm with GPU parallel computing using CUDA. The work focuses on three aspects. 1) Development of the joint face localization algorithm, including the construction of classification and regression trees on shape-indexed features and the training of classifiers using Adaboost and a cascade. In the implementation, to strengthen the representational power of the pixel-based shape-indexed features, the thesis extends the original grayscale image channel with HSV and gradient images and extracts shape-indexed features from multiple channels. 2) For training, the thesis analyzes the computational cost and the feasibility of parallelization, and uses CUDA to parallelize the construction of the classification and regression trees on the GPU. 3) For detection, the thesis analyzes parallelization feasibility further and implements parallel computation over detection sub-windows. On the AFW dataset the algorithm reaches a 90% detection rate and a mean localization error of about 3.92 with 20 key points. Detection on a VGA image runs at 34 fps, 17 times faster than on the CPU. For 2,000 training samples with 1,000 features, the time to construct the classification and regression trees is reduced from 258 seconds to 100 seconds.
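The abstract does not define the shape-indexed feature precisely. As a rough illustration only, the Python sketch below assumes the commonly used pixel-difference form: two probe points are placed at offsets relative to landmarks of the current shape estimate, and the feature is the difference of the values read at those points in one selected image channel (for example grayscale, H, S, V, or gradient). All names here (`sample`, `shape_indexed_feature`, the probe tuple layout, the `scale` parameter) are hypothetical and are not taken from the thesis.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Channel = Sequence[Sequence[float]]   # row-major grid, read as channel[y][x]
Probe = Tuple[int, float, float]      # (landmark index, dx, dy), hypothetical layout


def sample(channel: Channel, x: float, y: float) -> float:
    """Read the nearest pixel, clamping coordinates to the image border."""
    height, width = len(channel), len(channel[0])
    col = min(max(int(round(x)), 0), width - 1)
    row = min(max(int(round(y)), 0), height - 1)
    return channel[row][col]


def shape_indexed_feature(
    channels: Sequence[Channel],
    shape: Sequence[Point],
    scale: float,
    channel_id: int,
    probe_a: Probe,
    probe_b: Probe,
) -> float:
    """Pixel difference between two probes anchored to shape landmarks.

    Assumption: offsets are scaled by the sub-window size so the feature
    follows the face as the shape estimate moves or changes size.
    """
    channel = channels[channel_id]
    values = []
    for landmark, dx, dy in (probe_a, probe_b):
        lx, ly = shape[landmark]
        values.append(sample(channel, lx + scale * dx, ly + scale * dy))
    return values[0] - values[1]


# Toy 4x4 single-channel image whose pixel value encodes its position.
gray = [[10.0 * y + x for x in range(4)] for y in range(4)]
shape = [(1.0, 1.0), (2.0, 2.0)]      # two landmarks as (x, y)
feature = shape_indexed_feature(
    [gray], shape, 1.0, 0, (0, 1.0, 0.0), (1, 0.0, 1.0)
)
```

Under these assumptions the feature for one sub-window depends only on that sub-window's shape estimate and the shared image channels, which is consistent with the per-sub-window parallelization the abstract describes for detection.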

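  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year/issue】 2016, Issue 08
  • 【Classification number】 TP391.41
  • 【Citation count】 7
  • 【Download count】 325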