
基于条件随机场的目标提取

Conditional Random Field Based Object Extraction

【Author】 张晓峰

【Supervisor】 吕岳

【Author Information】 East China Normal University, Computer Application Technology, 2012, Ph.D. dissertation

【Abstract】 Object extraction is the process of separating objects of interest from the background. It is an important part of computer vision and a key step in image understanding and recognition. Because of the complexity of objects and the variability of backgrounds, accurate object extraction remains a challenging task. The parts that make up an object are closely related, and exploiting the relations between the current location and its surrounding area can effectively reduce the negative impact that image uncertainty and ambiguity have on object extraction; how to use such contextual information has therefore become a research focus in object extraction. The Conditional Random Field (CRF), developed from the Markov Random Field (MRF), can guide local decisions using not only the relations between adjacent nodes but also the information of the whole observation field, and can thus extract objects more reasonably. This dissertation studies CRF-based object extraction from two aspects: analyzing the properties of objects to propose features suitable for extraction, and improving the CRF-based extraction framework so that it makes fuller use of the relations among object parts. The main contributions are as follows.

A fast CRF model inference method is proposed. Model inference is the process of obtaining the optimal object labeling from a trained CRF model, and its running time grows dramatically with image size. The proposed method first runs inference on a low-resolution image, where the small number of pixels shortens convergence time but yields only a coarse result; it then re-runs inference on the original-resolution image, restricted to the boundary regions indicated by the low-resolution result, to obtain a finer extraction. The method substantially shortens CRF inference time without noticeably reducing extraction accuracy.

An object extraction method that uses a CRF to fuse contour fragments of different scales and orientations is proposed. Contours are among the most discriminative features for separating objects from the background; decomposing a contour into fragments makes the representation more tolerant to deformation, and extending the contour feature to multiple scales allows objects of different sizes to be detected. Candidate fragment locations are selected using the hinge angle, the contour orientation, and the partial Hausdorff distance; the CRF then combines candidate fragments across scales and orientations, exploiting the relations between fragments to select the final contour.

A natural-scene text extraction method based on a CRF with global features is proposed. Candidate character regions are extracted by Toggle Mapping combined with an edge filter, which improves candidate extraction in low-contrast and noisy regions. Because the size, color, and texture of text vary widely, local features alone do not represent text well; the similarity between the current node and the nodes in its neighborhood is therefore used as a global feature, and the CRF links these global features to extract text regions effectively.

A text extraction method for document images based on a two-layer CRF is proposed. Text regions respond strongly to filtering with the real and imaginary parts of Gabor filters. The image is divided into equal-sized grid cells, the histogram of filter responses over each cell's neighborhood is taken as the feature, and a CRF separates text cells from background cells. To refine the result, a two-layer CRF model fuses the classification results of the two kinds of features, further improving the accuracy of text-region extraction.

A handwritten character extraction method that combines a CRF with a Support Vector Machine (SVM) is proposed. First, a dual-threshold binarization method based on Toggle Mapping extracts characters from document images with uneven illumination. The image is then divided into equal-sized grid cells, which avoids directly handling touching handwritten and printed characters, and an edge co-occurrence matrix is extracted from each cell's neighborhood as the feature. Because adjacent cells have similar features, a CRF classification framework assigns the cells to the handwritten or printed class, and an SVM is incorporated into the framework to make the classification more reliable. Finally, post-processing with text-line information yields a more precise and meaningful result.
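To make the coarse-to-fine inference idea concrete, the following sketch (not the thesis implementation) uses a toy binary Potts-style CRF with an intensity-based unary term and iterated conditional modes (ICM) as the inference routine, since the abstract does not specify the energy terms or the optimizer. It infers labels on a downsampled image and then re-runs inference at full resolution only inside a band around the coarse label boundary; all function names and parameters (`factor`, `band`, `beta`) are illustrative.

```python
import numpy as np
from scipy import ndimage

def unary_cost(img, bg_mean=0.3, fg_mean=0.7):
    """Per-pixel cost of labelling a pixel background (0) or foreground (1).
    Assumes a grayscale image scaled to [0, 1]."""
    return np.stack([(img - bg_mean) ** 2, (img - fg_mean) ** 2], axis=-1)

def icm(img, labels, mask=None, beta=0.5, iters=10):
    """Iterated conditional modes on a 4-connected Potts model.
    Only pixels where `mask` is True are allowed to change."""
    cost = unary_cost(img)
    labels = labels.copy()
    if mask is None:
        mask = np.ones(img.shape, dtype=bool)
    for _ in range(iters):
        totals = []
        for lab in (0, 1):
            # pairwise term: how many 4-neighbours disagree with label `lab`
            # (np.roll wraps at the borders, which is acceptable for a sketch)
            disagree = sum((np.roll(labels, s, axis=a) != lab).astype(float)
                           for a in (0, 1) for s in (-1, 1))
            totals.append(cost[..., lab] + beta * disagree)
        best = np.argmin(np.stack(totals, axis=-1), axis=-1)
        labels[mask] = best[mask]
    return labels

def coarse_to_fine(img, factor=4, band=3):
    """Infer at low resolution first, then refine only near label boundaries."""
    small = ndimage.zoom(img, 1.0 / factor, order=1)
    coarse = icm(small, (small > small.mean()).astype(int))
    # bring the coarse labels back to full resolution (nearest neighbour)
    full = ndimage.zoom(coarse, factor, order=0)[:img.shape[0], :img.shape[1]]
    full = np.pad(full, [(0, img.shape[0] - full.shape[0]),
                         (0, img.shape[1] - full.shape[1])], mode='edge')
    # refine only inside a thin band around the coarse object boundary
    boundary = ndimage.morphological_gradient(full, size=3) > 0
    band_mask = ndimage.binary_dilation(boundary, iterations=band)
    return icm(img, full, mask=band_mask)
```

The same two-pass scheme would apply to other pairwise-CRF optimizers; only the label field and the refinement mask change.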
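The fragment-matching step relies on the partial Hausdorff distance to score candidate contour positions. The sketch below shows only that distance, assuming fragments are given as N×2 arrays of point coordinates; the hinge-angle and orientation tests, the rank fraction, and the threshold `tau` are not taken from the thesis and are illustrative.

```python
import numpy as np

def partial_hausdorff(A, B, frac=0.8):
    """Directed partial Hausdorff distance from point set A to point set B.

    A and B are (N, 2) arrays of contour points. Instead of the maximum
    nearest-neighbour distance, the `frac`-quantile is used, so a minority
    of unmatched points (occlusion, clutter) does not dominate the score.
    """
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))  # |A| x |B|
    nearest = d.min(axis=1)                 # each A point to its closest B point
    k = max(int(frac * len(nearest)) - 1, 0)
    return np.partition(nearest, k)[k]      # k-th ranked distance, not the max

# A candidate position is kept when the template fragment lies close enough
# to the edge points found in the image window at that position, e.g.:
# keep = partial_hausdorff(template_pts, window_edge_pts, frac=0.8) < tau
```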
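The candidate-extraction step of the natural-scene method builds on Toggle Mapping. The following is a minimal sketch of plain grayscale toggle mapping with a low-contrast "undecided" class; the edge filter the thesis combines with it is omitted, and `size` and `min_contrast` are assumed parameters.

```python
import numpy as np
from scipy import ndimage

def toggle_mapping(img, size=3, min_contrast=10):
    """Snap each pixel to the closer of its local erosion / dilation.

    Returns the mapped image and a per-pixel label:
      2 - closer to the dilation (bright side),
      1 - closer to the erosion  (dark side),
      0 - local contrast too low to decide (homogeneous region).
    """
    img = img.astype(float)
    ero = ndimage.grey_erosion(img, size=size)
    dil = ndimage.grey_dilation(img, size=size)
    closer_to_dil = (dil - img) <= (img - ero)
    mapped = np.where(closer_to_dil, dil, ero)
    labels = np.where(closer_to_dil, 2, 1)
    labels[(dil - ero) < min_contrast] = 0   # low-contrast pixels left undecided
    return mapped, labels
```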
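For the two-layer CRF method, the per-node features are histograms of Gabor responses over grid-cell neighborhoods. The sketch below builds the real and imaginary parts of a Gabor kernel and the per-cell histograms under assumed parameter values (kernel size, wavelength, cell size); the particular filter bank and the two-layer CRF itself are not reproduced here.

```python
import numpy as np
from scipy import ndimage

def gabor_kernels(ksize=15, sigma=3.0, theta=0.0, wavelength=8.0):
    """Real and imaginary parts of a 2-D Gabor kernel at orientation `theta`."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    phase = 2.0 * np.pi * xr / wavelength
    return envelope * np.cos(phase), envelope * np.sin(phase)

def grid_features(img, cell=16, bins=8, theta=0.0):
    """Histogram of Gabor response magnitudes around each grid cell."""
    real_k, imag_k = gabor_kernels(theta=theta)
    resp = np.hypot(ndimage.convolve(img.astype(float), real_k),
                    ndimage.convolve(img.astype(float), imag_k))
    top = resp.max() + 1e-9
    feats = []
    for r in range(0, resp.shape[0] - cell + 1, cell):
        for c in range(0, resp.shape[1] - cell + 1, cell):
            # take the cell plus a one-cell border as its neighbourhood
            patch = resp[max(r - cell, 0):r + 2 * cell,
                         max(c - cell, 0):c + 2 * cell]
            hist, _ = np.histogram(patch, bins=bins, range=(0.0, top))
            feats.append(hist / max(hist.sum(), 1))
    return np.array(feats)   # one feature vector per grid cell (CRF node)
```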
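The handwritten/printed classifier uses an edge co-occurrence matrix per grid neighborhood as its feature. The sketch below quantizes gradient orientations into four bins and counts horizontally adjacent orientation pairs; the actual displacement set, bin count, and edge detector in the thesis may differ, so treat these choices as assumptions.

```python
import numpy as np
from scipy import ndimage

def edge_cooccurrence(patch, n_bins=4, mag_thresh=20.0):
    """Edge co-occurrence matrix of one grid neighbourhood.

    Gradient orientations are quantised into `n_bins` bins (weak gradients are
    treated as non-edges) and the matrix counts how often orientation i occurs
    immediately to the left of orientation j. The normalised matrix, flattened,
    is the per-cell feature vector.
    """
    patch = patch.astype(float)
    gy = ndimage.sobel(patch, axis=0)
    gx = ndimage.sobel(patch, axis=1)
    mag = np.hypot(gx, gy)
    ori = ((np.arctan2(gy, gx) % np.pi) / np.pi * n_bins).astype(int)
    ori = np.clip(ori, 0, n_bins - 1)
    ori[mag < mag_thresh] = -1                      # non-edge pixels are ignored
    a, b = ori[:, :-1].ravel(), ori[:, 1:].ravel()  # horizontally adjacent pairs
    keep = (a >= 0) & (b >= 0)
    mat = np.zeros((n_bins, n_bins))
    np.add.at(mat, (a[keep], b[keep]), 1)
    return (mat / max(mat.sum(), 1.0)).ravel()
```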

  • 【CLC Number】TP391.41
  • 【Cited By】11
  • 【Downloads】1669