Research on Key Issues and Methods in Image Understanding
【Author】 Xie Zhao (谢昭)
【Supervisor】 Gao Jun (高隽)
【Author information】 Hefei University of Technology, Computer Application Technology, 2007, Ph.D.
【Abstract】 Image understanding is an active and difficult topic in computer science. Its fundamental task is to have a computer correctly interpret a perceived image scene and the contents of that scene. It is closely related to computer vision and artificial intelligence, and it has both theoretical significance and broad application prospects.

Image understanding is distinctly hierarchical. Its low-level data are visual information, whose theoretical starting point is computer vision; its high-level data are knowledge, whose theoretical basis is artificial intelligence. These two information flows, visual data and human knowledge, run through the whole process of image understanding. Current research nevertheless treats them largely in isolation, neglecting both the fusion of knowledge with data and the connection between low-level processing and high-level analysis.

This thesis starts from the core problem of image understanding, data driving knowledge and knowledge guiding data, and works at the interface between visual information processing and knowledge processing. It focuses on the representation, storage, analysis, and transformation of data and knowledge in the information flow of image understanding, and studies suitable carriers for visual information processing and suitable methods for knowledge processing. The aim is to accomplish the main tasks of image understanding, namely generic object detection and recognition, regional semantic understanding, and scene analysis, and thereby to form a novel approach to image understanding. The thesis also studies the structural properties of image understanding: it builds a new model of the spatial relations among objects and an analysis model of the whole scene, and establishes constraint and feedback mechanisms between the models. These mechanisms reflect the feedback-driven and progressive nature of understanding, guide the acquisition of prior information, and act on low-level visual data processing to improve the speed and accuracy of understanding. Together they form an initial prototype for image understanding that is intended to be novel, complete, effective, and fast.

The main contributions are as follows:

1. Fusion of data and information representation. The thesis surveys the common information representations used in image understanding, with emphasis on new means of fusing and converting between "knowledge" and "data", so as to reflect the cognitive relations among entities. It also studies the extraction of visual information: it summarizes feature extraction strategies, builds a statistical probability model of visual pixels, and on that basis proposes a new object localization method that has some robustness to background interference and complements existing feature extraction methods.

2. Storage and analysis of visual information. To address the use of graph-structured models as carriers for analysis, the thesis summarizes the classical parameter estimation and probabilistic inference methods for graphical models and their applications in visual analysis. It proposes an undirected graphical model based on the spatial relations among objects, discusses parameter learning in the new model, derives the iterative update formulas, and applies the model to the analysis of objects in scenes, enriching and improving the cognitive carriers of image understanding.

3. Conceptual partitioning of visual information. For generic object detection and recognition, the thesis proposes a hierarchical Boosting method based on shared features that detects and recognizes multiple object classes simultaneously. With the detection rate kept approximately unchanged, the recognition rate increases and the classification search time decreases. This reflects the progressive nature of image understanding and realizes the conversion from visual information to knowledge.

4. Knowledge processing and analysis. For region analysis and semantic labeling, the thesis proposes a region segmentation method and a knowledge base reduction method, both based on rough sets. The segmentation performs well on regions whose visual attributes are fairly consistent, and the reduction shrinks the knowledge base while preserving its ability to classify concepts. This avoids interference from noisy data to some extent, makes semantic labeling and region analysis more reasonable, and realizes the fusion of data and knowledge.

5. Scene classification. The thesis makes a preliminary study of basic scene classification methods and proposes a Gaussian probabilistic statistical model that shows some effectiveness for scene classification. It also verifies that scene classification information guides and constrains object analysis, which improves the accuracy of object analysis and reflects the feedback structure of cognition in image understanding.
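The abstract does not say how the knowledge base reduction in contribution 4 is computed. As a minimal sketch of the general rough-set idea only, the code below finds an attribute reduct of a toy decision table by backward elimination: an attribute is dropped when the positive region (the rows whose equivalence class carries a single label) stays the same without it. The attribute names (`color`, `texture`, `position`), the labels, and the function names are hypothetical, and the thesis's own procedure may differ.

```python
from collections import defaultdict


def positive_region(rows, attrs, decision):
    """Indices of rows whose equivalence class under `attrs` has one decision value."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def backward_reduct(rows, attrs, decision):
    """Drop each attribute in turn if the positive region is unchanged without it."""
    full = positive_region(rows, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if positive_region(rows, trial, decision) == full:
            kept = trial
    return kept


# Hypothetical region descriptors with a semantic label (illustrative data only).
rows = [
    {"color": "blue", "texture": "smooth", "position": "top", "label": "sky"},
    {"color": "blue", "texture": "smooth", "position": "bottom", "label": "water"},
    {"color": "green", "texture": "rough", "position": "bottom", "label": "grass"},
    {"color": "green", "texture": "rough", "position": "top", "label": "tree"},
    {"color": "blue", "texture": "smooth", "position": "top", "label": "sky"},
]
attrs = ["color", "texture", "position"]

reduct = backward_reduct(rows, attrs, "label")
print(reduct)  # prints ['texture', 'position']
```

In this toy table `color` is redundant because `texture` and `position` already separate every label, so classification ability is preserved after it is removed. The result of backward elimination depends on the order in which attributes are tried and is not guaranteed to be the smallest possible reduct.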
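Contribution 5 refers to a Gaussian probabilistic statistical model for scene classification but gives neither the features nor the model form. The sketch below is therefore only one simple reading, under stated assumptions: each scene class is modeled by a Gaussian with diagonal covariance over a feature vector, and an image is assigned to the class with the highest log-likelihood, with equal class priors. The two-dimensional features, the class names `coast` and `forest`, and all function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_gaussians(train, eps=1e-6):
    """Estimate per-class means and variances (diagonal Gaussian) from feature rows."""
    models = {}
    for label, rows in train.items():
        columns = list(zip(*rows))  # one tuple per feature dimension
        means = [fmean(c) for c in columns]
        variances = [pvariance(c) + eps for c in columns]  # eps avoids zero variance
        models[label] = (means, variances)
    return models


def log_likelihood(x, means, variances):
    """Log density of a diagonal Gaussian evaluated at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def classify(x, models):
    """Return the class whose Gaussian gives x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, *models[label]))


# Hypothetical features per image: (fraction of blue pixels, fraction of green pixels).
train = {
    "coast": [(0.80, 0.20), (0.70, 0.30), (0.90, 0.10), (0.75, 0.25)],
    "forest": [(0.20, 0.80), (0.30, 0.70), (0.10, 0.90), (0.25, 0.75)],
}
models = fit_gaussians(train)

print(classify((0.85, 0.20), models))  # prints coast
print(classify((0.15, 0.80), models))  # prints forest
```

The abstract also states that the scene label guides and constrains object analysis. One common way to express that, not confirmed by the abstract, is to treat the predicted scene as a prior that reweights object hypotheses.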