Research on Graphical Models in Intelligent Data Processing
【Author】 Wang Shuangcheng (王双成)
【Supervisor】 Yuan Senmiao (苑森淼)
【Author Information】 Jilin University, Communication and Information Systems, 2004, PhD dissertation
【Abstract (translated from the Chinese)】 Graphical models combine probability theory with graph theory and give a graphical representation of the dependence relationships among random variables: nodes represent random variables, and the presence or absence of edges (directed or undirected) expresses the conditional independence between them. With an intuitive form of knowledge representation and a style of reasoning close to human thinking, they are widely used in expert systems, decision analysis, pattern recognition, machine learning, and data mining, and have been one of the active research topics in intelligent data processing at home and abroad in recent years. A graphical model consists of two parts, the structure (a graph) and the parameters (conditional or marginal probability distributions), which describe the dependences among random variables qualitatively and quantitatively, respectively. Graphical-model research covers many topics; its core parts are Bayesian networks (directed acyclic graphs), Markov networks (undirected graphs), and chain graphs (mixed graphs with both directed and undirected edges). This dissertation mainly studies Bayesian networks and Markov networks, gives a brief introduction to chain graphs, and emphasizes applications of graphical models to intelligent data processing, that is, how to turn data into knowledge (graphical-model learning) and knowledge into intelligence (graphical-model-based inference). The concrete research topics are as follows.

1. Graphical-model learning with complete data and discrete variables. Representative methods and algorithms are surveyed and analyzed. Two structure-learning approaches are established: a Bayesian-network method based on dependency analysis with causal semantics for edge orientation, and a Bayesian-network and Markov-network method based on the basic dependence relations between variables, basic structures, and dependency analysis. Both avoid the exponential complexity and locally optimal structures of existing score-and-search methods, as well as the many high-order conditional-probability computations and the edge-orientation limitations of dependency-analysis methods. Two methods for evaluating the accuracy of Bayesian-network learning algorithms are also introduced.

2. Graphical-model learning with incomplete data and discrete variables. Incomplete (missing) data are ubiquitous, and missing data make it impossible to learn a graphical model directly, so learning with missing data has long been an important and difficult research problem. Current approaches mainly combine the EM algorithm (or gradient-based optimization) with score-and-search; they are inefficient and easily trapped in locally optimal structures. This dissertation proposes a new learning method that combines the graphical model with Gibbs sampling: randomly initialized missing values are corrected iteratively while the graphical model is optimized and adjusted. Because the Gibbs-sampling process converges to the global stationary distribution, the local-optimum and deceptive-convergence problems brought by the EM algorithm (or gradient-based optimization) are avoided. In each iteration, decomposing the joint probability on the graphical model markedly improves sampling efficiency, and through optimization and adjustment the model gradually approaches the model of the stationary distribution, until the stopping condition ends the iteration. Three incomplete-data situations are studied: (1) randomly missing data: every column contains some randomly missing values, and the dimension (value range) of each variable and some cases are available; (2) missing data for hidden (or clustering) variables: the column of the hidden (or clustering) variable is missing entirely, and neither its dimension nor its cases are available; (3) missing data in small sample sets: many rows are missing entirely (unobserved), and the dimensions of all variables and some cases are available. On the basis of an analysis of the existing methods and algorithms for these three situations, new methods and algorithms are established for the remaining problems, with the necessary theoretical arguments and comparative experiments. Graphical-model learning with continuous variables can also be cast as an incomplete-data problem and learned iteratively: in each iteration, the continuous variables are discretized with a mixed-data clustering method, and the graphical model is optimized and adjusted on the new discrete variables until convergence.

3. Incremental graphical-model learning. Assimilation and accommodation are the two basic mechanisms by which humans learn new knowledge, and human learning can be viewed as continual assimilation and accommodation. Based on these mechanisms, and on the fact that the structure and the parameters of a graphical model do not change in step, a new incremental learning method is established. It first tests how well the current structure fits the data set to decide whether structural adjustment is needed; if so, the structure is adjusted adaptively and the parameters are then adjusted on the new structure, otherwise only the parameters are adjusted on the original structure, yielding the new graphical model. This process matches the basic mechanisms of human learning, effectively captures the dynamic changes of structure and parameters, and needs neither the stationarity assumption nor the Markov assumption of existing methods.

4. Foundations of graphical models and graphical-model-based inference. The foundations are surveyed from three aspects: conditional independence among random variables in a probability model, d-separation (or s-separation) among nodes in a graphical model, and the connection between the two. For d-separation, the core concept of Bayesian-network theory, a definition in non-negated form is given (the original definition is stated negatively and is hard to understand), and two Bayesian-network models that help in understanding the d-separation criterion (the information-pipeline model and the small-ball model) are introduced. Graphical-model-based inference is expounded and analyzed systematically from the aspects of probabilistic inference, evidence propagation, and causal analysis, with illustrative examples.

5. Graphical-model classifiers. On the basis of the learning methods above, a learning method for class-constrained graphical-model classifiers and learning and optimization methods for general graphical-model classifiers are established, and the optimality of graphical-model classifiers under 0-1 loss is proved. Common methods for estimating classification accuracy and for comparing the accuracy of different classifiers are also introduced.

6. Graphical-model-based feature subset selection. Feature subset selection is the process of excluding as many irrelevant and redundant features as possible so as to optimize classifier performance, and is … in machine learning, …
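The d-separation criterion discussed in the abstract can be checked mechanically on small graphs: a path between two nodes is blocked by a conditioning set Z if it contains a chain or fork node that lies in Z, or a collider such that neither the collider nor any of its descendants lies in Z. The following is a minimal illustrative sketch of that check; the graph encoding and function names are this editor's assumptions, not the dissertation's formulation.

```python
def descendants(dag, v):
    """All descendants of v in a DAG given as {node: [children]}."""
    seen, stack = set(), [v]
    while stack:
        for child in dag.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def d_separated(dag, x, y, z):
    """True if every undirected path between x and y is blocked by the set z."""
    parents = {}
    for u, children in dag.items():
        for c in children:
            parents.setdefault(c, set()).add(u)
    nodes = set(dag) | set(parents)
    adj = {n: set() for n in nodes}
    for u, children in dag.items():
        for c in children:
            adj[u].add(c)
            adj[c].add(u)
    z = set(z)

    def blocked(path):
        # A path is blocked if some interior node blocks it.
        for i in range(1, len(path) - 1):
            a, b, c = path[i - 1], path[i], path[i + 1]
            if a in parents.get(b, set()) and c in parents.get(b, set()):
                # b is a collider: active only if b or a descendant of b is in z
                if not (({b} | descendants(dag, b)) & z):
                    return True
            elif b in z:
                # chain or fork node observed: path blocked
                return True
        return False

    def active_path_exists(path):
        if path[-1] == y:
            return not blocked(path)
        return any(active_path_exists(path + [n])
                   for n in adj[path[-1]] if n not in path)

    return not active_path_exists([x])
```

On the classic v-structure A → C ← B, for example, A and B are d-separated by the empty set but become d-connected once C (or a descendant of C) is observed, which is exactly the behavior the information-pipeline and small-ball models are meant to convey. Enumerating all simple paths is exponential in general; it is only meant to make the criterion concrete on small examples.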
【Abstract】 A graphical model is a synergy of probability theory and graph theory: a graphical representation of knowledge in which the nodes are random variables and the edges (directed or undirected) specify the conditional-independence assumptions among these variables. It has become a powerful tool for representing knowledge and reasoning under uncertainty in a way akin to human intelligence, and has been used extensively in expert systems, decision analysis, pattern recognition, machine learning, and data mining. It is a rapidly growing research field that has seen a great deal of activity in recent years. A graphical model consists of a structure (the graph) and parameters (conditional or marginal probability distributions): the structure describes the dependence relationships between variables qualitatively, and the parameters describe them quantitatively. Among the many research topics in graphical models, some important ones are the Bayesian network (directed acyclic graph), the Markov network (undirected graph), and the chain graph (a hybrid graph with both directed and undirected edges). The primary research topics of this dissertation are Bayesian networks and Markov networks; a brief introduction to chain graphs is also included. Applications of graphical models to intelligent data processing, that is, the transformations from data to knowledge and from knowledge to intelligence, are stressed. The concrete research topics of the dissertation are as follows:

1. Graphical-Model Learning with Complete Data and Discrete Variables. Some state-of-the-art algorithms are introduced and analyzed. Two structure-learning methods for Bayesian networks and Markov networks are presented: one uses dependency analysis and causal semantics to direct edges, and the other is based on the basic dependence relationships between variables.
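The dependency-analysis approach decides whether an edge belongs in the structure by testing variables for conditional independence against the data, and conditional mutual information is a standard statistic for such tests. The following is an illustrative sketch only; the data layout, function names, and fixed threshold are this editor's assumptions, not the dissertation's algorithm (which, per the abstract, also uses significance-style accuracy evaluation).

```python
from collections import Counter
from math import log

def cmi(data, x, y, z):
    """Estimate the conditional mutual information I(x; y | z) from a
    list of records (dicts mapping variable name -> discrete value).
    z is a list of conditioning variable names (may be empty)."""
    n = len(data)
    zkey = lambda r: tuple(r[v] for v in z)
    n_xyz = Counter((r[x], r[y], zkey(r)) for r in data)
    n_xz = Counter((r[x], zkey(r)) for r in data)
    n_yz = Counter((r[y], zkey(r)) for r in data)
    n_z = Counter(zkey(r) for r in data)
    total = 0.0
    for (xv, yv, zv), c in n_xyz.items():
        # p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ], with the
        # sample-size factors cancelled inside the logarithm
        total += (c / n) * log(c * n_z[zv] / (n_xz[(xv, zv)] * n_yz[(yv, zv)]))
    return total

def independent(data, x, y, z=(), eps=0.01):
    # eps is an illustrative cutoff; a real test would use a
    # significance test or a data-dependent threshold.
    return cmi(data, x, y, list(z)) < eps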
These two methods settle problems that exist in state-of-the-art structure-learning algorithms, such as the exponential complexity and local optima of score-and-search methods, and the high-order conditional-probability computation and edge-directing limitations of dependency-analysis methods. Two precision-evaluating methods for Bayesian-network learning algorithms are also introduced.

2. Graphical-Model Learning with Incomplete Data and Discrete Variables. It is quite common for the data available in structure learning to be incomplete, and incomplete data make it impossible to learn a graphical model directly. Learning graphical models from incomplete data is therefore a hard and pressing issue. A typical approach is to use the EM algorithm, or some gradient-based optimization method, together with score-and-search. A new method that utilizes the learned structure and Gibbs sampling is presented. Because the Gibbs-sampling process converges to the global stationary distribution, the local-optimum problems that come with the EM algorithm and other gradient-based optimization methods can be avoided. In each iteration, the joint probability is decomposed according to the learned graphical model, which improves sampling efficiency; the graphical model is adjusted continually and approaches the global stationary distribution asymptotically. The iteration continues until the stopping criteria are satisfied. Three missing-data mechanisms are studied: 1) missing at random: values for each variable are incomplete, but the dimensions of the variables and some cases are available; 2) hidden or clustering variables: the values of the hidden or clustering variable are completely missing, and no information about these variables is available; 3) data missing in a small data set: many cases are unobserved, but the dimensions of the variables and some cases are available.
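The iterative scheme described above (randomly initialize the missing values, then alternate between re-estimating the model from the completed data and resampling the missing cells from their conditionals) can be illustrated on a toy two-variable case. The sketch below uses plain conditional frequencies in place of a full network decomposition, and all names and parameters are this editor's assumptions rather than the dissertation's algorithm.

```python
import random
from collections import Counter

def gibbs_impute(records, var, given, iters=200, seed=0):
    """Fill missing values (None) of `var` by Gibbs sampling from the
    conditional P(var | given), re-estimated from the completed data at
    every sweep. A toy two-variable sketch of the idea; the full method
    samples each missing cell from a whole graphical model."""
    rng = random.Random(seed)
    values = sorted({r[var] for r in records if r[var] is not None})
    missing = [r for r in records if r[var] is None]
    for r in missing:                      # random initialization
        r[var] = rng.choice(values)
    for _ in range(iters):
        # 1. re-estimate the conditional counts from the completed data
        joint = Counter((r[given], r[var]) for r in records)
        marg = Counter(r[given] for r in records)
        # 2. resample every formerly missing cell from P(var | given)
        for r in missing:
            weights = [joint[(r[given], v)] / marg[r[given]] for v in values]
            r[var] = rng.choices(values, weights=weights)[0]
    return records
```

On data where the two variables agree strongly, the imputed cells converge toward agreement as well; the point of the sketch is the alternation itself, which is what lets the sampler approach the stationary distribution instead of a local optimum.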
Methods and algorithms dealing with these three missing-data mechanisms are analyzed, their problems are pointed out, new methods are proposed, and the necessary theoretical arguments and experimental analyses are given. It is observed that graphical-model learning with continuous variables can be transformed into graphical-model learning with
【Key words】 Bayesian network; Markov network; chain graph; moral graph; Markov blanket; discrete variable; continuous variable; complete data; missing data; structure learning; asymptotical learning; classifier; hidden variable; clustering; Gibbs sampling; MDL criterion; mutual information; conditional mutual information; feature subset selection;