Research on Learning Bayesian Network Structure with Missing Data
【Abstract】 Existing methods for learning Bayesian network structure from data with missing values are mainly based on the EM algorithm combined with score-and-search methods. They are inefficient and easily become trapped in locally optimal structures. To address these problems, this paper presents a new method for learning Bayesian network structure with missing data. First, the unobserved data are randomly initialized to obtain a complete data set, and a maximum likelihood tree is built from this complete data set as the initial Bayesian network structure. Iterative learning then follows. In each iteration, the unobserved data are reassigned by combining the current Bayesian network structure with Gibbs sampling; then, on the basis of the new complete data set, the structure is adjusted using the basic dependency relationships between variables and the ideas of dependency analysis. The iteration continues until the structure becomes stable. The method avoids the exponential complexity of standard Gibbs sampling as well as the main problems of existing learning methods, and it provides an effective and practical approach to uncertain knowledge representation, inference, and reasoning with incomplete data.
【Keywords】 Bayesian network; structure learning; missing data; Gibbs sampling; dependency analysis; maximum likelihood tree
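The iterative scheme described in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation, and it makes several assumptions that the abstract does not state: variables are discrete with known arities, missing cells are marked `None`, and the maximum likelihood tree is built as a Chow-Liu tree over empirical mutual information. The structure-adjustment step is swapped for something much simpler than the paper's dependency analysis: the tree is simply re-fitted after each Gibbs sweep. All function names (`random_init`, `max_likelihood_tree`, `gibbs_sweep`, `learn_structure`) are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations


def random_init(rows, arity, rng):
    """Fill every missing cell (None) with a random state; record the positions."""
    missing = [(r, c) for r, row in enumerate(rows)
               for c, v in enumerate(row) if v is None]
    filled = [[rng.randrange(arity[c]) if v is None else v
               for c, v in enumerate(row)] for row in rows]
    return filled, missing


def mutual_information(rows, i, j):
    """Empirical mutual information between columns i and j."""
    n = len(rows)
    pij = Counter((row[i], row[j]) for row in rows)
    pi = Counter(row[i] for row in rows)
    pj = Counter(row[j] for row in rows)
    return sum(c / n * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())


def max_likelihood_tree(rows, n_vars):
    """Chow-Liu tree: maximum-weight spanning tree over pairwise mutual information."""
    edges = sorted(combinations(range(n_vars), 2),
                   key=lambda e: mutual_information(rows, *e), reverse=True)
    parent = list(range(n_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # Kruskal: keep an edge only if it joins two components
            parent[ri] = rj
            tree.append((i, j))
    return tree


def gibbs_sweep(rows, missing, tree, arity, rng):
    """Resample each missing cell given its tree neighbours only.

    In a tree-structured model, P(x_c = s | rest) is proportional to
    P(s) * prod over neighbours nb of P(x_nb | x_c = s), so each draw is
    local to the cell's neighbours. Counts are Laplace-smoothed.
    """
    neighbours = {v: [] for v in range(len(arity))}
    for i, j in tree:
        neighbours[i].append(j)
        neighbours[j].append(i)
    for r, c in missing:
        others = [row for k, row in enumerate(rows) if k != r]
        weights = []
        for s in range(arity[c]):
            n_s = sum(1 for row in others if row[c] == s)
            w = n_s + 1.0
            for nb in neighbours[c]:
                n_joint = sum(1 for row in others
                              if row[c] == s and row[nb] == rows[r][nb])
                w *= (n_joint + 1.0) / (n_s + arity[nb])
            weights.append(w)
        rows[r][c] = rng.choices(range(arity[c]), weights=weights)[0]


def learn_structure(rows, arity, max_iter=20, seed=0):
    """Alternate Gibbs imputation and tree re-fitting until the edges stop changing."""
    rng = random.Random(seed)
    data, missing = random_init(rows, arity, rng)
    tree = max_likelihood_tree(data, len(arity))
    for _ in range(max_iter):
        gibbs_sweep(data, missing, tree, arity, rng)
        new_tree = max_likelihood_tree(data, len(arity))
        if sorted(new_tree) == sorted(tree):
            break
        tree = new_tree
    return tree, data
```

Calling `learn_structure(rows, arity=[2, 2, 3])` on a list of rows returns the final tree edges and the completed data set. Because each draw conditions only on a cell's neighbours in the current structure, the sketch shows one plausible reading of how combining the structure with Gibbs sampling keeps each sampling step local; the paper's actual procedure may differ.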
- 【Source】 Journal of Software (软件学报), 2004, Issue 07
- 【Classification No.】 TP18
- 【Times cited】 167
- 【Downloads】 1366