基于ANN/HMM的时序模式识别方法研究

Research of Time-Ordered Pattern Recognition Based on ANN/HMM

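【Author】 Wang Xiaoyan (王晓燕)

【Advisor】 Li Haifeng (李海峰)

【Author information】 Harbin Institute of Technology, Computer Science and Technology, 2007, Master's thesis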

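【Summary】 Handwriting input and speech input are the most natural modes of human-computer interaction and the ones that best match human habits, and handwriting recognition and speech recognition are important research topics in multimodal human-computer interaction. Both kinds of input share one property: they are time-ordered. This thesis studies a general recognition method for time-ordered patterns built on artificial neural networks (ANN) and hidden Markov models (HMM). ANNs are noise-tolerant, adaptive, strong learners and fast at recognition, while HMMs are well suited to processing time series. The HMM therefore serves as the structural model of the whole object to be recognized and models the transitions between the states of the time-ordered pattern. The ANN estimates the probability of each signal frame in each HMM state and models the individual HMM states, so that it models the basic units of the object to be recognized. In addition, because traditional recognition models have a simple structure and adapt poorly, the thesis proposes a model state optimization method that automatically increases or decreases the number of states. Depending on the training samples, the method splits states whose modeling precision is insufficient and deletes redundant states until a suitable number of states is reached. The method was compared with traditional methods on handwritten symbol recognition and speech recognition. The results show that, without lowering the recognition rate, it improves modeling precision and saves 25% of system resources. To put the results to practical use, a simple multimodal human-computer interaction system was built with this recognition model and state optimization method; users issue commands to the computer by handwriting or by speech, which makes the interaction simple and natural. The system also has a simple structure, responds quickly and has a high recognition rate. The recognition rate reaches 98% for handwritten command symbols and 83.6% for speech commands, which is enough to meet the needs of general applications at a basic level.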

【Abstract】 In human-machine interaction, handwriting input and speech input are the two most natural and familiar input methods for people. Handwriting recognition and speech recognition are two important research topics in multimodal human-machine interaction. Both kinds of input are time-ordered. In this thesis, we look for a general recognition method for time-ordered patterns based on Artificial Neural Networks (ANN) and Hidden Markov Models (HMM). ANNs are noise-tolerant, adaptive, strong at learning and fast, and HMMs are well suited to time series. The HMM is therefore used as the model of the whole pattern to be recognized and models the transitions between the states of the time series. The ANN is a state emission probability estimator for the HMM: it models the HMM states and serves as the primitive model of the pattern to be recognized. In addition, because traditional models have a simple structure and adapt poorly, we propose an auto-split-and-merge method to determine the number of states in a model. In this method, states are automatically added or deleted at appropriate positions according to the training data: we split the states with low modeling precision, delete the redundant ones, and finally reach a balance. Taking handwritten symbols and speech commands as examples, we compare the modeling performance of this method with that of traditional ones. The results show that this method improves modeling precision and saves 25% of system resources. To put the research results into use, we developed a simple multimodal human-machine interaction system, in which users write symbols or speak to give commands to the computer in a more natural way. The system also has a simple structure, a short response time and a high recognition rate: 98% for handwritten symbols and 83.6% for speech commands, which is enough for common use.
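
The division of labour described in the abstract, with an ANN supplying per-frame state probabilities and an HMM supplying the state transitions, can be illustrated with a minimal decoding sketch. This is not the thesis's implementation. It assumes the common hybrid convention of dividing the ANN's state posteriors by the state priors to obtain scaled likelihoods, and it assumes Viterbi decoding; the abstract states neither. The function name `viterbi_hybrid`, its parameters and the toy numbers are hypothetical.

```python
import math


def viterbi_hybrid(posteriors, priors, log_trans, log_init):
    """Return (best_log_score, best_state_path) for one frame sequence.

    posteriors[t][j]: hypothetical ANN output P(state j | frame t)
    priors[j]:        P(state j), assumed to be strictly positive
    log_trans[i][j]:  log P(state j at t+1 | state i at t); -inf forbids a move
    log_init[j]:      log P(state j at the first frame)
    """
    n_states = len(priors)

    def emit(t, j):
        # Assumed hybrid convention: scaled log-likelihood log P(s|x) - log P(s).
        p = posteriors[t][j]
        return math.log(p) - math.log(priors[j]) if p > 0 else -math.inf

    # delta[j] is the best log score of any path ending in state j at frame t.
    delta = [log_init[j] + emit(0, j) for j in range(n_states)]
    backptr = []
    for t in range(1, len(posteriors)):
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            new_delta.append(delta[best_i] + log_trans[best_i][j] + emit(t, j))
            ptr.append(best_i)
        delta = new_delta
        backptr.append(ptr)

    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


# Toy two-state left-to-right model with made-up ANN outputs for four frames.
NEG_INF = -math.inf
log_init = [0.0, NEG_INF]                      # always start in state 0
log_trans = [[math.log(0.5), math.log(0.5)],   # state 0: stay or advance
             [NEG_INF, 0.0]]                   # state 1: stay only
priors = [0.5, 0.5]
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]

score, path = viterbi_hybrid(posteriors, priors, log_trans, log_init)
print(path)  # [0, 0, 1, 1]
```

In a recognizer of this kind, each candidate symbol or command would have its own HMM, and the model with the highest score would be chosen. The state splitting and deletion that the abstract describes would change the number of states, and therefore the sizes of `priors` and `log_trans`.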

  • 【Classification number】TP391.41
  • 【Times cited】13
  • 【Downloads】336