Research on the Recognition of Multi-Font Printed Mongolian Characters (多字体印刷蒙文字识别技术的研究)
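【Author】 Li Wei (李伟)
【Supervisor】 Gao Guanglai (高光来)
【Author information】 Inner Mongolia University, Computer Application Technology, 2004, Master's thesis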
【Abstract】 Character recognition is an advanced technology that combines pattern recognition, artificial intelligence, and text processing. It automatically enters characters and other information into a computer through intelligent recognition, replacing manual input. Character recognition has a wide range of applications and has also advanced pattern recognition and text processing. The field has long been a research focus in international computer intelligence and is a key topic supported by China's National High-Tech Research and Development Program (863 Program). Mongolian is the language of the principal ethnic group of the Inner Mongolia Autonomous Region; within China it is also used in Heilongjiang, Jilin, Liaoning, Xinjiang, and other provinces and autonomous regions. At present, most research on input methods concentrates on keyboard encoding. Very little work has addressed Mongolian character recognition, and recognition-based input of printed Mongolian remains unexplored, which seriously impedes the spread and application of information technology in ethnic minority regions. In response, we propose to develop a multi-font printed Mongolian recognition system that provides an intelligent input method for Mongolian. This is of great significance for preserving and developing minority culture and for promoting social progress in minority regions.
Mongolian is widely used in the Inner Mongolia Autonomous Region, but it is entered exclusively by keyboard encoding, and automatic recognition-based input is still a gap. This research therefore provides a new, automated, and intelligent input method that raises Mongolian information processing to a new level. Mongolian is an alphabetic script, but its writing style is highly distinctive and differs greatly from Chinese and Western scripts. It is written vertically from top to bottom, with columns running from left to right; all the letters of a word are joined, forming a vertical backbone, and each letter takes a different form depending on whether it appears at the beginning, in the middle, or at the end of a word. These characteristics make Mongolian recognition very difficult. In the course of this research we therefore need not only to absorb the techniques used in Western and Chinese character recognition but also to devise new methods suited to Mongolian writing. The aim of the research is to study, from the standpoint of character recognition, the selection and extraction of Mongolian character features, primitive segmentation, matching, and related problems, and to develop a multi-font printed Mongolian recognition system with a good, easy-to-use human-machine interface.
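As an illustration of the vertical backbone mentioned in the abstract, the sketch below estimates the backbone column of a single word image with a column projection profile. It is not taken from the thesis, whose actual method is not described here. It assumes a binarized image given as rows of 0/1 values (1 = ink) containing exactly one vertically written word, and the names `column_profile` and `backbone_column` are hypothetical.

```python
from typing import List


def column_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each column of a binary word image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def backbone_column(image: List[List[int]]) -> int:
    """Return the index of the column holding the most ink pixels.

    Assumption: in a vertically written word whose letters share one stem,
    the densest column is a rough estimate of the backbone position.
    Ties resolve to the leftmost column.
    """
    profile = column_profile(image)
    if not profile:
        raise ValueError("empty image")
    return max(range(len(profile)), key=profile.__getitem__)


# Toy 6x5 word image: column 2 carries the stem, other columns hold strokes.
word = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

profile = column_profile(word)
backbone = backbone_column(word)
```

A real multi-font system would likely need to handle skew, noise, and stems of varying thickness, which this toy profile ignores.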
【Key words】 pattern recognition; character recognition; Mongolian; structural pattern recognition; structural feature;
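- 【Online publication contributor】 Inner Mongolia University 【Online publication year and issue】 2004, Issue 04
- 【Classification number】 TP391.4
- 【Citation count】 17
- 【Download count】 434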