一个实用的古籍印刷汉字识别系统
A Printed Chinese Character Recognition System for Automatic Inputting of Ancient Books
【Abstract】 In the light of Shannon theory, this paper discusses the constraints on choosing a character set for recognizing printed Chinese characters in ancient books, the performance limits of character feature extraction, and how the statistical properties of Chinese characters can be used to further improve the recognition rate. Based on this theoretical analysis and a large number of experiments, we built "A Printed Complex Chinese Character Recognition System for Automatic Inputting of Ancient Books". The system has been used to input ancient books totaling 7.2 million annotated characters, and it achieved a high recognition rate.
【Key words】 Chinese character recognition; inputting of ancient books; character set for recognition; feature extraction
- 【Source】 中文信息学报 (Journal of Chinese Information Processing), 1996, Issue 03
- 【Classification code】 H124