基于概念属性特征的中文地名识别处理
Chinese Place Name Recognition Based on Concept Features
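【Author】 Li Nuo (1, 2), Zhang Quan (2); 1. Graduate University of Chinese Academy of Sciences, Beijing 100039; 2. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
【Institutions】 Graduate University of Chinese Academy of Sciences; Institute of Acoustics, Chinese Academy of Sciences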
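【Abstract (translated from Chinese)】 In statistical machine learning models such as maximum entropy, the choice of feature functions is arguably the factor with the greatest influence on overall system performance. In addition to traditional features such as words and parts of speech, this paper trains on semantic concepts as features, based on the HNC theory of language concepts. By representing semantic concept symbols properly, it brings semantic analysis into the statistical analysis, making use of some of the attributes under which words are semantically classified. The paper also experiments with variable-length training windows so that each feature is characterized in finer detail. Tests on a real corpus show that adding semantic features does improve recognition performance.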
【Abstract】 Feature functions are regarded as the most important part of a maximum entropy model, since they affect the final result of the system. In this paper, we select not only common features such as words and parts of speech but also semantic features based on HNC (Hierarchical Network of Concepts) theory. Through semantic concept symbols, we combine statistical analysis with semantic analysis. We also replace the fixed window of the maximum entropy model with a moving window, which yields more specific feature functions. Tests on a real corpus show that semantic features contribute to better results.
【Key words】 Maximum entropy; feature function; HNC theory; semantic concept features; moving window
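- 【Proceedings title】 中国计算机语言学研究前沿进展(2007-2009) (Frontier Advances in Computational Linguistics Research in China, 2007-2009)
- 【Conference name】 第十届全国计算语言学学术会议 (10th National Conference on Computational Linguistics)
- 【Conference date】 2009-07-24
- 【Conference location】 Yantai, Shandong, China
- 【Classification code】 TP391.43
- 【Organizer】 中国中文信息学会 (Chinese Information Processing Society of China)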
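
The feature scheme described in the abstract (word, part-of-speech, and semantic concept features, each drawn from a context window whose length can vary) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the window sizes in `WINDOWS`, the toy `CONCEPT_LEXICON` labels (which are not real HNC concept symbols), and the function name `extract_features` are all hypothetical, and reading "variable-length window" as "one window size per feature type" is an interpretation of the abstract.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical (left, right) window sizes per feature type; the abstract
# does not state the settings actually used in the paper.
WINDOWS: Dict[str, Tuple[int, int]] = {
    "word": (2, 2),
    "pos": (1, 1),
    "concept": (1, 2),
}

# Toy stand-in for a concept lexicon. These labels are invented for
# illustration and are NOT real HNC concept symbols.
CONCEPT_LEXICON: Dict[str, str] = {
    "去": "MOVE",
    "在": "LOCATED",
    "市": "PLACE_SUFFIX",
}


def extract_features(
    words: Sequence[str],
    pos_tags: Sequence[str],
    i: int,
    windows: Dict[str, Tuple[int, int]] = WINDOWS,
    lexicon: Dict[str, str] = CONCEPT_LEXICON,
) -> List[str]:
    """Return string-valued context features for the token at index i."""
    if len(words) != len(pos_tags):
        raise ValueError("words and pos_tags must have the same length")
    # One parallel sequence per feature type; unknown words map to "UNK".
    layers = {
        "word": list(words),
        "pos": list(pos_tags),
        "concept": [lexicon.get(w, "UNK") for w in words],
    }
    feats: List[str] = []
    for name, (left, right) in windows.items():
        seq = layers[name]
        for off in range(-left, right + 1):
            j = i + off
            # Positions outside the sentence are padded.
            value = seq[j] if 0 <= j < len(seq) else "<PAD>"
            feats.append(f"{name}[{off:+d}]={value}")
    return feats


# Example: features for the candidate place name "烟台" (index 2).
sentence = ["他", "去", "烟台", "市"]
tags = ["r", "v", "ns", "n"]
for feature in extract_features(sentence, tags, 2):
    print(feature)
```

Each returned string, paired with a candidate label, would correspond to one binary feature function in a maximum entropy tagger; giving each feature type its own window is one simple way to let the context length vary by feature.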