兼类词排歧的一种方法
A Method about Grammatical Category Disambiguation
【Author】 Wang Jie, Xun Endong, Song Rou (Beijing Language and Culture University, Beijing 100083, China)
【Institution】 Institute of Language Information Processing, Beijing Language and Culture University (北京语言大学语言信息处理研究所)
【Abstract】 Disambiguating words that belong to more than one grammatical category is the key problem in part-of-speech tagging. This paper explores a method for resolving such ambiguity and tests it on Chinese verbs that can also be used as nouns or adjectives. The part of speech of a verb-related ambiguous word in a specific context is determined from the distribution of single-category verbs (pure verbs) in a large-scale corpus. The method needs no manually tagged training corpus; the only knowledge it requires is at the level of word lists. It is also applicable to other languages in which one word can have more than one part of speech. The experimental results demonstrate the feasibility of the method.
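The abstract does not give the scoring details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a word-segmented but untagged corpus, a word list of pure verbs, and a hypothetical scoring rule: a neighboring word counts as verb-like in proportion to how often the word next to it in the corpus is a pure verb. The names `context_counts`, `verb_score`, `is_verb_use`, the add-one smoothing, the 0.5 threshold, and the toy English corpus are all illustrative choices.

```python
from collections import Counter

def context_counts(sentences, targets):
    """Count left/right neighbors of every token that appears in `targets`."""
    counts = Counter()
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            if padded[i] in targets:
                counts[("L", padded[i - 1])] += 1  # word to the left
                counts[("R", padded[i + 1])] += 1  # word to the right
    return counts

def verb_score(left, right, verb_ctx, all_ctx, smoothing=1.0):
    """Average smoothed share of each neighbor's occurrences that sit beside a pure verb.

    The add-one style smoothing is an assumption made for this sketch.
    """
    feats = [("L", left), ("R", right)]
    ratios = [
        (verb_ctx[f] + smoothing) / (all_ctx[f] + 2 * smoothing) for f in feats
    ]
    return sum(ratios) / len(ratios)

def is_verb_use(left, right, verb_ctx, all_ctx, threshold=0.5):
    """Label an ambiguous word as a verb if its context looks verb-like.

    The threshold is a hypothetical value, not taken from the paper.
    """
    return verb_score(left, right, verb_ctx, all_ctx) > threshold

# Toy segmented, untagged corpus (hypothetical data).
sentences = [
    ["he", "will", "eat", "rice"],
    ["she", "will", "sleep", "now"],
    ["the", "big", "dog", "barks"],
    ["the", "work", "is", "hard"],
]
pure_verbs = {"eat", "sleep"}                    # word-list knowledge only
vocab = {w for sent in sentences for w in sent}  # every word type in the corpus

verb_ctx = context_counts(sentences, pure_verbs)  # contexts of pure verbs
all_ctx = context_counts(sentences, vocab)        # contexts of all words

# "work" is ambiguous: compare "will work now" with "the work is".
verb_like = is_verb_use("will", "now", verb_ctx, all_ctx)
noun_like = is_verb_use("the", "is", verb_ctx, all_ctx)
```

In this toy run the context "will _ now" is shared with pure verbs and is classified as a verb use, while "the _ is" never occurs around a pure verb and is not. A real system would need a far larger corpus and most likely richer context features than the two adjacent words.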
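- 【Proceedings】 Proceedings of the 2nd National Student Workshop on Computational Linguistics (第二届全国学生计算语言学研讨会论文集)
- 【Conference】 2nd National Student Workshop on Computational Linguistics (第二届全国学生计算语言学研讨会)
- 【Conference date】 2004-08
- 【Classification code】 H087
- 【Organizer】 Chinese Information Processing Society of China (中国中文信息学会)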