Chinese Frame Disambiguation Based on Word Distributed Representations (基于词分布表征的汉语框架排歧研究)
【Abstract】 Frame disambiguation aims to select, from the existing frames in the Chinese FrameNet (CFN), an appropriate frame for a target word in a Chinese sentence, based on the context of that target word. We treat frame disambiguation as a classification problem and, for the first time in Chinese frame disambiguation research, introduce low-dimensional distributed word representations as model features, in order to examine how different feature representations affect the disambiguation model when only word features are used. The dataset consists of 2,077 annotated example sentences covering 88 lexical units, and the distributed representations of the words surrounding the target word are fed into a maximum entropy model. Experimental results show that the model using distributed word representations reaches an accuracy of 58.11%, a substantial improvement over the 47.47% obtained with conventional word features alone. This indicates that distributed word representations play an important role in Chinese frame disambiguation.
【Key words】 frame disambiguation; maximum entropy model; word distributed representations; Chinese FrameNet
- 【Source】 Journal of North University of China (Natural Science Edition), 2015, Issue 3
- 【Classification No.】 TP391.1
- 【Citations】 10
- 【Downloads】 68
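The approach summarized in the abstract (distributed representations of the words around the target word, used as features for a maximum entropy classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 2-dimensional toy vectors in `EMB`, the English placeholder tokens, the window size of 1, the two frame labels `0` and `1`, and the plain gradient-ascent training loop. The paper's actual embeddings, window, feature templates, and training procedure are not described in this record.

```python
import math

DIM = 2
# Hypothetical toy embeddings; a real system would load pretrained word vectors.
EMB = {
    "river": [1.0, 0.0],
    "water": [0.9, 0.1],
    "money": [0.0, 1.0],
    "loan": [0.1, 0.9],
}
PAD = [0.0] * DIM  # used for unknown words and positions outside the sentence


def context_features(tokens, target_idx, window=1):
    """Concatenate embeddings of the words within `window` positions of the target."""
    feats = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the target word itself is not used as a feature
        i = target_idx + offset
        word = tokens[i] if 0 <= i < len(tokens) else None
        feats.extend(EMB.get(word, PAD))
    return feats + [1.0]  # constant bias feature


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train_maxent(X, y, n_classes, epochs=200, lr=0.5):
    """Fit a maximum entropy (multinomial logistic regression) model
    by stochastic gradient ascent on the log-likelihood."""
    n_feats = len(X[0])
    W = [[0.0] * n_feats for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            scores = [sum(w * v for w, v in zip(W[c], x)) for c in range(n_classes)]
            probs = softmax(scores)
            for c in range(n_classes):
                err = (1.0 if c == label else 0.0) - probs[c]
                for j in range(n_feats):
                    W[c][j] += lr * err * x[j]
    return W


def predict(W, x):
    """Return the index of the highest-scoring frame."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in W]
    return max(range(len(W)), key=lambda c: scores[c])


# Toy training set: (tokens, index of target word, hypothetical frame label).
examples = [
    (["river", "bank", "water"], 1, 0),
    (["water", "bank", "river"], 1, 0),
    (["money", "bank", "loan"], 1, 1),
    (["loan", "bank", "money"], 1, 1),
]
X = [context_features(tokens, idx) for tokens, idx, _ in examples]
y = [label for _, _, label in examples]
W = train_maxent(X, y, n_classes=2)
print(predict(W, context_features(["water", "bank", "water"], 1)))
```

Concatenating the context vectors keeps position information (left versus right neighbor); averaging them would be a simpler alternative that discards word order.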