Labeling Chinese Question based on Active Learning
(Original Chinese title: 基于主动学习的中文问题类别标注研究)
【Author】 Youdong Miao, Xipeng Qiu and Xuanjing Huang (Fudan University, 200233)
【Institution】 Fudan University
【Abstract】 Question classification is the first problem to address in open-domain question answering (QA) and a key factor in QA system performance. Existing question classification corpora are relatively small and fall short of what practical QA applications need. In this paper, we first establish a question taxonomy based on HOWNET and then use active learning to label Chinese questions by category. We also apply feature selection to improve labeling performance. Experimental results show that the active-learning-based labeling method achieves very good classification performance while requiring less manual annotation, and that it can also noticeably improve question classification accuracy to some degree.
【Key words】 Active Learning; Passive Aggressive; Feature Selection; Chinese Question Classification
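
The abstract names active learning and a Passive Aggressive classifier but does not state the query strategy, the feature set, or the label inventory. The following is therefore only a minimal sketch under stated assumptions: binary labels (+1 / -1) instead of the HOWNET-based taxonomy, whitespace tokens as features (real Chinese questions would need word segmentation first), and margin-based uncertainty sampling as the query rule. All names (`featurize`, `PassiveAggressive`, `active_learn`, the toy `pool` and `gold` labels) are hypothetical and are not taken from the paper.

```python
def featurize(question):
    """Bag-of-words counts over whitespace tokens (an assumption for this sketch)."""
    feats = {}
    for token in question.split():
        feats[token] = feats.get(token, 0.0) + 1.0
    return feats


class PassiveAggressive:
    """Binary PA-I classifier over sparse feature dicts; labels are +1 or -1."""

    def __init__(self, C=1.0):
        self.C = C      # aggressiveness cap on the step size
        self.w = {}     # sparse weight vector

    def score(self, x):
        return sum(self.w.get(f, 0.0) * v for f, v in x.items())

    def predict(self, x):
        return 1 if self.score(x) >= 0.0 else -1

    def update(self, x, y):
        # Hinge loss; stay passive when the margin is already >= 1.
        loss = max(0.0, 1.0 - y * self.score(x))
        norm_sq = sum(v * v for v in x.values())
        if loss > 0.0 and norm_sq > 0.0:
            tau = min(self.C, loss / norm_sq)   # PA-I step size
            for f, v in x.items():
                self.w[f] = self.w.get(f, 0.0) + tau * y * v


def active_learn(pool, oracle, seed, budget, epochs=5):
    """Pool-based active learning: query the example closest to the boundary."""
    feats = [featurize(q) for q in pool]
    model = PassiveAggressive()
    labeled = {i: oracle(i) for i in seed}
    while True:
        # Refit on everything labeled so far (a few online passes).
        for _ in range(epochs):
            for i, y in labeled.items():
                model.update(feats[i], y)
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if len(labeled) - len(seed) >= budget or not unlabeled:
            return model, labeled
        # Smallest |score| = least confident prediction = next annotation request.
        query = min(unlabeled, key=lambda i: abs(model.score(feats[i])))
        labeled[query] = oracle(query)


# Toy pool: +1 = question about a person, -1 = question about a location.
pool = [
    "who wrote this book",
    "where is the river",
    "who invented the telephone",
    "where is the capital city",
    "who discovered penicillin",
    "where was the treaty signed",
]
gold = [1, -1, 1, -1, 1, -1]

# The oracle stands in for the human annotator; here it just reads the gold labels.
model, labeled = active_learn(pool, oracle=lambda i: gold[i], seed=[0, 1], budget=2)
predictions = [model.predict(featurize(q)) for q in pool]
print(sorted(labeled), predictions)
```

In this sketch only four of the six questions are ever shown to the annotator; the remaining two are classified by the model alone, which is the kind of annotation saving the abstract describes. A multi-class variant and a feature selection step would be needed to approximate the setting reported in the paper.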
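- 【Proceedings】 Proceedings of the 6th National Conference on Information Retrieval (第六届全国信息检索学术会议论文集)
- 【Conference】 6th National Conference on Information Retrieval (第六届全国信息检索学术会议)
- 【Conference date】 2010-08-12
- 【Conference location】 Mudanjiang, Heilongjiang, China
- 【Classification number】 TP391.1
- 【Organizer】 Technical Committee on Information Retrieval and Content Security, Chinese Information Processing Society of China (中国中文信息学会信息检索与内容安全专业委员会)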