Exploring uncertainty sentences in Chinese (中文不确定性句子的识别研究)
【Author】 Feng Ji, Xipeng Qiu, Xuanjing Huang
【Institution】 School of Computer Science and Technology, Fudan University, Shanghai 200233
【Abstract】 Identifying uncertain information is important for information extraction and similar tasks, because uncertain information often misleads such systems into extracting wrong or unreliable information. This paper proposes an automatic method for detecting uncertainty sentences in Chinese, which builds a sentence scoring model from the cue words that commonly occur in uncertainty sentences. The model parameters are learned with the Passive Aggressive algorithm, a variant of online learning. In experiments on Chinese uncertainty sentence identification, the model outperforms a bag-of-words model, reaching an F1-measure of 70.53%, an improvement of about 5%.
【Key words】 Uncertainty information; Passive Aggressive algorithm; Chinese information processing
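- 【Proceedings】 Proceedings of the 6th National Conference on Information Retrieval (第六届全国信息检索学术会议论文集)
- 【Conference】 The 6th National Conference on Information Retrieval (第六届全国信息检索学术会议)
- 【Conference date】 2010-08-12
- 【Conference location】 Mudanjiang, Heilongjiang, China
- 【Classification code】 TP391.1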
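- 【Organizer】 Technical Committee on Information Retrieval and Content Security, Chinese Information Processing Society of China (中国中文信息学会信息检索与内容安全专业委员会)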
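
The abstract describes scoring a sentence from its cue words and learning the weights with a Passive Aggressive algorithm. The following is a minimal sketch of that general idea, not the authors' implementation. The record does not say which variant, features, or cue-word list the paper uses, so everything specific here is an assumption: the PA-I update rule (Crammer et al., 2006), the indicator-per-cue-word features, the `CUE_WORDS` lexicon, the toy `TRAIN` data, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical cue-word lexicon; the paper's actual cue words are not given in this record.
CUE_WORDS = {"可能", "也许", "大概", "似乎", "据说"}


def features(tokens):
    """Sparse features for a tokenized sentence: a bias plus one indicator per cue word present."""
    feats = {"bias": 1.0}
    for tok in tokens:
        if tok in CUE_WORDS:
            feats["cue=" + tok] = 1.0
    return feats


def score(weights, feats):
    """Linear sentence score: dot product of weights and features."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_pa(data, epochs=5, C=1.0):
    """Train with PA-I updates (an assumed variant): when the hinge loss is
    positive, shift the weights by tau * y * x, with tau = min(C, loss / ||x||^2)."""
    weights = {}
    for _ in range(epochs):
        for tokens, label in data:  # label: +1 uncertain, -1 certain
            feats = features(tokens)
            loss = max(0.0, 1.0 - label * score(weights, feats))
            if loss > 0.0:
                sq_norm = sum(v * v for v in feats.values())
                tau = min(C, loss / sq_norm)
                for name, value in feats.items():
                    weights[name] = weights.get(name, 0.0) + tau * label * value
    return weights


def predict(weights, tokens):
    """Return +1 (uncertain) if the sentence score is positive, else -1 (certain)."""
    return 1 if score(weights, features(tokens)) > 0 else -1


# Toy, pre-tokenized training data (invented for this sketch).
TRAIN = [
    (["他", "可能", "明天", "来"], 1),
    (["会议", "也许", "推迟"], 1),
    (["他", "明天", "来"], -1),
    (["会议", "已经", "推迟"], -1),
]

weights = train_pa(TRAIN)
print(predict(weights, ["结果", "可能", "有误"]))  # sentence containing a seen cue word
print(predict(weights, ["结果", "正确"]))          # sentence without any cue word
```

On this toy data the bias weight ends up negative and the weights of the cue words seen in training end up positive, so a sentence is flagged as uncertain only when it contains such a cue word. A real system would need word segmentation, a curated cue-word list, and evaluation by F1 on held-out data, none of which this sketch covers.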