Discriminating Elliptical Sentences in Chinese Dialogue Based on Decision Tree (基于决策树的中文对话省略句判别)


【Authors】 Weinan Zhang (张伟男), Yu Zhang (张宇), Ting Liu (刘挺)

【Affiliation】 Information Retrieval Lab, Harbin Institute of Technology, Harbin 150001

【Abstract】 Discriminating elliptical sentences is a prerequisite step for ellipsis recovery. Ellipsis is widespread in Chinese dialogue and question answering systems, and whether elliptical sentences are identified correctly directly affects the results of ellipsis recovery, so accurate discrimination is particularly important. This paper presents a method that uses a decision tree classification algorithm to discriminate elliptical sentences in Chinese dialogue. The corpus consists of two parts: manually collected interview-style (talk show) dialogues and translated sentences from part of the TREC 2004-2007 data. Six features are selected as the conditional attributes of the decision tree classifier, and a purely rule-based discrimination method serves as the baseline. The proposed method outperforms the baseline: experimental results show a precision of 97.4% and an F-measure of 84.1% for elliptical sentence discrimination.
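The abstract reports precision and F-measure but not recall. As a rough illustration of how the two reported figures relate, the sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the record does not confirm. The function names `f_measure` and `implied_recall` are hypothetical helpers introduced here, not part of the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta; beta=1 gives F1)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1.

    Assumption: the reported F-measure is the balanced F1 score.
    """
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for these values")
    return f1 * precision / denom


# Figures taken from the abstract.
reported_precision = 0.974
reported_f = 0.841

# Recall is NOT reported in the abstract; this is only what F1 would imply.
recall = implied_recall(reported_precision, reported_f)
print(f"implied recall: {recall:.3f}")
```

Under that assumption the implied recall comes out at roughly 74%, which would mean the classifier rarely mislabels a sentence as elliptical but misses about a quarter of the elliptical sentences. If the paper uses a different F weighting, the figure would differ.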

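【Funding】 National Natural Science Foundation of China, Key Program (60736044); National Natural Science Foundation of China, General Program (60675034); National 863 Program, exploratory project (2008AA01Z144)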
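  • 【Proceedings】 Proceedings of the 5th National Conference on Information Retrieval (第五届全国信息检索学术会议论文集)
  • 【Conference】 5th National Conference on Information Retrieval
  • 【Date】 2009-11-14
  • 【Location】 Shanghai, China
  • 【Classification code】 TP391.1
  • 【Organizer】 Information Retrieval and Content Security Technical Committee, Chinese Information Processing Society of China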