Research on Event Extraction for Product Requirement Analysis (面向产品需求分析的事件抽取研究)
【Author】 Liu Lili (刘莉莉);
【Advisor】 Xu Wei (许威);
【Author information】 Xidian University, Mechanical Manufacturing and Automation, 2015, Master's
【Abstract】 As society and technology develop, electronic products have become pervasive in daily life and office work has largely gone paperless. How to obtain the key content we need from the resulting mass of information is the subject of this thesis. Event extraction is a key task in information extraction; its goal is to convert unstructured text into structured form and present it. The content to be extracted is specified in advance, for example persons, times, and locations. Studying event extraction with machine learning methods requires a fairly large manually annotated corpus, yet very few such corpora are currently available. When classifying with statistical models, the core problem is choosing features that describe the data accurately. This thesis explores both problems. First, it develops a text annotation tool and uses it to build an event corpus for product requirement analysis. In building the corpus, Chinese sentences are first analyzed structurally and syntactically, which lays the foundation for the later event division and event annotation work. Second, it summarizes the state of research on event type recognition and event element extraction, and proposes several principles for the event extraction task: the abstraction principle, the information addition-and-deletion principle, the preposition importance principle, and the information alignment principle. Based on these principles, it proposes a sentence-constituent ordering method, preposition features, and adverb features. Comparative experiments examine the effect of these features on event extraction, and the experimental data indicate that the method is effective. Finally, the thesis designs and implements a complete event extraction system for product requirement analysis and tests it on a product information corpus; the experiments indicate that the system can effectively complete the relevant event extraction tasks.
【Key words】 machine learning; event extraction; event type; event element; features;
- 【Online publication contributor】 Xidian University 【Online publication issue】 2017, No. 03
- 【Classification number】 TP391.1
- 【Citation count】 1
- 【Download count】 81