A Keyword Selection Method Based on Lexical Chains
【Abstract】 Keywords play an important role in information retrieval, automatic summarization, and text clustering/classification. A lexical chain is a sequence of semantically related words and was originally used to analyze text structure. This paper proposes a lexical-chain-based method for automatic keyword indexing of Chinese texts and gives an algorithm for constructing lexical chains using HowNet as the knowledge base. The method first constructs lexical chains by computing the semantic similarity between words, then selects keywords by combining term frequency and area features. Because it takes the semantic relationships between words into account, the method improves keyword indexing performance. Experimental results show that, compared with a method using term frequency and area features alone, recall improves by 7.78% and precision by 9.33%.
【Keywords】 computer application; Chinese information processing; keyword indexing; keyword extraction; lexical chains; word similarity; HowNet
- 【Source】 Journal of Chinese Information Processing (中文信息学报), 2006, No. 06
- 【Classification No.】 TP391.1; TP18
- 【Citations】 214
- 【Downloads】 1560
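
The two-stage pipeline described in the abstract (build lexical chains from word similarity, then rank candidates by frequency and area) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the greedy chaining rule, the `threshold` value, the area weights, and the way a word's score is multiplied by the strength of its chain are all assumptions made for illustration, and `toy_similarity` is a hypothetical stand-in for a HowNet-based similarity measure.

```python
from collections import Counter

# Hypothetical sense groups standing in for a HowNet-based similarity measure.
TOY_GROUPS = {"keyword": 0, "term": 0, "chain": 1}

def toy_similarity(a, b):
    """Return 1.0 for identical words or words in the same toy group, else 0.0."""
    if a == b:
        return 1.0
    group = TOY_GROUPS.get(a)
    return 1.0 if group is not None and group == TOY_GROUPS.get(b) else 0.0

def build_chains(words, similarity, threshold=0.6):
    """Greedy chaining (an assumption): each distinct word joins the first chain
    that has a member at least `threshold` similar to it, else starts a new chain."""
    chains = []
    for word in dict.fromkeys(words):  # distinct words in first-occurrence order
        for chain in chains:
            if any(similarity(word, member) >= threshold for member in chain):
                chain.append(word)
                break
        else:
            chains.append([word])
    return chains

def select_keywords(tokens, similarity, top_k=3, threshold=0.6, area_weights=None):
    """Rank words by (area-weighted frequency) x (total score of their chain).

    `tokens` is a list of (word, area) pairs; the weights are illustrative only.
    """
    area_weights = area_weights or {"title": 3.0, "body": 1.0}
    word_score = Counter()
    for word, area in tokens:
        word_score[word] += area_weights.get(area, 1.0)

    chain_strength = {}
    for chain in build_chains([w for w, _ in tokens], similarity, threshold):
        strength = sum(word_score[w] for w in chain)
        for w in chain:
            chain_strength[w] = strength

    # Sort by descending combined score, breaking ties alphabetically.
    ranked = sorted(word_score, key=lambda w: (-word_score[w] * chain_strength[w], w))
    return ranked[:top_k]

TOKENS = [
    ("keyword", "title"), ("chain", "title"),
    ("keyword", "body"), ("term", "body"), ("term", "body"), ("weather", "body"),
]

print(select_keywords(TOKENS, toy_similarity, top_k=2))  # ['keyword', 'term']
```

In this toy input, "chain" has a higher area-weighted frequency (3.0) than "term" (2.0), yet "term" ranks higher because it shares a chain with "keyword". That is the kind of effect the abstract attributes to using semantic relationships between words.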