
An Information Retrieval Algorithm Based on the Iterative Extraction of Keyphrases


【Authors】 Zhao Ying-huan (赵英环), Guo Gui-suo (郭贵锁)

【Affiliation】 College of Information Science and Technology, Beijing Institute of Technology, Beijing 100081, China

【Abstract】 To help users obtain the information they are interested in accurately and quickly from massive volumes of knowledge information, this paper presents an information retrieval algorithm based on the iterative extraction of keyphrases. The algorithm considers both the head information of a document (title, abstract and keywords) and its key body content. With this algorithm, the precision of keyphrase extraction exceeds 83% without sacrificing efficiency, and the performance of topic information retrieval improves accordingly. Experimental results indicate that adjusting the relevance weights of candidate keyphrases on the basis of Term Frequency-Inverse Document Frequency (TF-IDF), combined with the proposed iterative extraction algorithm, significantly improves the effectiveness of topic information retrieval.
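The abstract does not give the weighting formula or the iteration rule, so the following Python sketch only illustrates the general idea: TF-IDF scores for candidate terms, with extra weight for terms that appear in the head information, followed by round-by-round selection. The function names, the whitespace tokenizer, the smoothed IDF form, and the `header_boost` value are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real word segmentation."""
    return text.lower().split()


def weighted_tf_idf(header, body, corpus, header_boost=2.0):
    """Score each term of one document by TF-IDF, boosting header terms.

    header_boost is a hypothetical parameter, not a value from the paper.
    """
    header_terms = set(tokenize(header))
    tokens = tokenize(header) + tokenize(body)
    counts = Counter(tokens)
    corpus_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus_sets)
    scores = {}
    for term, count in counts.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for doc in corpus_sets if term in doc)
        # Smoothed IDF (an assumed variant) so unseen terms do not divide by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weight = (count / len(tokens)) * idf
        if term in header_terms:
            weight *= header_boost
        scores[term] = weight
    return scores


def extract_keyphrases(header, body, corpus, top_k=2, header_boost=2.0):
    """Select up to top_k terms, one per round, removing each winner in turn.

    This loop only mirrors a round-by-round shape; the paper's actual
    iteration rule is not described in the abstract.
    """
    scores = weighted_tf_idf(header, body, corpus, header_boost)
    selected = []
    for _ in range(top_k):
        if not scores:
            break
        # Ties are broken by the term itself so the result is deterministic.
        best = max(scores, key=lambda t: (scores[t], t))
        selected.append(best)
        del scores[best]
    return selected


corpus = [
    "retrieval of topic information",
    "the cat sat on the mat",
    "the dog ate the bone",
]
keyphrases = extract_keyphrases(
    header="topic retrieval",
    body="the retrieval system ranks the topic documents",
    corpus=corpus,
)
```

In this toy input, the header terms outrank the frequent but uninformative word "the" because of both the IDF factor and the assumed header boost.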

  • 【Source】 Journal of South China University of Technology (Natural Science) (华南理工大学学报(自然科学版)), 2004, Issue S1
  • 【Classification No.】 TP393.09
  • 【Times cited】 5
  • 【Downloads】 235