Chinese Keyword Extraction Algorithm Based on Neighbour Words (基于相邻词的中文关键词自动抽取)
【Abstract】 Document keywords summarize the topic and content of a document and have important applications in information retrieval, text classification, and text clustering. Building on previous research, this paper proposes an algorithm for automatic Chinese keyword extraction based on neighbour words. In experiments that automatically extracted keywords from 50 academic papers, the algorithm achieved a precision of 38.9% and a recall of 34.9% under exact-match evaluation, and a precision of 70.7% and a recall of 68.8% under near-match evaluation. These results can support further research.
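【Funding】 National "973" Program of China (2004CB318108); National Natural Science Foundation of China (60223004, 60321002, 60303005, 60503064); Key Project of Science and Technology Research of the Ministry of Education (104236)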
- 【Source】 Journal of Guangxi Normal University (Natural Science Edition) (广西师范大学学报(自然科学版)), 2007, Issue 02
- 【Classification No.】 TP391.1
- 【Citations】 21
- 【Downloads】 401
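
The abstract reports precision and recall under both exact-match and near-match evaluation, but this record does not say how a near match is defined. The following is a minimal sketch of how such scores could be computed for one document. The function names and sample keywords are hypothetical, and `near_match` assumes substring containment, which may differ from the criterion the authors actually used.

```python
def precision_recall(extracted, reference, match):
    """Return (precision, recall) for one document's keyword lists."""
    # An extracted keyword counts as correct if it matches any reference keyword.
    correct = sum(1 for e in extracted if any(match(e, r) for r in reference))
    # A reference keyword counts as covered if any extracted keyword matches it.
    covered = sum(1 for r in reference if any(match(e, r) for e in extracted))
    precision = correct / len(extracted) if extracted else 0.0
    recall = covered / len(reference) if reference else 0.0
    return precision, recall


def exact_match(a, b):
    # Exact match: the two keyword strings are identical.
    return a == b


def near_match(a, b):
    # ASSUMPTION: a near match means one keyword contains the other.
    # The source record does not define its near-match criterion.
    return a in b or b in a


# Hypothetical sample data, not taken from the paper's experiments.
extracted = ["关键词", "自动抽取", "文本分类"]
reference = ["关键词抽取", "自动抽取", "相邻词"]

print(precision_recall(extracted, reference, exact_match))
print(precision_recall(extracted, reference, near_match))
```

In this sample, relaxing the matcher lets "关键词" count against the reference keyword "关键词抽取", so both scores rise under near matching. That is the same direction as the gap between the exact-match and near-match figures in the abstract.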