
Mechanism for Title Classification Based on Conceptual and Associated Expansion

(Original title: 基于概念和关联扩充的文本标题分类机制)


【Authors】 ZHENG Hai (1), LIN Hong-fei (2), YANG Zhi-hao (2), FU Jian-wen (2)

【Affiliations】 (1) Department of Navigation, Dalian Naval Academy, Dalian 116018, Liaoning, China; (2) Department of Computer Science, Dalian University of Technology, Dalian 116024, Liaoning, China

【Abstract】 Text classification is an important means of processing machine-readable text. This paper presents a text classification mechanism based on titles. The basic idea is as follows: because titles are both important and concise, a Chinese semantic classification tree is used for concept expansion, and an association matrix derived from a corpus is used for association expansion, enriching the semantic content of a title through synonymy and collocation relationships. The similarity between the expanded feature vector of each class and that of the title then determines the class to which the text belongs, yielding relatively high classification accuracy. The method does not depend on a Chinese parser or on domain-specific knowledge bases, is relatively fast, and is widely applicable.
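The pipeline summarized in the abstract (concept expansion, association expansion, then similarity against class vectors) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy `SEMANTIC_CLASS` and `ASSOCIATION` tables stand in for the Chinese semantic classification tree and the corpus-derived association matrix, the weights and threshold are arbitrary, and cosine similarity is one plausible choice of similarity measure that the abstract does not name.

```python
from collections import Counter
from math import sqrt

# Hypothetical stand-in for a semantic classification tree: word -> concept node.
SEMANTIC_CLASS = {
    "football": "sport", "soccer": "sport", "match": "sport",
    "stock": "finance", "shares": "finance", "market": "finance",
}

# Hypothetical stand-in for a corpus association matrix: word -> {word: strength}.
ASSOCIATION = {
    "football": {"league": 0.8, "coach": 0.6},
    "stock": {"index": 0.9, "investor": 0.7},
}


def expand_title(words, concept_weight=0.5, assoc_threshold=0.5):
    """Build a weighted feature vector for a title after both expansions."""
    vec = Counter()
    for w in words:
        vec[w] += 1.0
        # Concept expansion: add words that share the same concept node.
        concept = SEMANTIC_CLASS.get(w)
        if concept is not None:
            for other, other_concept in SEMANTIC_CLASS.items():
                if other_concept == concept and other != w:
                    vec[other] += concept_weight
        # Association expansion: add strongly associated words from the matrix.
        for other, strength in ASSOCIATION.get(w, {}).items():
            if strength >= assoc_threshold:
                vec[other] += strength
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(words, class_vectors):
    """Return the class whose vector is most similar to the expanded title."""
    title_vec = expand_title(words)
    return max(class_vectors, key=lambda c: cosine(title_vec, class_vectors[c]))


# Hypothetical class feature vectors (in practice these would be learned from a corpus).
CLASS_VECTORS = {
    "sports": Counter({"soccer": 1.0, "league": 1.0, "coach": 1.0}),
    "finance": Counter({"shares": 1.0, "index": 1.0, "investor": 1.0}),
}

if __name__ == "__main__":
    print(classify(["football", "final"], CLASS_VECTORS))
```

In this toy setup the raw title `["football", "final"]` shares no word with either class vector, so a similarity computed on the unexpanded title would be zero for both classes; the expansions are what give the short title enough overlap to be classified.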

【Funding】 Supported by the National Natural Science Foundation of China (60373095)
  • 【Source】 小型微型计算机系统 (Mini-micro Systems), 2005, Issue 05
  • 【Classification No.】 TP391.1
  • 【Times cited】 4
  • 【Downloads】 169