An On-line Text Categorization Algorithm Based on Information Fusion: Semantic SVM
【Abstract】 This paper aims to make support vector machines (SVMs) better suited to on-line text categorization. SVMs retain good generalization ability even with small training sets, and text feature vectors tend to cluster in the feature space. Building on these two properties, the paper proposes the semantic Support Vector Machine (semantic SVM), which replaces the original training set with a semantic center set that serves both as the training samples and as the support vector candidates. The paper gives the steps for generating the semantic center set, the framework of the on-line learning algorithm for the semantic SVM, and an implementation of that algorithm based on Sequential Minimal Optimization (SMO). Experimental results show that, compared with the standard SVM, the semantic SVM and its on-line learning algorithm improve on-line learning speed and classification speed by orders of magnitude, while also holding some advantage in classification accuracy.
【Key words】 text categorization; Support Vector Machine; semantic Support Vector Machine; on-line learning
- 【Source】 Journal of South China University of Technology (Natural Science), 2004, Issue S1
- 【Classification No.】 TP391.1
- 【Times cited】 3
- 【Downloads】 355
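
The abstract's central idea, training on a small set of cluster centers instead of on every sample, can be illustrated with a minimal sketch. The abstract does not say how the semantic center set is built or updated, so everything here is an assumption: `add_sample` uses a simple fixed-radius rule (join the nearest same-label center or start a new one), `train_linear_svm` is a plain subgradient-descent hinge-loss trainer standing in for the SMO-based implementation the paper describes, and all names, the `radius` threshold, and the toy 2-D data are hypothetical.

```python
import math

def add_sample(centers, x, y, radius=1.0):
    """Fold one labelled sample into the center set (the on-line step).

    centers: list of [vector, label, count] entries, modified in place.
    The sample joins the nearest same-label center if it lies within
    `radius` (an assumed threshold); otherwise it starts a new center.
    """
    best = None
    for c in centers:
        if c[1] == y and (best is None or math.dist(c[0], x) < math.dist(best[0], x)):
            best = c
    if best is not None and math.dist(best[0], x) <= radius:
        best[2] += 1
        # Running mean keeps the center at the average of its members.
        best[0] = [ci + (xi - ci) / best[2] for ci, xi in zip(best[0], x)]
    else:
        centers.append([list(x), y, 1])

def train_linear_svm(centers, lr=0.1, lam=0.01, epochs=50):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1/-1.

    This is a stand-in trainer, not SMO.
    """
    w = [0.0] * len(centers[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y, _count in centers:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]   # regularization shrink
            if margin < 1:                           # hinge loss is active
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Return +1 or -1 according to the sign of the decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy 2-D "feature vectors": two tight clusters per class.
samples = [((2.0, 2.0), 1), ((2.2, 2.2), 1), ((4.0, 1.0), 1), ((4.2, 1.2), 1),
           ((-2.0, -2.0), -1), ((-2.2, -2.2), -1),
           ((-1.0, -4.0), -1), ((-1.2, -4.2), -1)]

centers = []
for x, y in samples:
    add_sample(centers, x, y)

# The classifier is trained on the 4 centers rather than the 8 samples.
w, b = train_linear_svm(centers)
```

In this sketch a new labelled document only touches one center, so the training set handed to the SVM grows with the number of clusters rather than the number of documents. Whether that matches the paper's actual on-line learning framework cannot be determined from the abstract alone.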