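A Survey of Text Categorization and a Study of Filtering Methods for Mobile Phone Junk Short Messages
(Published English title: Text Categorization and Filtering Method for Chinese Junk Short Message)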
【Abstract】 This paper introduces the text categorization problem and discusses the key techniques it involves, including Chinese word segmentation, text representation, and feature selection and extraction, together with the principles and methods of text categorization algorithms such as Rocchio, Naive Bayes, K-nearest neighbors (KNN), decision trees, neural networks, and support vector machines (SVM). Finally, it presents experiments and results for a Chinese junk short message filtering method based on text categorization techniques.
【Key words】 text categorization; feature selection and extraction; categorization algorithm; junk short message filtering
【Funding】 Natural Science Foundation of Hebei Province (603073)
- 【Source】 Journal of Hebei University of Technology (河北工业大学学报), 2007, Issue 01
- 【Classification No.】 TP391.1
- 【Times cited】 29
- 【Downloads】 979