Dimension Reduction for Text Expression Based on Semantic Similarity
【Abstract】 Dimension reduction is an indispensable step in text representation. An effective dimension reduction method not only reduces the amount of computation but also helps improve the accuracy of text processing. Unlike traditional methods, which reduce dimensionality using statistical information about words, this paper proposes a dimension reduction method for text representation based on the semantic similarity of words. The method draws on natural language processing knowledge and takes both the semantic information and the part-of-speech (POS) information of feature terms into account during dimension reduction. Experimental results show that the method effectively reduces the dimensionality of the text representation and achieves relatively high text processing accuracy in the reduced space, which suggests that semantic-similarity-based dimension reduction is well suited to text processing.
- 【Source】 Journal of Henan University of Science & Technology (Natural Science), 2008, Issue 05
- 【Classification code】 TP391.1
- 【Times cited】 9
- 【Downloads】 292