Research on LDA-Based Takeout User Review Mining and Sentiment Analysis (基于LDA的外卖用户评论挖掘与情感分析研究)

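【Author】 Huang Ting (黄婷)

【Advisors】 Duan Longzhen (段隆振); Xiong Jun (熊均)

【Author information】 Nanchang University, Software Engineering (professional degree), 2022, Master's degree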

【Abstract】 Thanks to the rapid growth of the Internet and the rise of e-commerce, takeout platforms have expanded quickly and more and more consumers order meals through them. The volume of takeout review data has grown accordingly, and this mass of reviews holds considerable commercial value for merchants, takeout platforms, and consumers themselves. This thesis analyzes takeout reviews from the Meituan platform from two angles, sentiment analysis and topic mining, and proposes improvements based on the results. The main research contents are as follows:

(1) To address the poor sentiment recognition caused by an incomplete base sentiment lexicon, this thesis fuses Word2Vec with SO-PMI, taking into account both the contextual semantic association between candidate sentiment words and their sentiment co-occurrence probability. The fusion expands the set of positive and negative sentiment words for the catering takeout domain, and the expanded words are added to the original sentiment lexicon to make it more complete. Experimental results show that fusing Word2Vec and SO-PMI improves sentiment classification accuracy: it is 4.2% higher than with the general sentiment lexicon, and 2.5% and 1% higher than with Word2Vec alone and SO-PMI alone, respectively. This verifies that the domain sentiment lexicon built by combining SO-PMI and Word2Vec is more effective for sentiment classification.

(2) To improve sentiment classification performance, this thesis combines the optimized, expanded sentiment lexicon with different sentiment classification models. Experiments show that the combination clearly outperforms using the sentiment lexicon alone or using a sentiment classifier such as SVM, Naive Bayes, or LSTM alone. For better generalization and finer sentiment discrimination, the thesis also tunes parameters such as the Laplace smoothing parameter of Naive Bayes and the Gaussian kernel parameter of SVM. Repeated comparative experiments show that combining the fused, expanded sentiment lexicon with the LSTM model gives the highest classification accuracy and the best classification performance among these models.

(3) This thesis uses the LDA topic model to mine the latent topics of positive and negative catering takeout reviews. The optimal number of topics under each sentiment polarity is determined by computing cosine similarity, deeper topic keywords are extracted separately for positive and negative reviews, and pyLDAvis is used to visualize the LDA topics. Finally, based on the combined analysis, reasonable suggestions are offered to merchants and to the platform.
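The SO-PMI scoring mentioned in (1) can be illustrated with a small sketch. It uses the standard definition, SO-PMI(w) = Σ PMI(w, positive seed) − Σ PMI(w, negative seed), and is not the thesis's actual implementation: the toy corpus, the seed words, the document-level co-occurrence window, the smoothing constant, and the function name `so_pmi` are all assumptions made for illustration, and the fusion with Word2Vec is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=1e-6):
    """Return {word: SO-PMI score} for every non-seed word in docs.

    docs is a list of token lists. Co-occurrence is counted per document
    (an assumption; a sliding window is another common choice).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    def pmi(a, b):
        key = (a, b) if a < b else (b, a)
        # Smoothing keeps log2 defined when the pair never co-occurs.
        p_ab = pair_df[key] / n_docs + smoothing
        return math.log2(p_ab / ((word_df[a] / n_docs) * (word_df[b] / n_docs)))

    # Only seeds that actually occur in the corpus can be scored against.
    pos = [s for s in pos_seeds if s in word_df]
    neg = [s for s in neg_seeds if s in word_df]
    seeds = set(pos) | set(neg)
    return {
        w: sum(pmi(w, s) for s in pos) - sum(pmi(w, s) for s in neg)
        for w in word_df
        if w not in seeds
    }


# Hypothetical tokenized reviews and seed words, for illustration only.
reviews = [
    ["tasty", "satisfied", "delicious"],
    ["delicious", "tasty", "recommend"],
    ["awful", "disappointed", "cold"],
    ["cold", "awful", "slow"],
]
scores = so_pmi(reviews, pos_seeds=["tasty"], neg_seeds=["awful"])
positive_candidates = sorted(w for w, s in scores.items() if s > 0)
negative_candidates = sorted(w for w, s in scores.items() if s < 0)
```

Words with a positive score lean toward the positive seeds and become candidates for the positive side of the domain lexicon; words with a negative score become negative candidates. In practice a threshold on the absolute score would likely be needed to filter out weak candidates.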

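The abstract states only that cosine similarity is used to choose the number of topics in (3). One common reading, assumed here, is to train LDA for each candidate topic count K, compute the mean pairwise cosine similarity between the topic-word distributions, and keep the K with the lowest mean, since that gives the most distinct topics. The sketch below uses hand-made topic-word matrices in place of trained LDA models; the function names and all numbers are illustrative, not taken from the thesis.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_topic_similarity(topic_word):
    """Mean pairwise cosine similarity between topic rows (needs >= 2 topics)."""
    pairs = list(combinations(topic_word, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def pick_topic_count(candidates):
    """candidates maps K to a K-row topic-word matrix; return the K whose
    topics are least similar to each other on average."""
    return min(candidates, key=lambda k: mean_topic_similarity(candidates[k]))


# Hypothetical topic-word distributions over a 4-word vocabulary.
candidates = {
    2: [[0.5, 0.5, 0.0, 0.0], [0.4, 0.4, 0.1, 0.1]],
    3: [[0.9, 0.1, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0], [0.0, 0.0, 0.1, 0.9]],
    4: [[0.9, 0.1, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0],
        [0.0, 0.0, 0.9, 0.1], [0.0, 0.0, 0.8, 0.2]],
}
best_k = pick_topic_count(candidates)
```

With a real model, each matrix would come from the trained LDA model's topic-term distribution, and the procedure would be run separately on the positive and the negative review sets.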
  • 【Online publication submitter】 Nanchang University
  • 【Online publication year/issue】 2023, Issue 02
  • 【Classification codes】 TP391.1; F724.6; F719.3