基于改进注意力机制的短文本情感分析研究
Research on Short-Text Sentiment Analysis Based on an Improved Attention Mechanism
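【Author】 Yang Fan (杨帆);
【Advisor】 Mo Yijun (莫益军);
【Author information】 Huazhong University of Science and Technology, Information and Communication Engineering, 2019, Master's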
【Abstract】 With the rapid development of social media, more and more users post comments and share opinions on the Internet. This user-generated data carries considerable commercial value and social information for consumers, enterprises, and government departments, so many researchers have turned to text sentiment analysis to examine the subjective components of text, including user opinions and sentiment orientation. Traditional sentiment analysis focuses on the document level and the sentence level and has achieved good results on some datasets. However, once the entity or object that a sentiment targets is taken into account, finer-grained analysis is needed. This thesis proposes an aspect-level sentiment analysis algorithm based on an improved attention mechanism. First, because short texts are brief and often elliptical, we propose a text feature extraction method based on word co-occurrence, which builds a word co-occurrence matrix weighted by semantic similarity and extracts the co-occurrence features of the text from it. Second, we propose an Aspect-Attention memory network model: a bidirectional LSTM (Bi-LSTM) learns a representation of the aspect features of the text, and an Aspect-Attention mechanism mines the latent semantic connection between the contextual semantics of the text and its aspect features. Finally, we introduce part-of-speech (POS) features and propose a tensor neural network based on multi-feature fusion, which mines the semantic relationships among the POS features, context features, and aspect features and thereby better represents the sentiment orientation linking the context to the aspect. In addition, to remove redundant information from the tensor weights and reduce the number of weight parameters, we apply tensor decomposition to reduce the dimensionality of the tensor weights; the reduced weights still represent the latent associations among the features while keeping the training scale of the model under control. Comparative experiments on the SemEval-2016 Task 5 Chinese dataset show that the proposed algorithm achieves higher classification accuracy and F1 score on aspect-level sentiment classification than the other algorithms compared.
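To make the Aspect-Attention idea in the abstract concrete, the following is a minimal sketch, not the thesis implementation. It assumes a scaled dot-product score between each context state and the aspect vector; the abstract does not specify the scoring function, and the names `aspect_attention`, `states`, and `aspect` are hypothetical. The hand-written vectors stand in for Bi-LSTM outputs.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aspect_attention(hidden_states, aspect_vector):
    """Weight each context state by its similarity to the aspect vector.

    hidden_states: list of T vectors (stand-ins for Bi-LSTM outputs).
    aspect_vector: one vector of the same dimension.
    Returns (attention weights, attention-weighted sum of the states).
    """
    dim = len(aspect_vector)
    # Assumption: scaled dot-product scoring; other scoring functions are possible.
    scores = [dot(h, aspect_vector) / math.sqrt(dim) for h in hidden_states]
    weights = softmax(scores)
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three "words" with two-dimensional states.
states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
aspect = [0.0, 2.0]
weights, context = aspect_attention(states, aspect)
```

In this toy input the second state is most aligned with the aspect vector, so it receives the largest weight, and the first state, which is orthogonal to the aspect, receives the smallest.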
【Key words】 Word co-occurrence feature; Aspect-Attention; Tensor neural network; Tensor decomposition;
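The parameter saving that the abstract attributes to tensor decomposition can be illustrated with a small sketch. It assumes a rank-R factorization of one slice of a bilinear weight tensor, W ≈ Σ_r u_r v_rᵀ; the abstract does not state which decomposition the thesis uses, and every name below (`full_bilinear`, `low_rank_bilinear`, `expand`) is hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def full_bilinear(x, W, y):
    """Compute x^T W y for one dense d1 x d2 slice W of the weight tensor."""
    return sum(x[i] * dot(W[i], y) for i in range(len(x)))


def low_rank_bilinear(x, U, V, y):
    """Same product with W replaced by sum_r outer(U[r], V[r])."""
    return sum(dot(u, x) * dot(v, y) for u, v in zip(U, V))


def expand(U, V):
    """Rebuild the dense slice from its rank-R factors (for checking only)."""
    d1, d2 = len(U[0]), len(V[0])
    return [
        [sum(u[i] * v[j] for u, v in zip(U, V)) for j in range(d2)]
        for i in range(d1)
    ]


random.seed(0)
d1, d2, rank = 6, 5, 2  # illustrative sizes, not taken from the thesis
U = [[random.uniform(-1, 1) for _ in range(d1)] for _ in range(rank)]
V = [[random.uniform(-1, 1) for _ in range(d2)] for _ in range(rank)]
x = [random.uniform(-1, 1) for _ in range(d1)]
y = [random.uniform(-1, 1) for _ in range(d2)]

dense_params = d1 * d2              # weights in the dense slice
factored_params = rank * (d1 + d2)  # weights in the two factor sets
```

The factored form computes the same bilinear score as the dense slice it represents while storing `rank * (d1 + d2)` weights instead of `d1 * d2`, which is the kind of reduction in model size the abstract describes.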