结合评论与评分的用户动态兴趣偏好的推荐算法研究
A Study of Recommendation Algorithms Based on Users' Dynamic Interest Preferences Combining Reviews and Ratings
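【Author】 Li Min (李敏);
【Supervisor】 Zhu Zhiguo (朱志国);
【Author information】 Dongbei University of Finance and Economics, Management Science and Engineering, 2017, Master's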
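【Abstract (translated from the Chinese)】 In recent years, with the development of Internet technology, recommender systems have been applied in many fields, and research on recommendation algorithms has grown accordingly. Among these, algorithms based on user reviews and ratings are used in many online recommender systems that hold user review data. The core task of a recommender system is to make recommendations to the users who need them, so building a model of user interest preferences is an essential step. Review data are voluminous and lack a central topic; if they are used to build the model without any processing, recommendation becomes inefficient and the user's preferences cannot be captured quickly and effectively. In addition, user preferences are not fixed but change over time, yet earlier studies did not take this into account when analyzing reviews and ratings, which led to poor recommendation quality and poor timeliness. Extracting user preferences from reviews and ratings, and analyzing those preferences dynamically using review time, therefore has significant theoretical and practical value.

A user preference model obtains its information by analyzing reviews and ratings, but the volume of data is very large, and collecting and quickly analyzing it is difficult. A survey of related work shows that user preference models are generally built with either a keyword representation or a topic representation. The keyword representation describes a user through keywords related to the user's everyday behavioral preferences; the topic representation describes the user's interest preferences through topic-type words drawn from the user's reviews. Given its research content, this thesis adopts the topic representation. A preference model built with the topic representation alone is static, however, and cannot adjust quickly as preferences change over time, so tracking dynamic preferences requires further work. To address these challenges, this thesis reviews domestic and international research on preference-based recommendation algorithms and then extends existing algorithms that combine user reviews and ratings. The main work is as follows:
(1) To build the preference model from user reviews, the review dataset is analyzed with an LDA (Latent Dirichlet Allocation) topic model, yielding document-topic and topic-word distribution vectors that support accurate construction of the user preference model in the next step.
(2) Because user preferences change over time, a nonlinear forgetting time function is added on top of the topic model, giving a hybrid preference model based on the topic model and the forgetting time function.
(3) A user's sentiment toward an item is reflected indirectly by the rating the user gives it. Because users rate similar items differently in different time periods, the weight carried by each rating must be distinguished in some way. An exponential time function is therefore applied to user ratings: the shorter the interval between different users' reviews of similar items, the smaller the influence of the rating on the user.
(4) A collaborative filtering algorithm combining user reviews and user ratings is proposed and evaluated on six datasets from Amazon.

The following conclusions are drawn:
(1) When the data volume is very large, an LDA topic model can be used to cluster the information so that user information can be collected and analyzed quickly and user preferences obtained rapidly.
(2) Validation on the datasets shows that adding review time to a preference model based on user reviews and ratings improves prediction accuracy.
(3) In experiments comparing the proposed algorithm with algorithms proposed by other researchers on the same Amazon electronics review dataset, the proposed algorithm achieves considerably higher prediction accuracy.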
【Abstract】 In recent years, with the development of Internet technology, recommender system technology has been applied in many fields, and recommendation algorithms suited to different areas have emerged. Algorithms based on user reviews and ratings have been applied in many online recommender systems that contain user review information. The core of a recommender system is to make recommendations to the users who need them, so building a user interest preference model is an essential step. When the user model is built, the review data are large in volume and have no central topic; if they are used directly without any processing, recommendation efficiency drops and the user's preferences cannot be captured quickly and effectively. Another important factor influencing user preference is time. User preferences are not constant but change over time, yet previous studies analyzed user reviews and ratings without considering this factor, which resulted in poor recommendation quality and poor real-time performance. Extracting user preferences from reviews and ratings, and analyzing them dynamically using review timestamps, therefore has significant theoretical and practical value.

A user preference model obtains its information by analyzing user reviews and ratings, but the amount of data is very large, and collecting and rapidly analyzing it is difficult to implement. A survey of related research shows that user preference models can generally be built with a keyword representation or a topic representation. In the keyword representation, the recommender system describes the user's preferences with a set of keywords. In the topic representation, the recommender system builds the user interest model from topic-type words that reflect the user's interests. Given its research needs, this thesis adopts the topic representation to build the user preference model. At this point the model is static, however, and tracking dynamic user preferences requires further study. To address these challenges, this thesis surveys the current state of domestic and international research on recommender systems and the main recommendation techniques, weighs the advantages and disadvantages of existing approaches, and extends existing recommendation algorithms based on user reviews and ratings. The main research work is as follows:
(1) To build the preference model from user reviews, the review dataset is analyzed with an LDA topic model, yielding document-topic and topic-word distribution vectors that support accurate construction of the user preference model in the next step.
(2) Because user preferences change over time, a nonlinear forgetting time function is added on top of the topic model, giving a hybrid preference model based on the topic model and the forgetting time function.
(3) The user's sentiment toward an item is measured indirectly through the user's rating. Because ratings carry different weight in different time periods, an exponential time function is applied to user ratings: the shorter the time interval between reviews, the smaller the impact of the rating on the user's preferences.
(4) A collaborative filtering algorithm based on user reviews and ratings is proposed and evaluated on six datasets from the Amazon website.

The following conclusions are drawn:
(1) When the amount of data is large, an LDA topic model can be used to cluster the information so that user information can be collected and analyzed rapidly and the user's preferences obtained quickly.
(2) Validation on the datasets shows that adding review time to a user preference model based on reviews and ratings improves prediction accuracy.
(3) On the same dataset, the algorithm proposed in this thesis achieves considerably better prediction accuracy than algorithms proposed by other researchers.
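The abstract names the ingredients of the method (LDA document-topic vectors, a nonlinear forgetting function over review time, and collaborative filtering) but gives no formulas. The Python sketch below shows one way these pieces could fit together; it is not the thesis's implementation. Everything specific in it is an assumption made for illustration: the half-life form of `forgetting_weight`, the use of cosine similarity, the similarity-weighted average in `predict_rating`, and all function and parameter names. Topic vectors are taken as given input (they could come from any LDA implementation), and the exponential time function on ratings is not modelled here.

```python
import math
from typing import List, Optional, Sequence, Tuple

# One review: (document-topic vector for the review, day on which it was written).
Review = Tuple[Sequence[float], float]


def forgetting_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Assumed nonlinear decay: a review loses half its weight every half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def user_profile(reviews: List[Review], now_day: float,
                 half_life_days: float = 30.0) -> List[float]:
    """Forgetting-weighted average of a user's review topic vectors."""
    n_topics = len(reviews[0][0])
    profile = [0.0] * n_topics
    total = 0.0
    for topics, day in reviews:
        w = forgetting_weight(now_day - day, half_life_days)
        total += w
        for k, p in enumerate(topics):
            profile[k] += w * p
    return [v / total for v in profile]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two topic profiles (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def predict_rating(target: Sequence[float],
                   neighbours: List[Tuple[Sequence[float], float]]) -> Optional[float]:
    """User-based collaborative filtering: similarity-weighted mean of the
    ratings that neighbouring users (profile, rating) gave to the item."""
    num = den = 0.0
    for profile, rating in neighbours:
        s = cosine(target, profile)
        num += s * rating
        den += s
    return num / den if den > 0 else None


if __name__ == "__main__":
    # Two reviews over two topics: a recent one on topic 0, an older one on topic 1.
    me = user_profile([([1.0, 0.0], 100), ([0.0, 1.0], 70)], now_day=100)
    others = [([0.9, 0.1], 5.0), ([0.1, 0.9], 2.0)]
    print(me, predict_rating(me, others))
```

With a 30-day half-life, the 30-day-old review counts half as much as the one written today, so the profile leans toward the more recent topic. A static topic-only model would weight both reviews equally.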
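- 【Online publication contributor】 Dongbei University of Finance and Economics 【Online publication year and issue】 2018, Issue 07
- 【Classification code】 TP391.3;F713.36;F274
- 【Citation count】 1
- 【Download count】 170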