基于用户浏览图的网页质量评估方法的比较分析
Analysis and Comparison to Web Page Quality Evaluation Algorithms Based on User Browsing Graph
【Author】 XUE Yufei, LIU Yiqun, ZHANG Min, MA Shaoping, RU Liyun (State Key Lab of Intelligent Technology and Systems; Tsinghua National Laboratory for Information Science and Technology; Department of Computer Science and Technology, Tsinghua University, Beijing 100084, P.R. China)
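【Institution】 State Key Laboratory of Intelligent Technology and Systems; Tsinghua National Laboratory for Information Science and Technology; Department of Computer Science and Technology, Tsinghua University;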
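【Abstract (translated from the Chinese)】 In today's vast and heterogeneous Web data environment, Web page quality evaluation has become one of the main technical challenges facing Internet search engines. The prevailing research approach to evaluating Web pages relies on analysis of the Web's hyperlink structure. However, the emergence of Web 2.0, search engine optimization (SEO), and Web spam has seriously undermined the reliability of hyperlink analysis. Consequently, building a user browsing graph from users' Web access logs has become an important research direction in Web page quality evaluation. Using large-scale real-world user behavior data and page quality evaluation data, this paper compares and analyzes several major page quality evaluation algorithms based on structural analysis of the user browsing graph. The experimental results show that applying traditional link structure analysis algorithms to the user browsing graph achieves good page quality evaluation performance.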
【Abstract】 Web page quality analysis is one of the top challenges for practical commercial search engines. State-of-the-art quality analysis algorithms are mostly based on hyperlink structure analysis. However, this kind of algorithm does not work well because of the changes in Web structure caused by Web 2.0, search engine optimization (SEO), and Web spam. The user browsing graph has therefore attracted much attention as an important tool for page quality estimation. In this paper, we compare three link analysis algorithms (PageRank, TrustRank, and BrowseRank) on a user browsing graph constructed from large-scale Web access log data. Experimental results show that traditional link analysis algorithms perform well on the user browsing graph for the task of page quality estimation.
【Key words】 User browsing graph; PageRank; TrustRank; BrowseRank;
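- 【Proceedings title】 Frontier Advances in Chinese Computational Linguistics Research (2007-2009) (original title: 中国计算机语言学研究前沿进展(2007-2009))
- 【Conference name】 The 10th National Conference on Computational Linguistics (第十届全国计算语言学学术会议)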
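- 【Conference date】 2009-07-24
- 【Conference location】 Yantai, Shandong, China
- 【Classification code】 TP393.092
- 【Organizer】 Chinese Information Processing Society of China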