SHITS: A Web Page Ranking Method Based on Hyperlinks and Content

【Authors】 XIAO Ming-Jun (1), HUANG Liu-Sheng (2), LUO Yong-Long (2)

【Affiliations】 (1) Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; (2) Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China

【Abstract】 This paper reviews the mainstream web page ranking algorithms currently used in large-scale search engines, improves on one of them, the ARC algorithm, and proposes SHITS (Similarity-HITS), a ranking algorithm based on both hyperlinks and content. To evaluate the importance of a hyperlink, SHITS uses the content of the page that the hyperlink points to in place of the anchor text used by ARC. This change improves the algorithm's ability to distinguish important links from unimportant ones and avoids having to analyze large volumes of anchor text. Comparative experiments against related algorithms show that SHITS ranks web pages with markedly higher precision than the other algorithms. SHITS is also efficient: its computational cost is lower than that of ARC and comparable to that of HITS.
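The abstract does not give the SHITS formulas, so the following is only a minimal sketch of the general idea: a HITS-style iteration in which each link is weighted by the content of the page it points to. The weighting used here (1 plus the cosine similarity between the query and the target page's text), the function names, and the toy data are all assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_hits(links, contents, query, iterations=50):
    """HITS-style iteration with content-weighted links.

    links    -- list of (source, target) page ids
    contents -- dict mapping page id to its text
    query    -- query (topic) string
    """
    pages = list(contents)
    # Assumed weight (not from the paper): 1 + similarity between the
    # query and the content of the page the link points to.
    weight = {(p, q): 1.0 + cosine_similarity(query, contents[q])
              for p, q in links}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: weighted sum of hub scores over incoming links.
        auth = {p: 0.0 for p in pages}
        for (p, q), w in weight.items():
            auth[q] += w * hub[p]
        # Hub: weighted sum of authority scores over outgoing links.
        hub = {p: 0.0 for p in pages}
        for (p, q), w in weight.items():
            hub[p] += w * auth[q]
        # Normalize both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return auth, hub


# Toy data (hypothetical): page "a" links to both "b" and "c".
contents = {
    "a": "a list of links",
    "b": "web page ranking with hyperlink analysis",
    "c": "cooking recipes and food",
}
links = [("a", "b"), ("a", "c")]
auth, hub = weighted_hits(links, contents, "web page ranking")
ranked = sorted(contents, key=auth.get, reverse=True)
```

In this toy graph, unweighted HITS would give pages "b" and "c" the same authority score because each has a single incoming link from "a". With the assumed content-based weight, the link to "b" counts for more because the text of "b" overlaps with the query, so "b" ranks above "c". No anchor text is inspected at any point, which mirrors the saving over ARC that the abstract describes.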

【Funding】 Supported by the National "973" Program of China (2003CB17000)
  • 【Source】 Journal of Chinese Computer Systems (小型微型计算机系统), 2006, Issue 12
  • 【Classification No.】 TP391.3
  • 【Citations】 18
  • 【Downloads】 266