
Internet Robot Based on Automatic Classification (基于自动分类的网页机器人)


【Author】 KANG Pingbo (1), WANG Wenjie (2) (1. Graduate School of University of Science and Technology of China, Beijing 100039; 2. Information Science and Engineering School, Graduate School of Chinese Academy of Sciences, Beijing 100039)

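【Affiliation】 Graduate School of University of Science and Technology of China, Beijing 100039; Information Science and Engineering School, Graduate School of Chinese Academy of Sciences, Beijing 100039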

【Abstract】 With the spread and growth of the Internet, the information resources available on the Web keep increasing, and efficient, intelligent tools are needed to collect them. A tool that discovers and collects pages on the WWW is called a crawler, or robot. This paper discusses combining a robot with an automatic text classifier in order to collect pages in the domains a user asks for, that is, pages relevant to a predefined set of topics. The robot follows the links most likely to be relevant and avoids crawling irrelevant links. This saves hardware and network resources and makes the robot more efficient.

【Key words】 Internet robot; text automatic classification; vector space model
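
The abstract names the idea (a robot that consults a text classifier before following links) but not its details, so the following is only a minimal sketch under stated assumptions. It assumes relevance is scored as cosine similarity between term-frequency vectors, one plain form of the vector space model, and that a link inherits the score of the page it was found on. The names `focused_crawl`, `fetch`, and `TOY_WEB`, and the threshold of 0.2, are hypothetical and do not come from the paper.

```python
import heapq
import math
import re
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (a very plain vector space model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def focused_crawl(seed, fetch, topic_text, threshold=0.2, max_pages=100):
    """Best-first crawl that expands links only from pages judged relevant.

    fetch(url) must return (page_text, list_of_links). It is a hypothetical
    callback standing in for real HTTP download and HTML parsing.
    """
    topic = tf_vector(topic_text)
    frontier = [(-1.0, seed)]  # max-heap emulated with negated priorities
    seen = {seed}
    collected = []
    while frontier and len(collected) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = cosine(tf_vector(text), topic)
        if score < threshold:
            continue  # irrelevant page: discard it and follow none of its links
        collected.append((url, round(score, 3)))
        for link in links:
            if link not in seen:
                seen.add(link)
                # Assumption: a link inherits the score of its parent page.
                heapq.heappush(frontier, (-score, link))
    return collected


# Toy in-memory "web" so the sketch runs without network access.
TOY_WEB = {
    "a": ("web crawler robot collects pages", ["b", "c"]),
    "b": ("football match results and scores", ["d"]),
    "c": ("text classification for a web robot", ["e"]),
    "d": ("more football news", []),
    "e": ("crawler robot design notes", []),
}

result = focused_crawl("a", TOY_WEB.__getitem__, "web robot crawler classification")
visited = [url for url, _ in result]
print(visited)  # prints ['a', 'c', 'e']
```

In this toy run page `b` is downloaded, scored as off-topic, and dropped, so page `d`, reachable only through `b`, is never requested. That is the kind of saving the abstract describes. A real robot would replace `fetch` with an HTTP client and an HTML link extractor, and would likely use a trained classifier instead of a single topic vector.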

【Source】 计算机工程 (Computer Engineering), 2003, Issue 21

【Classification No.】 TP393.09
【Times cited】 8
【Downloads】 161
