The Application Research of Data Mining Technique on Raising Accessing Speed of Web User
【Author】 Hu Yonghui;
【Advisor】 Meng Zhiqing;
【Author information】 Xiangtan University, Computer Application Technology, 2005, Master's
【Abstract】 The World Wide Web (WWW) is an open, global information resource, and data mining extracts useful, previously unknown information hidden in large volumes of data. Using data mining to extract valuable knowledge from the WWW intelligently and automatically, and thereby improve the efficiency of the WWW, therefore has considerable practical significance and broad application prospects. This thesis first points out the network infrastructure bottleneck of the WWW and from it derives the topic of the work. Then, building on a study of the WWW, data mining, Web data mining, association rules, and temporal association rules, it proposes a method that uses association rules and temporal association rules to improve the access speed of Web users: the server's access logs are mined to obtain (temporal) association rules over user access sequences, and these rules are applied while a client browses so that the pages the user is most likely to visit next are sent to the user's machine in advance. The thesis discusses in detail the preprocessing of server log files, the mining of association rules from the user access data set, and the use of the resulting rules to implement server-side pre-fetching. It also studies how temporal association rules improve page access speed in two cases, the same page and different pages, and gives concrete algorithms. Experiments show that the pre-fetch system optimizes the user's Web page access fairly well, that temporal association rules can roughly predict when the next page will be visited, and that the method improves the speed at which Web users access a site fairly effectively. The thesis ends with a summary of the work.
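The rule-based pre-fetching idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the function names `mine_rules` and `pages_to_prefetch`, the thresholds, and the sample sessions are all hypothetical, the sessions are assumed to have already been extracted from the server log, and the temporal part (predicting when the next page is requested) is left out.

```python
from collections import Counter, defaultdict


def mine_rules(sessions, min_support=2, min_confidence=0.5):
    """Mine simple 'page a is followed by page b' rules from user sessions.

    Assumption: each session is an ordered list of page paths for one visit.
    Support is the number of sessions containing the step a -> b;
    confidence is that number divided by the sessions containing a.
    """
    page_count = Counter()  # sessions that contain a page
    pair_count = Counter()  # sessions that contain the step a -> b
    for session in sessions:
        page_count.update(set(session))
        # Count each consecutive step at most once per session.
        pair_count.update(set(zip(session, session[1:])))

    rules = defaultdict(list)
    for (a, b), support in pair_count.items():
        confidence = support / page_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[a].append((b, confidence))

    # Most confident successor first; ties broken by page name.
    for successors in rules.values():
        successors.sort(key=lambda rule: (-rule[1], rule[0]))
    return dict(rules)


def pages_to_prefetch(rules, current_page, limit=1):
    """Return up to `limit` pages worth sending ahead of the next request."""
    return [page for page, _ in rules.get(current_page, [])[:limit]]


# Hypothetical sessions, standing in for a preprocessed access log.
sessions = [
    ["/index", "/news", "/sports"],
    ["/index", "/news"],
    ["/index", "/about"],
    ["/news", "/sports"],
]
rules = mine_rules(sessions)
candidates = pages_to_prefetch(rules, "/index")
```

In this sketch a server would call `pages_to_prefetch` each time a page is requested and push the returned pages to the client; raising `min_confidence` trades fewer wasted transfers for fewer successful predictions.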
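- 【Online publication contributor】 Xiangtan University 【Online publication year and issue】 2006, Issue 05
- 【Classification number】 TP393.092
- 【Citation count】 4
- 【Download count】 521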