Study on an Algorithm for Parsing News Web Pages Using Regular Expressions
【Abstract】 This paper analyzes the structural characteristics of news web pages and proposes an algorithm that uses regular expressions to parse them, which sidesteps the difficulty of implementing web page cleaning algorithms. The speed and accuracy of the algorithm are evaluated, and the evaluation results are reported.
- 【Source】 农业图书情报学刊 (Journal of Library and Information Sciences in Agriculture), 2005, Issue 04
- 【Classification No.】 G210.7
- 【Times Cited】 24
- 【Downloads】 387