ESPM: An Algorithm to Mine Frequent Subtrees
【Abstract】 With the growth of the Internet, frequent pattern mining has expanded from frequent itemsets to structured data such as trees and graphs. Mining these structures is applied in more complex domains, including bioinformatics, web log mining, and XML documents. This paper presents a novel algorithm, ESPM (expanded subtree pattern miner), for discovering frequent subtrees in ordered labeled trees. Unlike previous work, ESPM defers tree isomorphism checking to a late stage of the algorithm, which reduces the time cost of the whole mining process. Experiments on synthetic and real datasets show that ESPM outperforms other algorithms. Finally, some possible improvements to ESPM are discussed.
- 【Source】 Journal of Computer Research and Development (计算机研究与发展), 2004, Issue 10
- 【Classification Number】 TP311.13
- 【Times Cited】 48
- 【Downloads】 349