节点文献

RQIC:一种高效时序相似搜索算法

RQIC:An Efficient Similarity Searching Algorithm on Time Series

  • 推荐 CAJ下载
  • PDF下载
  • 不支持迅雷等下载工具,请取消加速工具后下载。

【作者】 蒋涛冯玉才朱虹李国徽

【Author】 Jiang Tao,Feng Yucai,Zhu Hong,and Li Guohui(College of Computer Science & Technology,Huazhong University of Science and Technology,Wuhan 430074)

【机构】 华中科技大学计算机科学与技术学院

【摘要】 索引大规模时序数据库是高效时序搜索中的关键问题.提出了一种新颖的索引方案RQI,它包括3种过滤策略:即first-k过滤、索引低边界和上边界以及三角不等式修剪.基本的思想为首先运用Haar小波变换计算每个时序的小波系数,利用前面的k个小波系数形成一个最小边界矩阵,以利用点过滤方法;然后将预先计算每个时序的低边界特征和上边界特征存放到索引当中;最后采用三角不等式来修剪不相似的序列并确保没有漏报.同时提出了一种新的低边界距离函数SLBS和聚类算法CSA.通过CSA可保持索引良好的聚类特征以提高点过滤方法的效率,从而引入了一种更好的算法RQIC.在合成数据集和实时数据集的大量对比实验表明,RQIC是有效的且具备较高的查询效率.

【Abstract】 Indexing large time series databases is crucial for efficient search of time series queries.An index scheme RQI(range query based on index) is introduced,which includes three filtering methods:first-k filtering,indexing lower bounding and upper bounding as well as triangle inequality pruning.The basic idea is calculating wavelet coefficients for each time series based on Haar wavelet transform.The first k coefficients are used to form a MBR(minimal bounding rectangle).Thus,the point filtering method can be used.In addition,lower bounding and upper bounding features of each time series are calculated in advance,which are stored into index structure.Finally,triangle inequality pruning method is adopted by calculating the distance between time series beforehand.Moreover,a new lower bounding distance function SLBS(symmetrical lower bounding based on segment) and a novel clustering algorithm CSA(clustering based on segment approximation) are proposed.The aim is to further improve the search efficiency of point filtering method by keeping a good clustering trait of index structure,thereby obtaining a better algorithm RQIC(range query based on Index and clustering).A lot of compared experiments are conducted over both synthetic and real data sets for RQIC.The results show that RQIC provides perfect pruning ability and high search efficiency.

【关键词】 数据挖掘算法索引聚类时间序列相似搜索
【Key words】 data miningalgorithmindexingclusteringtime seriessimilarity search
【基金】 国家“八六三”高技术研究发展计划基金项目(2006AA01Z430,2007AA01Z309)~~
  • 【文献出处】 计算机研究与发展 ,Journal of Computer Research and Development , 编辑部邮箱 ,2009年05期
  • 【分类号】TP311.13
  • 【被引频次】4
  • 【下载频次】228
节点文献中: 

本文链接的文献网络图示:

本文的引文网络