A Frequent Pattern Based Time Series Classification Framework
【Abstract】 Feature extraction and feature selection are two important problems in time series classification. This paper presents the MNOE (Mining Non-Overlap Episode) algorithm, which finds non-overlapping frequent patterns (episodes) in a time series and uses them as its features. Based on these non-overlapping episodes, the EGMAMC (Episode Generated Mixed memory Aggregation Markov Chain) model is proposed to describe the time series. Following the principle of the likelihood ratio test, the paper derives theoretically the relationship between the support of an episode (the number of times it occurs in the time series) and whether the EGMAMC model describes the time series significantly. Using the definition of information gain, the frequent patterns that describe the time series significantly are then selected as the features fed to the classification model. Experiments on public UCI (University of California Irvine) datasets and on real smart building datasets show that classification with the selected frequent patterns as features outperforms classification without them in precision, recall, and F-Measure. Efficiently computing and effectively selecting non-overlapping frequent patterns as time series features thus helps improve the evaluation metrics of time series classification models.
【Key words】 Time series classification; Frequent pattern mining; Smart building;
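As a rough illustration of the two ideas the abstract combines, the sketch below counts non-overlapping occurrences of a serial episode in a symbol sequence and scores a binary "episode present" feature by information gain. It is not the paper's MNOE algorithm, and it does not reproduce the EGMAMC likelihood ratio test: the function names, the greedy left-to-right counting rule, and the `min_count` threshold are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def count_non_overlapped(sequence, episode):
    """Greedily count occurrences of a serial episode that do not overlap.

    Assumption: an occurrence is the episode's symbols appearing in order
    (gaps allowed), and the next occurrence may start only after the
    previous one has finished.
    """
    count, pos = 0, 0
    for symbol in sequence:
        if symbol == episode[pos]:
            pos += 1
            if pos == len(episode):  # one full occurrence completed
                count += 1
                pos = 0              # start looking for the next one
    return count


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_present, labels):
    """Information gain of a binary feature with respect to the class labels."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for f, y in zip(feature_present, labels) if f == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def episode_feature(series_list, episode, min_count=1):
    """Hypothetical feature: does the episode occur at least min_count times?"""
    return [count_non_overlapped(s, episode) >= min_count for s in series_list]
```

For example, given discretized series `["ABCABC", "ABXABC", "XYZXYZ", "XXYYZZ"]` with labels `["on", "on", "off", "off"]`, `information_gain(episode_feature(series, "ABC"), labels)` scores how well the presence of the episode `ABC` separates the two classes; episodes with higher scores would be kept as features.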
- 【Source】 电子与信息学报 (Journal of Electronics & Information Technology), 2010, Issue 02
- 【Classification number】 TP311.13
- 【Citations】 8
- 【Downloads】 314