Customer Activity Sequence Classification for Debt Prevention in Social Security

【Authors】 Huaifeng Zhang (张淮风), Longbing Cao (操龙兵), Chengqi Zhang (张成奇), Hans Bohlscheid

【Author】 Huaifeng Zhang 1, Yanchang Zhao 2, Longbing Cao 2, Chengqi Zhang 2. 1 Payment Reviews Branch, Business Integrity Division, Centrelink, Canberra, Australia; 2 Centre for Quantum Computation and Intelligent Systems (QCIS), University of Technology, Sydney, Australia

【Affiliations】 Payment Reviews Branch, Business Integrity Division, Centrelink; Centre for Quantum Computation and Intelligent Systems (QCIS), University of Technology, Sydney

【Abstract】 From a data mining perspective, sequence classification builds a classifier from frequent sequential patterns. However, mining a complete set of sequential patterns on a large dataset can be extremely time-consuming, and the large number of patterns discovered makes pattern selection and classifier building very time-consuming as well. In sequence classification, discovering discriminative patterns matters much more than discovering a complete pattern set. In this paper, we propose a novel hierarchical algorithm that builds sequential classifiers from discriminative sequential patterns. First, we mine the sequential patterns that are most strongly correlated with each target class; in this step, an aggressive strategy is employed to select a small set of sequential patterns. Second, pattern pruning and a serial coverage test are applied to the mined patterns, and the patterns that pass the serial coverage test are used to build the sub-classifier at the first level of the final classifier. Third, the training samples that cannot be covered are fed back to the sequential pattern mining stage with updated parameters. This process continues until predefined interestingness measure thresholds are reached or all samples are covered. The patterns generated in each loop form the sub-classifier at the corresponding level of the final classifier. Within this framework, the search space can be reduced dramatically while good classification performance is still achieved. The proposed algorithm is tested in a real-world business application for debt prevention in the social security area, where it proves effective and efficient at predicting debt occurrences from customer activity sequence data.
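
The three-step loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the correlation measure, the pruning rule, or how parameters are updated, so the sketch assumes rule confidence as the correlation measure, a `top_k` cut-off as the aggressive selection, and thresholds multiplied by a `relax` factor at each level. All function names, parameter values, and the toy activity codes (A, B, C, D) are hypothetical.

```python
from collections import Counter
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if the pattern's items occur in the sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def candidate_patterns(sequences, max_len):
    """Enumerate distinct subsequences up to max_len (brute force, toy data only)."""
    found = set()
    for seq in sequences:
        for n in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), n):
                found.add(tuple(seq[i] for i in idx))
    return found

def mine_rules(samples, min_support, min_conf, max_len, top_k):
    """Step 1 (assumed form): keep the top_k rules ranked by confidence, then support."""
    n = len(samples)
    rules = []
    for pat in candidate_patterns([seq for seq, _ in samples], max_len):
        hits = Counter(y for seq, y in samples if is_subsequence(pat, seq))
        total = sum(hits.values())
        for label, count in hits.items():
            support, conf = count / n, count / total
            if support >= min_support and conf >= min_conf:
                rules.append((conf, support, pat, label))
    # Deterministic order: best confidence, best support, shortest pattern first.
    rules.sort(key=lambda r: (-r[0], -r[1], len(r[2]), r[2], r[3]))
    return rules[:top_k]

def serial_coverage(rules, samples):
    """Step 2 (assumed form): keep a rule only if it correctly covers a new sample."""
    covered = [False] * len(samples)
    kept = []
    for _conf, _support, pat, label in rules:
        newly = [i for i, (seq, y) in enumerate(samples)
                 if not covered[i] and y == label and is_subsequence(pat, seq)]
        if newly:
            kept.append((pat, label))
            for i in newly:
                covered[i] = True
    uncovered = [s for s, done in zip(samples, covered) if not done]
    return kept, uncovered

def build_hierarchy(samples, min_support=0.3, min_conf=0.9,
                    floor_conf=0.6, relax=0.8, max_len=2, top_k=5):
    """Step 3: feed uncovered samples back to mining with relaxed thresholds."""
    levels, remaining = [], list(samples)
    while remaining and min_conf >= floor_conf:
        rules = mine_rules(remaining, min_support, min_conf, max_len, top_k)
        kept, remaining = serial_coverage(rules, remaining)
        if kept:
            levels.append(kept)  # one sub-classifier per loop
        min_support *= relax
        min_conf *= relax
    default = Counter(y for _, y in samples).most_common(1)[0][0]
    return levels, default

def predict(levels, default, sequence):
    """Try the sub-classifiers level by level; the first matching rule decides."""
    for level in levels:
        for pat, label in level:
            if is_subsequence(pat, sequence):
                return label
    return default

# Hypothetical toy data: activity-code sequences labelled by outcome.
train = [
    (["A", "B", "C"], "debt"), (["A", "C"], "debt"), (["B", "A", "C"], "debt"),
    (["B", "D"], "no_debt"), (["D", "B"], "no_debt"), (["D", "A"], "no_debt"),
    (["A", "B"], "debt"), (["B", "A"], "no_debt"),
]
levels, default = build_hierarchy(train)
```

On this toy data the first loop covers six samples with two single-item rules, and the two remaining samples are only separated in a second loop, once the thresholds have been relaxed, by order-sensitive two-item patterns. That is the sense in which each loop contributes one level of the final classifier.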

【Funding】 Supported by the Australian Research Council Linkage Project under Grant No. LP0775041; the Early Career Researcher Grant under Grant No. 2007002448 from the University of Technology, Sydney, Australia
  • 【Source】 Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2009, Issue 06
  • 【Classification codes】 F224; F274
  • 【Download count】 93