Improved Boosting algorithm for small and imbalanced training sets

【Authors】 CHENG Youlong (程有龙)1, ZHUANG Liansheng (庄连生)1, LI Bin (李斌)1,2, ZHUANG Zhenquan (庄镇泉)2

【Affiliations】 1. MOE-Microsoft Key Laboratory of Multimedia Computing and Communication, Hefei 230027, China; 2. Department of Electronic Science and Technology, University of Science and Technology of China, Hefei 230027, China

【Abstract】 Classifiers trained with traditional Boosting algorithms tend to overfit and to be biased towards the majority class on small and imbalanced training sets. To address this issue, an improved Boosting learning algorithm based on adaptive sample injection and feature knock-out is proposed. During training, synthetic samples are added to the original training set to rebalance it gradually. The synthetic samples also perturb the learning of the classifier so that it selects more effective features, which improves its generalization ability. The method was tested on both two-class and multi-class image classification problems. Experimental results show that when the number of training samples is small and the numbers of positive and negative samples are highly imbalanced, the proposed method effectively improves the generalization performance of Boosting algorithms.
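The abstract does not give the details of the procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an AdaBoost-style loop over decision stumps with labels in {+1, -1}. It also assumes that "feature knock-out" means copying a minority sample and replacing one randomly chosen feature with the value from another training sample, and that injection stops once the classes are balanced. All function names and parameters (`knock_out`, `boost_with_injection`, `inject_per_round`) are hypothetical.

```python
import math
import random


def knock_out(sample, donor, rng):
    """Copy `sample` and replace one random feature with the donor's value."""
    synthetic = list(sample)
    j = rng.randrange(len(sample))
    synthetic[j] = donor[j]
    return synthetic


def fit_stump(X, y, w):
    """Exhaustively search for the stump (feature, threshold, sign, error)
    with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for s in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if (s if x[j] >= t else -s) != yi)
                if best is None or err < best[3]:
                    best = (j, t, s, err)
    return best


def stump_predict(stump, x):
    j, t, s, _ = stump
    return s if x[j] >= t else -s


def boost_with_injection(X, y, rounds=10, inject_per_round=2, seed=0):
    """AdaBoost-style training that injects synthetic minority samples
    before each round (assumed scheme, see the lead-in)."""
    rng = random.Random(seed)
    X = [list(x) for x in X]          # work on copies, inputs stay untouched
    y = list(y)
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _round in range(rounds):
        pos = [i for i, yi in enumerate(y) if yi == 1]
        neg = [i for i, yi in enumerate(y) if yi == -1]
        minority, label = (pos, 1) if len(pos) < len(neg) else (neg, -1)
        gap = abs(len(pos) - len(neg))
        if minority:
            # Inject at most `gap` samples, so injection stops once balanced.
            for _k in range(min(inject_per_round, gap)):
                base = X[rng.choice(minority)]
                donor = X[rng.randrange(len(X))]
                X.append(knock_out(base, donor, rng))
                y.append(label)
                w.append(sum(w) / len(w))  # assumption: average weight
        total = sum(w)
        w = [wi / total for wi in w]
        stump = fit_stump(X, y, w)
        err = min(max(stump[3], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Standard AdaBoost reweighting: misclassified samples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, x))
             for x, yi, wi in zip(X, y, w)]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

Setting `inject_per_round=0` reduces the sketch to plain AdaBoost, which gives a baseline for comparing the effect of injection on a held-out set.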

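【Funding】 Supported by the National Natural Science Foundation of China (U0835002) and the MOE-Microsoft Key Laboratory Research Fund (07122808)
  • 【Source】 Journal of University of Science and Technology of China (中国科学技术大学学报), 2010, Issue 02
  • 【Classification code】 TP18
  • 【Citation count】 5
  • 【Download count】 376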