Research on Demographic Prediction Based on Ensemble Classifiers
【Abstract】 User attributes play an important role in personalized services, and predicting them from mobile phone data has gradually become a new research direction. Using two independent attributes, the average daily usage time of application categories and the number of application categories, this paper proposes the concepts of basic attributes and auxiliary attributes. First, the auxiliary attributes of all unlabeled samples are discretized with an unsupervised method. The class-based Hellinger distance of the auxiliary attributes is then used as the feature weight of the basic attributes, and the product of each basic attribute and its weight serves as the feature for training the base classifiers of the ensemble. The out-of-bag accuracy used in random forests is introduced as the weight of each base classifier, and the weighted base classifiers yield the final classification result. Experimental results show that the proposed ensemble classifier framework improves user attribute prediction.
【Key words】 User attribute prediction; Smartphones; Discretization; Hellinger Distance; Feature weight;
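The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning, the restriction to two classes, and every function name below are assumptions made for illustration, since the abstract does not specify the discretization method or the number of classes.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Unsupervised discretization into equal-width bins (an assumed method)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def bin_distribution(bins, n_bins):
    """Relative frequency of each bin index in 0..n_bins-1."""
    counts = Counter(bins)
    return [counts.get(b, 0) / len(bins) for b in range(n_bins)]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same support."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def feature_weight(aux_values, labels, n_bins=4):
    """Class-based Hellinger distance of one auxiliary attribute (two classes assumed)."""
    bins = equal_width_bins(aux_values, n_bins)
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    per_class = [
        bin_distribution([b for b, y in zip(bins, labels) if y == c], n_bins)
        for c in classes
    ]
    return hellinger(per_class[0], per_class[1])


def weighted_vote(predictions, weights):
    """Combine base-classifier outputs, each weighted e.g. by its out-of-bag accuracy."""
    scores = Counter()
    for label, w in zip(predictions, weights):
        scores[label] += w
    return max(scores, key=scores.get)


# Hypothetical toy data: an auxiliary attribute and a binary label per user.
aux = [1, 2, 9, 10]
labels = [0, 0, 1, 1]
w = feature_weight(aux, labels, n_bins=2)
weighted_basic = [w * x for x in [0.3, 0.7, 0.2, 0.9]]  # basic attribute times weight
final_label = weighted_vote(["M", "F", "F"], [0.9, 0.4, 0.4])
```

In this sketch, an auxiliary attribute whose binned distribution differs strongly between the two classes receives a weight close to 1, so the basic attribute it scales contributes more to training. In the final vote, a base classifier with higher out-of-bag accuracy counts for more.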
- 【Source】 Journal of Sichuan University (Natural Science Edition) [四川大学学报(自然科学版)], 2017, Issue 06
- 【Classification No.】 TP301.6