A Learning-to-Rank Algorithm Combined with Crowdsourcing
【Abstract】 Labeled training data for supervised learning to rank is hard to obtain at scale. This paper introduces crowdsourcing, an emerging model that pools the effort of a large online crowd, to carry out the labeling work, offering a new approach to the time-consuming and labor-intensive task of annotating large training sets. We first describe the crowdsourced labeling method and propose two personal classifier models for quality control of crowdsourced results, taking into account the two factors that affect crowdsourcing quality: annotator ability and task difficulty. We then train a ranking model with Ranking SVM on the resulting training set and measure its performance on the Microsoft OHSUMED dataset under the NDCG@n criterion. Experiments show that the crowdsourced labeling method achieves over 95% accuracy and that the resulting ranking model performs roughly on par with the Ranking SVM algorithm, which supports the feasibility and advantages of applying crowdsourcing to learning to rank.
【Key words】 Learning to rank; Crowdsourcing; Crowdsourcing quality control; Ranking SVM
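The abstract reports ranking quality under the NDCG@n criterion without restating its definition. The sketch below illustrates one common formulation, assuming a gain of 2^rel - 1 and a log2 rank discount; the exact variant used in the paper is not given in this record, and the function names and sample labels are hypothetical.

```python
import math


def dcg_at_n(relevances, n):
    """Discounted cumulative gain over the first n items of a ranked list.

    Assumes gain 2**rel - 1 and discount log2(rank + 1), a common convention;
    the paper's exact variant is not stated in this record.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:n], start=1)
    )


def ndcg_at_n(relevances, n):
    """NDCG@n: DCG of the given order divided by DCG of the ideal order."""
    ideal = dcg_at_n(sorted(relevances, reverse=True), n)
    if ideal == 0:
        # No relevant documents at all: define the score as 0.
        return 0.0
    return dcg_at_n(relevances, n) / ideal


# Hypothetical graded labels (0, 1, 2) in the order a model ranked the documents.
ranked_labels = [2, 0, 1, 2, 0]
print(ndcg_at_n(ranked_labels, 3))
```

A perfectly ordered list scores 1.0, and the score drops as highly relevant documents appear lower in the ranking.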
- 【Source】 Computer Applications and Software (计算机应用与软件), 2017, Issue 06
- 【Classification code】 TP181