
Research on the Identification of University Library's Frequently Utilized Reference Books Based on Machine Learning Methods


【Authors】 Xu Mengyu; Cheng Weihua; Zhang Jilong

【Corresponding author】 Zhang Jilong

【Institution】 Fudan University Library

【Abstract】 Providing and supporting reference books is one of the key ways in which university libraries support teaching. Identifying the frequently utilized reference books within the collection is the foundation and prerequisite for deeper reference book support and targeted services. Drawing on machine learning methods and massive dynamic data on user behavior, this paper first selects seven core feature dimensions across three aspects (borrower groups, borrowing time, and utilization rate) to model frequently utilized reference books. It then runs comparative experiments with four mainstream, representative machine learning algorithms: Support Vector Machine, Decision Tree, Random Forest, and XGBoost. XGBoost performs best, with precision, recall, and F1-score of 0.849, 0.906, and 0.876, respectively, which is a good identification result.
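The reported metrics can be sanity-checked against one another. The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall (the abstract does not state the formula used); the function name `f1_from_pr` is illustrative and not taken from the paper.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Return F1 as the harmonic mean of precision and recall.

    Assumption: the standard F1 definition; the abstract does not
    spell out how the score was computed.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are 0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for XGBoost.
f1 = f1_from_pr(0.849, 0.906)
print(round(f1, 3))
```

Recomputing from the rounded inputs gives roughly 0.877, which is consistent with the reported 0.876 if the published precision and recall are themselves rounded values.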

  • 【Source】 Library Journal (图书馆杂志), 2020, Issue 11
  • 【Classification codes】 G258.6; G252
  • 【Citation count】 4
  • 【Download count】 424