基于深度学习的自然资源政策文本分类研究
Research on classification of natural resources policy text based on deep learning
【Abstract】 Policy text classification is an integrated technology that draws on natural language processing (NLP), machine learning, and policy analysis, with important applications in policy management, research, and information services. First, to address the scarcity of public resources for policy text, a semi-automatic method that combines domain knowledge with NLP to construct policy text classification datasets is proposed, and a sentence-level natural resources policy text classification dataset is constructed. Second, exploiting the characteristics of policy texts themselves, a deep learning-based policy text classification method with adaptive enhancement of title information is proposed and applied as an extension to existing mainstream deep learning models. Finally, experiments on the natural resources policy text classification dataset show that, with this method applied, the accuracy of five commonly used deep learning classification models improves by more than 3% and the macro-averaged F1 score improves by more than 5%.
【Key words】 policy text; text classification; deep learning; natural resources; delay decision; dataset construction
- 【Source】 高技术通讯 (Chinese High Technology Letters), 2023, Issue 07
- 【Classification number】 P96; TP391.1