
Text classification model of rare earth patents based on ERNIE-CAB-CNN


【Authors】 Liao Liefa (廖列法); Shi Lijiao (石利娇)

【Affiliation】 School of Information Engineering, Jiangxi University of Science and Technology

【Abstract】 Rare earth patent texts are highly specialized, and existing text classification methods fall short on them. Motivated by the wide use and good results of category attention in computer vision, this paper proposes a Category Attention Block (CAB) for text classification and combines it with the pre-trained model ERNIE and a Convolutional Neural Network (CNN) to build ERNIE-CAB-CNN, a model for rare earth patent text classification. The model first uses ERNIE to vectorize the patent text, yielding vector representations with richer semantic information. CAB then assigns higher weights to the features that are important for each category, so that the model can distinguish the features of different categories more accurately. Finally, a CNN extracts further key local features from the text, and the resulting text vector representation is used for classification. The dataset was built from rare earth patent data retrieved and downloaded from the official website of the Patsnap patent database. Experimental results show that ERNIE-CAB-CNN achieves an accuracy, precision, and F1 score of 82.68%, 83.2%, and 82.06%, respectively, on the test set, which the authors report as a good classification result.
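The three stages named in the abstract (vectorize, reweight by category, extract local features with a CNN) can be illustrated with a toy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the internals of CAB, so the attention rule below (one query vector per category, softmax over tokens, token weight taken as its maximum score under any category) is an assumption, random vectors stand in for ERNIE outputs, and all function names and sizes are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def category_attention(tokens, queries):
    # ASSUMPTION: hypothetical stand-in for CAB, not the paper's definition.
    # tokens: T vectors of size D; queries: one vector of size D per category.
    scale = math.sqrt(len(tokens[0]))
    per_category = [softmax([dot(t, q) / scale for t in tokens]) for q in queries]
    # A token's weight is its highest attention score under any category.
    weights = [max(row[i] for row in per_category) for i in range(len(tokens))]
    reweighted = [[w * x for x in t] for t, w in zip(tokens, weights)]
    return reweighted, weights

def conv_max_pool(tokens, filters, width):
    # Each filter spans `width` consecutive tokens; ReLU, then max over positions.
    windows = [sum(tokens[i:i + width], []) for i in range(len(tokens) - width + 1)]
    return [max(max(0.0, dot(w, f)) for w in windows) for f in filters]

def classify(features, class_weights):
    # Linear layer followed by softmax over categories.
    return softmax([dot(features, w) for w in class_weights])

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
T, D, C, K, WIDTH = 6, 4, 3, 5, 2          # toy sizes, chosen for illustration
tokens = rand_matrix(rng, T, D)            # placeholder for ERNIE token vectors
queries = rand_matrix(rng, C, D)           # untrained category queries
filters = rand_matrix(rng, K, WIDTH * D)   # untrained convolution filters
class_weights = rand_matrix(rng, C, K)     # untrained classifier weights

reweighted, weights = category_attention(tokens, queries)
features = conv_max_pool(reweighted, filters, WIDTH)
probs = classify(features, class_weights)
predicted = probs.index(max(probs))
```

With untrained random parameters the predicted category is meaningless; the sketch only shows how data would flow through the three stages and what shape each intermediate result has.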

  • 【Source】 Application of Electronic Technique (电子技术应用), 2025, Issue 01
  • 【Classification code】 TP391.1; TP18; G255.53
  • 【Downloads】 13