Named Entity Recognition Algorithm Based on Mixed Transfer Learning (基于混合式迁移学习的命名实体识别算法)


【Authors】 Yu Xiaosheng; Zhang Hehuan; Chen Peng

【Affiliation】 College of Computer and Information, China Three Gorges University

【Abstract】 Large amounts of labeled data are difficult to obtain for named entity recognition. To address this problem, this paper proposes MT-NER, a named entity recognition algorithm based on mixed transfer learning. The distance between samples is used as the measure of sample similarity, and instance-based transfer is carried out to expand the target-domain samples. Model-based transfer is then used to build a new named entity recognition network structure with fine-tuning, which is trained on the expanded target-domain data set. Experiments in the medical domain show that MT-NER achieves the best entity recognition performance on small-sample data, with a precision of 93.31%, a recall of 89.5%, and an F1 value of 0.9317. Compared with the BiLSTM-CRF model, these are improvements of 6.33 percentage points, 3.65 percentage points, and 0.0891, respectively.
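The instance-transfer step described in the abstract (ranking source-domain samples by distance and adding the closest ones to the target set) could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the distance metric, the feature representation, or the selection rule, so the Euclidean distance to the target centroid, the `(feature_vector, label)` sample layout, the parameter `k`, and all function names here are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_source_samples(source, target, k):
    """Return the k source samples closest to the target-domain centroid.

    Assumption: each sample is (feature_vector, label) and similarity is
    measured by Euclidean distance to the target centroid. The paper's
    actual metric and selection rule are not stated in the abstract.
    """
    center = centroid([vec for vec, _ in target])
    ranked = sorted(source, key=lambda s: math.dist(s[0], center))
    return ranked[:k]


def expand_target(source, target, k):
    """Target set augmented with the k most similar source samples."""
    return list(target) + select_source_samples(source, target, k)


# Toy data: two target samples and three candidate source samples.
target = [([1.0, 1.0], "disease"), ([1.2, 0.8], "drug")]
source = [
    ([0.9, 1.1], "disease"),
    ([5.0, 5.0], "location"),  # far from the target domain
    ([1.3, 0.9], "drug"),
]
expanded = expand_target(source, target, k=2)
```

In this toy run the distant "location" sample is left out, and the expanded set would then serve as training data for the fine-tuned network in the model-transfer step.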

【Funding】 National Key Research and Development Program of China (2016YFC0802500)
  • 【Source】 Computer Applications and Software (计算机应用与软件), 2024, Issue 08
  • 【Classification No.】 TP391.1
  • 【Downloads】 7