Scene text detection based on adaptive feature fusion


【Author】 Ma Yishu; Yu Yanmei; Tao Qingchuan

【Corresponding Author】 Yu Yanmei

【Affiliation】 College of Electronics and Information Engineering, Sichuan University

【Abstract】 Natural scene text detection based on deep learning has developed rapidly, and segmentation-based detectors have drawn particular attention for their strong performance on multi-oriented and curved text. To make full use of both high-level semantic features and low-level fine-grained features, most segmentation-based methods adopt a ResNet + Feature Pyramid Network (FPN) backbone for feature extraction and fuse the pyramid features by concatenation or element-wise addition. However, the scale inconsistency among FPN levels can cause the fused features to conflict with one another, which degrades the subsequent segmentation. Therefore, building on the fast and efficient DBNet, this paper improves its feature fusion stage and proposes a scene text detection network based on adaptive feature fusion. Experiments on the public ICDAR 2015 and ICDAR 2017-MLT datasets show that, compared with the original DBNet, the improved network achieves higher precision, recall, and F-score, at the cost of only a slight drop in FPS.
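
The record does not include code, so the following is only a minimal sketch of what an adaptive feature fusion module of this kind could look like, assuming an ASFF-style design in which a per-pixel weight map is predicted for each FPN level and softmax-normalized across levels before a weighted sum. The module name AdaptiveFeatureFusion, the 256-channel width, and the four-level input follow DBNet's usual neck configuration but are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch, not the paper's code: ASFF-style adaptive fusion of FPN
# levels as a replacement for plain concat/add fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveFeatureFusion(nn.Module):
    """Fuse same-channel FPN feature maps with learned, softmax-normalized weights."""

    def __init__(self, channels: int, num_levels: int = 4):
        super().__init__()
        # One 1x1 conv per pyramid level predicts a single-channel weight map.
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_levels)]
        )

    def forward(self, feats):
        # feats: list of tensors [B, C, Hi, Wi], ordered from the largest map.
        target_size = feats[0].shape[2:]
        resized = [
            f if f.shape[2:] == target_size
            else F.interpolate(f, size=target_size, mode="bilinear", align_corners=False)
            for f in feats
        ]
        # Predict a weight map per level and softmax across levels so that the
        # fusion weights at every spatial position sum to 1.
        weights = torch.softmax(
            torch.cat([conv(f) for conv, f in zip(self.weight_convs, resized)], dim=1),
            dim=1,
        )
        # Weighted sum replaces plain element-wise add or channel concat.
        return sum(weights[:, i : i + 1] * resized[i] for i in range(len(resized)))


if __name__ == "__main__":
    # Four FPN levels already reduced to 256 channels, as in DBNet's neck.
    fusion = AdaptiveFeatureFusion(channels=256, num_levels=4)
    feats = [torch.randn(1, 256, 160 // s, 160 // s) for s in (1, 2, 4, 8)]
    print(fusion(feats).shape)  # torch.Size([1, 256, 160, 160])
```

Compared with a fixed add or concat, the learned per-pixel weights let the network down-weight a pyramid level whose response conflicts with the others at a given location, which is the FPN scale-inconsistency issue the abstract describes.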

  • 【Source】 Modern Computer (现代计算机), 2023, Issue 01
  • 【Classification Number】 TP391.41
  • 【Downloads】 42