A Study on a Feature Enhancement Algorithm Based on the Transformer Model and Its Application
【Abstract】 The Transformer model performs well on automatic speech recognition (ASR), but its feature extraction has two weaknesses. First, it concentrates on extracting global feature interactions and overlooks other useful information, such as local feature interactions. Second, it makes insufficient use of low-level feature interactions. To address these two issues, we propose a Convolutional Linear Mapping (CMLP) module that strengthens local feature interactions and a Low-level Feature Fusion (LF) module that fuses high-level and low-level features. Integrating the two modules yields the CLformer model. Experiments on two Mandarin Chinese datasets, Aishell-1 and HKUST, show that CLformer improves on the baseline by 0.3% on Aishell-1 and by 0.5% on HKUST, which supports the effectiveness of the proposed optimization strategy.
【Keywords】 Transformer model; automatic speech recognition; feature fusion; local feature; global feature
- 【Source】 Journal of Foshan University (Natural Science Edition) (佛山科学技术学院学报(自然科学版)), 2024, Issue 03
- 【Classification No.】 TP18; TN912.34