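神经网络分类算法驱动的口译教学语料难度分级系统 (A Difficulty Grading System for Interpreter Training Materials Driven by Neural Network Classification Algorithms)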
A Neural Networks-based System for Automatic Determination of Speech Difficulty Level in Interpreter Training
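【Abstract, translated from the Chinese】 Designing teaching materials is an important part of interpreter training. Researchers have examined the state of interpreting textbook development and the factors that affect the difficulty of training materials, but the variables that influence difficulty are numerous and interact in complex ways, so difficulty grading still relies mainly on expert experience and lacks a unified, reliable standard. From a machine learning perspective, grading the difficulty of teaching materials can be treated as a classification problem. This paper builds a neural network system on the RoBERTa pre-trained model: 286 speech segments were manually annotated, and the dataset was then expanded with data augmentation and knowledge distillation, yielding a machine learning-based system that grades the difficulty of teaching materials automatically. Given a source-language passage as input, the system outputs its difficulty level, helping teachers and learners carry out classroom teaching and self-study more effectively.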
【Abstract】 The difficulty level of speeches used to train interpreters depends on a myriad of interacting factors. Although the established practice so far has been to rely on expert trainers' opinions to determine the level, the task is essentially a pattern-recognition problem that machine learning can solve. In this paper, we propose a neural network architecture for automatically determining the level of speech difficulty in this context. Based on the RoBERTa pre-trained model, our system was trained on 286 tagged data points and additional training sets generated through knowledge distillation and data augmentation. The system promises to help trainers and students of interpreting select suitable training materials for classroom activities and after-class practice.
【Key words】 textbooks; pedagogical corpus; speech difficulty level; pattern recognition algorithm; RoBERTa; data augmentation
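
The abstract names knowledge distillation as one way the training set was extended but does not say how it was applied. As a rough illustration only, the sketch below shows the generic soft-label distillation objective, in which a student classifier is trained to match a teacher's temperature-softened class probabilities. The three difficulty levels, the logits, and the `temperature` and `alpha` parameters are all assumptions made for this example; none of them come from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw class scores into probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, hard_label=None,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (hypothetical setup, not the paper's).

    The soft term is the KL divergence from the teacher's softened
    distribution to the student's, scaled by temperature squared. When a
    human-annotated label is available, it is mixed with ordinary
    cross-entropy using the weight alpha.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft_term = (temperature ** 2) * kl
    if hard_label is None:
        # Unlabeled (for example, augmented) text: only the teacher signal is used.
        return soft_term
    cross_entropy = -math.log(softmax(student_logits)[hard_label])
    return alpha * soft_term + (1 - alpha) * cross_entropy


def predict_level(logits, levels=("easy", "medium", "hard")):
    """Map class scores to a difficulty label; the three labels are assumed."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return levels[best]
```

In a setup like this, a teacher model fine-tuned on the annotated segments could score unlabeled or augmented passages, and the student would then be trained on those soft scores. Whether the authors' system works this way is not stated in the abstract.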
- 【Source】 中国翻译 (Chinese Translators Journal), 2023, Issue 03
- 【Classification number】 H059-4; G643