Automatic Phonetic Segmentation Using HMM Model
【Abstract】 Forced alignment based on hidden Markov models (HMMs) is used to segment speech-unit boundaries for text-to-speech (TTS) systems. To improve segmentation accuracy, this paper optimizes the feature selection, model parameters, and model clustering of the HMMs. Experimental results show that the static 12-dimensional Mel-frequency cepstral coefficient (MFCC) feature is the optimal acoustic feature; that the optimal number of Gaussian mixture components per state is 1 (a single Gaussian); and that, for speaker-dependent triphone HMMs, the optimal number of tied states after clustering with a classification and regression tree (CART) is about 3,000. In phoneme boundary segmentation experiments on an English speech corpus, segmentation accuracy increased from 77.3% before optimization to 85.4%.
【Key words】 acoustic unit boundary; automatic segmentation; HMM; text-to-speech system
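The forced-alignment idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `log_gauss` and `forced_align` are hypothetical, transition probabilities are ignored, and the toy example uses 1-dimensional features in place of 12-dimensional MFCCs. It assumes a known left-to-right state sequence (from the transcription) with one diagonal-covariance Gaussian per state, and returns the frame at which each state begins.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def forced_align(frames, states):
    """Viterbi alignment of frames to a fixed left-to-right state sequence.

    frames: list of feature vectors, one per analysis frame.
    states: list of (mean, var) pairs, one single Gaussian per state, in the
            order given by the known transcription.
    Returns the start frame index of each state.
    Assumption: transition probabilities are ignored to keep the sketch short.
    """
    T, S = len(frames), len(states)
    if T < S:
        raise ValueError("need at least one frame per state")
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_gauss(frames[0], *states[0])
    for t in range(1, T):
        # State s cannot be reached before frame s in a left-to-right model.
        for s in range(min(t + 1, S)):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + log_gauss(frames[t], *states[s])
    # Trace back from the last state at the last frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [path.index(s) for s in range(S)]

# Toy example: three states with well-separated means (made-up data).
frames = [[0.1], [0.0], [-0.1], [5.1], [4.9], [5.0], [9.8], [10.1]]
states = [([0.0], [1.0]), ([5.0], [1.0]), ([10.0], [1.0])]
print(forced_align(frames, states))
```

Multiplying each returned start index by the frame shift (an analysis parameter not stated in the abstract) converts the alignment into boundary times, which can then be compared against hand-labeled boundaries to measure segmentation accuracy.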
- 【Source】 数据采集与处理 (Journal of Data Acquisition & Processing), 2005, Issue 04
- 【Classification No.】 TN912.34
- 【Times cited】 17
- 【Downloads】 306