Automatic Segmentation for TTS Units (TTS语音单元边界的自动切分)


【Authors】 WANG Li-juan (王丽娟), CAO Zhi-gang (曹志刚)

【Affiliation】 State Key Laboratory of Microwave and Digital Communications, Department of Electronic Engineering, Tsinghua University, Beijing 100084

【Abstract】 Accurate segmentation of unit boundaries, though laborious when done by hand, is crucial to the performance of a concatenation-based TTS system. This paper proposes a two-step procedure for automatic unit segmentation. The first step coarsely segments the speech data, obtaining initial boundaries by HMM-based forced alignment; the second step refines those boundaries. A new Context-Dependent Boundary Model (CDBM), conditioned on the phonemes preceding and following a boundary, is proposed to describe the evolution across the segment boundary. To cope with limited training data and reduce the amount of manual segmentation required, a Classification and Regression Tree (CART) is used to cluster boundaries whose neighboring phonemes are acoustically similar; tied CDBMs are then trained on these clusters and used for boundary refinement in the second step. The number of boundary models can thus be adjusted dynamically to the amount of training data available, which keeps model training reliable. A series of experiments determined the optimal CDBM parameters and training conditions. In Mandarin syllable segmentation, with about 1,000 manually segmented sentences as CDBM training data, segmentation accuracy rose from 78.7% to 91.5%.

  • 【Source】 Microelectronics & Computer (微电子学与计算机), 2005, No. 12
  • 【Classification No.】 TN912.3
  • 【Citations】 6
  • 【Downloads】 184