
150 b/s speech compression coding algorithm based on SELP


【Author】 CHANG Liang; XU Jingde; CUI Huijuan; TANG Kun

【Affiliation】 Department of Electronic Engineering, Tsinghua University; Tsinghua National Laboratory for Information Science and Technology

【Abstract】 In ultra-low bit-rate speech coding, the bit budget is so limited that quantization accuracy is severely degraded. This paper presents a coding strategy that addresses the problem by reducing the amount of content that is quantized and transmitted, so that the important content can be quantized more accurately. First, the original speech is passed through a low-pass filter that removes the least important 3 to 4 kHz band, and the sampling rate is correspondingly reduced from 8 kHz to 6 kHz, with the number of samples per frame kept unchanged. The number of frames that are jointly quantized for each parameter is thereby reduced to 3/4 of the original, which improves quantization accuracy at the same bit rate. In addition, because the speech spectrum now spans 0 to 3 kHz instead of 0 to 4 kHz, the prediction order of the linear prediction coefficients (LPC) can be lowered from 10 to 8; the reduced parameter dimension further improves quantization accuracy. Within this framework, and combined with a decoder-side recovery algorithm for band-pass voicing (BPVC), a high-quality ultra-low bit-rate 150 b/s speech coding algorithm is implemented. Compared with two existing 150 b/s algorithms, the objective mean opinion score (MOS) increases by 0.051 and 0.067, respectively, and the spectral distortion of the LPC parameters decreases by 0.09 and 0.16, respectively, indicating improved quality and intelligibility of the synthesized speech.
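The frame arithmetic behind the abstract can be checked with a short Python sketch. It is only an illustration, not the authors' implementation: the frame length `FRAME_SAMPLES = 160` and the helper `frame_budget` are hypothetical, since the abstract does not state the frame size or how bits are allocated. Only the 150 b/s rate, the 8 kHz and 6 kHz sampling rates, and the LPC orders 10 and 8 come from the abstract.

```python
from fractions import Fraction

BIT_RATE = 150        # b/s, from the abstract
FRAME_SAMPLES = 160   # hypothetical frame length; not stated in the abstract


def frame_budget(sample_rate_hz, lpc_order):
    """Per-frame figures for a fixed-length frame at the given sampling rate."""
    frames_per_second = Fraction(sample_rate_hz, FRAME_SAMPLES)
    return {
        "frame_ms": Fraction(1000 * FRAME_SAMPLES, sample_rate_hz),
        "frames_per_second": frames_per_second,
        # Average bit budget per frame if all bits were spread evenly.
        "bits_per_frame": Fraction(BIT_RATE) / frames_per_second,
        # LPC values that must be represented per second of speech.
        "lpc_values_per_second": lpc_order * frames_per_second,
    }


baseline = frame_budget(8000, lpc_order=10)  # 0 to 4 kHz band, order-10 LPC
proposed = frame_budget(6000, lpc_order=8)   # 0 to 3 kHz band, order-8 LPC

# Ratios of proposed to baseline; these do not depend on FRAME_SAMPLES.
frame_ratio = proposed["frames_per_second"] / baseline["frames_per_second"]
lpc_ratio = proposed["lpc_values_per_second"] / baseline["lpc_values_per_second"]

for name, budget in (("baseline", baseline), ("proposed", proposed)):
    print(name, {key: float(value) for key, value in budget.items()})
print("frame ratio:", frame_ratio, "| LPC value ratio:", lpc_ratio)
```

Whatever frame length is assumed, the frame count falls to 3/4 of the baseline, and the number of LPC values per second falls to 3/4 × 8/10 = 3/5. Under the assumed 160-sample frame, the average budget rises from 3 to 4 bits per frame at the same 150 b/s.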

【Fund】 National Natural Science Foundation of China (60572081)
  • 【Source】 Journal of Tsinghua University (Science and Technology), 2013, Issue 07
  • 【Classification No.】 TN912.32
  • 【Citations】 3
  • 【Downloads】 146