Perceptual MVDR-based cepstral coefficients for speaker recognition

【Authors】 LIANG Chunyan, ZHANG Xiang, YANG Lin, ZHANG Jianping, YAN Yonghong

【Affiliation】 Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences; Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190

【Abstract】 This paper studies the use of perceptual minimum variance distortionless response (MVDR)-based cepstral coefficients (PMCCs) as features for speaker recognition. PMCCs are extracted and modeled with Gaussian mixture models (GMMs), and joint factor analysis (JFA) is used to model and compensate for speaker and channel variability in the GMMs. Experiments are carried out on the core test conditions of the NIST 2008 Speaker Recognition Evaluation, comparing PMCCs with conventional Mel-frequency cepstral coefficients (MFCCs). The results show that systems based on PMCCs achieve performance comparable to those based on MFCCs. Moreover, linear fusion of the two kinds of systems improves significantly on the MFCC system alone: across the different test sets, the equal error rate (EER) falls by 7.6% to 30.5% relative, and the minimum detection cost function (minDCF) falls by 3.2% to 21.2% relative. These results indicate that PMCCs can be applied effectively to speaker recognition and are, to some extent, complementary to MFCCs.
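The abstract reports results in terms of linear score fusion, EER, and relative reduction. The Python sketch below illustrates how such quantities are commonly computed; it is not the authors' implementation. The function names, the equal default fusion weight, and the threshold-sweep EER approximation are all assumptions made for illustration, and the abstract does not state the fusion weights or any score normalization that was used.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scoring below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    fa = np.array([(nontarget >= t).mean() for t in thresholds])
    # Take the threshold where the two error rates are closest.
    i = np.argmin(np.abs(miss - fa))
    return 0.5 * (miss[i] + fa[i])

def fuse_linear(scores_a, scores_b, weight=0.5):
    """Weighted sum of two systems' scores for the same trials.

    The weight is a hypothetical parameter; in practice it would be tuned
    on development data, and scores are usually normalized first.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b

def relative_reduction(baseline, improved):
    """Relative reduction in percent, e.g. of EER or minDCF."""
    return 100.0 * (baseline - improved) / baseline
```

Under these assumptions, a baseline EER of 10.0% that drops to 8.0% after fusion corresponds to `relative_reduction(10.0, 8.0)`, a 20% relative reduction, which is the sense in which the ranges in the abstract are stated.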

【Funding】 National Natural Science Foundation of China (Grant Nos. 10925419, 90920302, 10874203, 60875014, 61072124, 11074275)
  • 【CLC Number】 TN912.34