Research on the Mapping of TDOA to DOA for Sound Source DOA Estimation (声源DOA估计中的TDOA-DOA映射方法研究)
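【Author】 Zhang Feng (张峰);
【Advisor】 Chen Huawei (陈华伟);
【Author information】 Nanjing University of Aeronautics and Astronautics (南京航空航天大学), Communication and Information Systems, 2014, Master's thesis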
【Abstract】 Sound source direction of arrival (DOA) estimation is a key technique in microphone array signal processing, with wide application in video conferencing, fault detection, medical diagnosis, military systems, and many other fields. Methods based on multichannel time differences of arrival (TDOAs) are an important class of sound source DOA estimators. Existing work, however, concentrates on obtaining the TDOAs and pays much less attention to the mapping from TDOAs to DOA. A TDOA-DOA mapping based on least squares support vector regression (LS-SVR) gives good DOA estimates, but it has not been studied comprehensively. This thesis studies the LS-SVR-based TDOA-DOA mapping in depth with respect to kernel function selection, the construction of multi-kernel LS-SVR, and sparsification. It also proposes a tuning-parameter-free TDOA-DOA mapping method based on sparse representation theory. The main contributions are:
1) Because different kernel functions differ in mapping performance, the thesis evaluates LS-SVR built with three common kernels (radial basis, polynomial, and linear) for sound source DOA estimation in reverberant and noisy environments, and compares them with the least squares mapping. The results show that the radial basis kernel gives better estimation performance.
2) To handle the outliers that appear in the estimated time delays under strong reverberation, the thesis proposes a median-filtering-based TDOA processing method, designed around the characteristics of the TDOA-DOA mapping, to remove them. The results show that this method effectively improves DOA mapping performance in strongly reverberant environments.
3) To further improve DOA estimation performance, the thesis combines multi-kernel learning with the K-means clustering algorithm and proposes a clustering-based multi-kernel LS-SVR mapping method. Simulation results show that multi-kernel LS-SVR outperforms both single-kernel LS-SVR and the least squares method. In general, the more kernels are used, the better it performs, and the advantage of multiple kernels grows as the reverberation time increases.
4) To address redundancy in the training set of the LS-SVR mapping, the thesis applies a sparsification method that prunes the samples with the smallest support values to sound source DOA estimation, and analyzes the sparsification of both single-kernel and multi-kernel LS-SVR mappings. The results show that, compared with basic LS-SVR, sparse LS-SVR maintains good DOA estimation performance while effectively reducing the computational cost at test time.
5) The thesis proposes a tuning-parameter-free TDOA-DOA mapping method based on sparse representation theory. To further reduce computation, it applies a two-step grid search to match the TDOA vector against the data dictionary. The results show that the proposed algorithm has some performance advantage over conventional tuning-parameter-free mapping methods.
【Key words】 microphone array; sound source DOA estimation; time delay estimation; LS-SVR; multi-kernel learning; sparse representation
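As an illustration of the LS-SVR-based TDOA-DOA mapping described in the abstract, the following is a minimal sketch and not the thesis's implementation. It uses the standard LS-SVR formulation, in which training reduces to one linear system, with a radial basis kernel. Everything specific here is an assumption made for the example: a far-field source, a uniform linear array of four microphones spaced 0.1 m apart, a sound speed of 343 m/s, noise-free TDOAs expressed in milliseconds, the values of `gamma` and `sigma`, and all function names.

```python
import math

SPACING_M = 0.1        # assumed microphone spacing
SOUND_SPEED = 343.0    # assumed speed of sound in m/s
NUM_MICS = 4           # assumed array size (3 TDOAs relative to mic 0)

def tdoa_vector(angle_deg):
    """Ideal far-field TDOAs (in ms) of mics 1..3 relative to mic 0."""
    unit_ms = 1000.0 * SPACING_M / SOUND_SPEED
    c = math.cos(math.radians(angle_deg))
    return [m * unit_ms * c for m in range(1, NUM_MICS)]

def rbf_kernel(u, v, sigma):
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvr_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma   # regularization on the diagonal of K
        A.append(row)
    sol = solve_linear(A, [0.0] + list(y))
    return sol[0], sol[1:]          # bias b, support values alpha

def lssvr_predict(x, X, b, alpha, sigma):
    return b + sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X))

# Train on a 5-degree grid of known directions, then map an unseen TDOA vector.
train_angles = [float(a) for a in range(30, 151, 5)]
train_tdoas = [tdoa_vector(a) for a in train_angles]
b, alpha = lssvr_fit(train_tdoas, train_angles, gamma=1e4, sigma=0.3)
est = lssvr_predict(tdoa_vector(72.0), train_tdoas, b, alpha, sigma=0.3)
print(f"estimated DOA for a source at 72 degrees: {est:.2f}")
```

In this formulation every training sample receives a nonzero support value in `alpha`, so prediction cost grows with the size of the training set. That is the redundancy the pruning-based sparsification in contribution 4 is meant to reduce.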