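Research and Implementation of Sound Source Localization Algorithms for Indoor Microphone Arrays
【Author】 Zhou Feng (周峰);
【Advisor】 Chen Xiong (陈雄);
【Author information】 Fudan University, Circuits and Systems, 2009, Master's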
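【Abstract (translated from the Chinese)】 With the further development of multimedia technology, the role of speech in receiving and processing information has attracted wide attention. Applications such as speech recognition, speech enhancement, and target sound source localization are growing rapidly, and sound source localization is the prerequisite and foundation for speech enhancement and speech recognition. Microphone-array-based sound source localization has drawn wide attention because of its broad application prospects. This thesis studies and implements a microphone-array-based sound source localization system for indoor environments. Indoors, room reverberation and background noise severely limit the performance of a microphone array; moreover, under limited hardware resources, some localization algorithms are too computationally expensive to run in real time. We propose solutions to both problems, build a system in a real environment, collect data, and verify the effectiveness of the proposed algorithms. Because preprocessing and voice activity detection matter for the later stages of a localization system, we first introduce the filtering and windowing preprocessing steps and explain two simple but effective voice activity detection algorithms: the energy method and the zero-crossing-rate method. The two-step GCC-LMS localization method based on time difference of arrival (TDOA) is widely used in practical systems because of its low computational cost. The time delay estimate from the first step directly determines localization performance. In practice we found that synchronous noise from the acquisition card produces a spurious peak at zero lag, which leads to wrong delay estimates; we solve this by filtering the cross-power spectrum. To weaken the influence of reverberation and noise on delay estimation as far as possible, we take two measures: 1) shrinking the search space by setting the search interval according to the microphone spacing, and 2) dynamically adjusting the weighting function according to the signal-to-noise ratio. Simulation results show that these measures improve the accuracy of delay estimation in highly noisy and reverberant environments. In addition, for low sampling rates we propose interpolating the cross-correlation function, which improves the spatial resolution of the delay estimate. For the second step of TDOA, the least squares method (abbreviated LMS in this thesis) is currently in wide use. To make the system more reliable and robust, we propose using each microphone in turn as the reference microphone, discarding the position estimates with large errors, and averaging the rest, which improves the robustness of the position estimate. We also introduce a plane-geometry method based on hyperbolic localization and briefly compare it with least squares (LMS). Another widely used localization method is the steered beamforming method (SRP). Compared with the two-step TDOA method, one-step SRP postpones the decision stage and combines the information from all microphones, so it is more resistant to reverberation and noise; the price is a heavy computational load that makes real-time processing difficult. SRP based on stochastic region contraction (SRP-SRC) avoids a global spatial search and greatly reduces computation. In this thesis we improve SRP-SRC and call the result SRP-RSRC: 1) we introduce a shaping function that raises the contrast between the energy peak and its surroundings, and 2) we set an energy threshold and select the maximum-energy point from the region whose energy exceeds that threshold, so that SRP-RSRC converges faster and needs less computation. We further combine Kalman filtering and prediction with SRP-RSRC to make the system's tracking more stable. Finally, we describe the basic structure, hardware, and software of the system we implemented in an indoor environment, and use data collected in practice to compare the two-step TDOA-based GCC-LMS method with SRP-SRC and SRP-RSRC.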
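The two voice activity detection cues named in the abstract, frame energy and zero-crossing rate, can be illustrated with a minimal Python sketch. The function names, the thresholds (`energy_high`, `energy_low`, `zcr_min`), and the way the two cues are combined are hypothetical choices made for this illustration; the abstract does not state the thesis's actual decision rule.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) of one frame of samples.

    The frame must contain at least two samples.
    """
    energy = sum(s * s for s in frame) / len(frame)
    # Count sign changes between neighbouring samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy, zcr


def is_speech(frame, energy_high=1e-2, energy_low=1e-3, zcr_min=0.3):
    """Hypothetical decision rule (an assumption, not the thesis's rule).

    A frame counts as speech if its energy is clearly high, or if its
    energy is moderate and its zero-crossing rate is high, which is how
    weak unvoiced sounds often look.
    """
    energy, zcr = frame_features(frame)
    if energy >= energy_high:
        return True
    return energy >= energy_low and zcr >= zcr_min
```

In practice the thresholds would be tuned to the recording level and background noise of the room, for example from frames known to contain no speech.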
【Abstract】 With the development of multimedia technology, the importance of speech in receiving and processing information has attracted wide attention. Applications such as speech recognition, speech enhancement, and acoustic source localization are still developing, and acoustic source localization is the basis and prerequisite for the others. Microphone-array-based acoustic source localization has received growing attention because of its broad range of potential applications. This thesis is devoted to the research and implementation of a microphone-array-based acoustic source localization system for room environments. Room reverberation and environmental noise seriously degrade the performance of a microphone array. In addition, under hardware constraints, some localization algorithms are too computationally burdensome for real-time use. To address these two problems, we put forward our own solutions and built a real acoustic source localization system, with which we collected speech data and validated the proposed algorithms. Given the importance of preprocessing and voice activity detection to the later stages of a localization system, we first introduce filtering and windowing and explain two simple but effective voice activity detection algorithms: the energy detector and the zero-crossing rate. The two-step TDOA-based method, GCC-LMS, is widely used in practical systems because of its low computational cost. The time delay estimate from the first step determines localization performance. We found that synchronous noise causes a false peak at zero lag, which leads to wrong time delay estimates; to solve this problem, we filter the cross spectrum. To suppress the effects of reverberation and noise, we take the following measures: 1) restrict the search region according to the distance between microphones, and 2) adjust the weighting factor dynamically according to the signal-to-noise ratio. Simulation results show that these measures improve the accuracy of time delay estimation in environments with strong noise and reverberation. In addition, we propose interpolating the cross-correlation function at low sampling rates, which raises the spatial resolution. In the second step of TDOA, least squares (abbreviated LMS in this thesis) is widely used. To make our system more reliable and robust, we use each microphone in turn as the reference microphone, which yields several source location estimates; discarding the estimates with large errors and averaging the remaining ones gives a more reasonable and trustworthy source location. We also briefly introduce a two-dimensional geometric localization method and compare it with LMS. Another popular method is SRP (steered response power, a steered beamforming approach), a one-step localization method that postpones the decision stage and integrates the information from all microphones, which makes it more resistant to noise and reverberation. Its drawback is a larger computational cost, which makes it hardly suitable for real-time processing. Steered beamforming based on stochastic region contraction (SRP-SRC) avoids a global search and greatly reduces the computational cost. In this thesis we put forward a new method, SRP-RSRC, that improves SRP-SRC in two ways: 1) a shaping function is introduced to increase the contrast of the energy peak, and 2) an energy threshold is set, and the maximum-energy point is searched for only in the volumes whose energy exceeds the threshold, so that SRP-RSRC converges more quickly and is more computationally efficient. We also combine Kalman filtering and prediction with SRP-RSRC, which makes tracking more robust and stable. Finally, we describe our real acoustic localization system for room environments, including its architecture, software, and hardware. Speech data acquired with this system are used to compare GCC-LMS with SRP-SRC and SRP-RSRC.
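Two of the delay-estimation measures described above, limiting the lag search to what the microphone spacing physically allows and interpolating the cross-correlation for sub-sample resolution, can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses plain unweighted time-domain cross-correlation rather than the weighted GCC of the thesis, it uses three-point parabolic interpolation because the abstract does not say which interpolation was used, and the function names and the 343 m/s speed of sound are choices made here.

```python
import math


def cross_correlation(x, y, lag):
    """Sum of x[n] * y[n + lag] over the samples where both are defined."""
    start = max(0, -lag)
    stop = min(len(x), len(y) - lag)
    return sum(x[i] * y[i + lag] for i in range(start, stop))


def estimate_delay(x, y, fs, mic_distance, speed_of_sound=343.0):
    """Estimate, in seconds, how much y lags x (positive: y arrives later).

    Assumption: plain cross-correlation stands in for the weighted GCC.
    """
    # The delay cannot exceed the acoustic travel time between the two
    # microphones, so only lags inside that window are searched.
    max_lag = int(math.ceil(mic_distance * fs / speed_of_sound))
    lags = list(range(-max_lag, max_lag + 1))
    values = [cross_correlation(x, y, lag) for lag in lags]
    k = max(range(len(values)), key=values.__getitem__)

    # Parabolic interpolation through the peak and its two neighbours
    # refines the estimate to a fraction of a sample.
    offset = 0.0
    if 0 < k < len(values) - 1:
        left, mid, right = values[k - 1], values[k], values[k + 1]
        denom = left - 2.0 * mid + right
        if denom != 0.0:
            offset = 0.5 * (left - right) / denom
    return (lags[k] + offset) / fs
```

For example, with a hypothetical 0.3 m spacing and an 8 kHz sampling rate, the search covers only lags from -7 to 7 samples, so a spurious correlation peak outside that window cannot be selected.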
【Key words】 microphone array; acoustic source localization; TDOA; steered beamforming; stochastic region contraction;