A Hierarchical Pyramid Algorithm for Speech Retrieval (多级语音检索的金字塔算法)
【Author】 YUAN Xu-hai, WANG Rang-ding (CKC Software Lab, Ningbo University, Ningbo 315211, China)
【Institution】 CKC Software Lab, Ningbo University
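【Summary】 An improved speech retrieval algorithm in the DWT (discrete wavelet transform) domain is proposed. The method exploits the multi-resolution property of the wavelet transform to query speech records at multiple levels, one for each approximation-component level of the wavelet domain. Experiments show that the algorithm finds the requested records in the constructed speech database and improves substantially on three fronts: fewer redundant records, lower computational cost, and higher precision. It has broad application prospects.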
【Abstract】 With advances in information technology, more and more digital audio, images, and video are being captured, produced, and stored. There is strong research and development interest in multimedia indexing and retrieval, so that the information stored in these media types can be used effectively and efficiently.

Human beings have an amazing ability to distinguish different types of audio. Given any audio piece, we can instantly tell its type and mood and judge its similarity to another piece of audio. A computer, however, sees a piece of audio as a sequence of sample values. At present, the most common way of accessing audio pieces is by title or file name. Because file names and text descriptions are incomplete, it can be hard to find audio pieces that satisfy the particular requirements of an application. Content-based audio retrieval techniques are needed to solve this problem.

A speech retrieval algorithm in the DWT (discrete wavelet transform) domain is presented in [6], building on [7]; it compares the wavelet coefficients themselves. That algorithm has two shortcomings. First, different speech records have different lengths, so they are difficult to compare directly. Second, the longer the speech record, the more complex the computation, so retrieval time grows quickly.

This paper presents an improved speech retrieval algorithm in the DWT domain that finds the speech records belonging to a given person in a speech database. The algorithm exploits the MRA (multi-resolution analysis) property of the wavelet transform: speech records are searched at different hierarchy levels, one for each approximation level of the DWT domain. Compared with the earlier DWT-domain speech retrieval technique, three statistical features are used in place of the wavelet coefficients, and performance improves greatly. Experimental results show that the algorithm correctly finds the speech records the user requests in the speech database, with fewer redundant records, less computation, and a higher precision ratio. The algorithm is promising for applications in speech retrieval.
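The coarse-to-fine idea described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet or the three statistical features, so the sketch assumes a Haar wavelet and uses mean, standard deviation, and mean energy of the approximation coefficients. The function names and the `tolerance` threshold are hypothetical.

```python
import math

def haar_approximations(signal, levels):
    """Return Haar approximation coefficients for levels 1..levels (assumed wavelet)."""
    approx = list(signal)
    pyramid = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:
            # Pad odd-length input by repeating the last sample (new list, no mutation).
            approx = approx + [approx[-1]]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
        pyramid.append(approx)
    return pyramid

def features(coeffs):
    """Three length-independent statistics (assumed: mean, std, mean energy)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    energy = sum(c * c for c in coeffs) / n
    return (mean, math.sqrt(var), energy)

def pyramid_features(signal, levels=3):
    """One feature triple per approximation level, finest level first."""
    return [features(a) for a in haar_approximations(signal, levels)]

def distance(f, g):
    """Euclidean distance between two feature triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))

def pyramid_search(query, database, levels=3, tolerance=0.5):
    """Filter candidates from the coarsest level down to the finest.

    Records that fail at a coarse level are dropped before any finer
    level is compared, which is where the saving in comparisons comes from.
    """
    q = pyramid_features(query, levels)
    candidates = {name: pyramid_features(sig, levels)
                  for name, sig in database.items()}
    for level in reversed(range(len(q))):  # coarsest level first
        candidates = {name: f for name, f in candidates.items()
                      if len(f) > level
                      and distance(q[level], f[level]) <= tolerance}
    return sorted(candidates)

# Two synthetic "records" of different lengths; the query matches the first.
record_a = [math.sin(0.1 * i) for i in range(64)]
record_b = [3 * math.sin(0.5 * i) + 2 for i in range(80)]
matches = pyramid_search(list(record_a), {"a": record_a, "b": record_b})
```

Because each level is reduced to a fixed-size feature triple, records of different lengths can be compared directly, and the cost of each comparison does not grow with record length. Both points correspond to the shortcomings of raw coefficient comparison noted in the abstract.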
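- 【Proceedings】 Proceedings of the 1st Joint Conference on Harmonious Human Machine Environment (HHME2005)
- 【Conference】 1st Joint Conference on Harmonious Human Machine Environment (HHME2005)
- 【Conference date】 2005-10
- 【Conference location】 Kunming, China
- 【Classification code】 TP391.42
- 【Organizers】 China Computer Federation; China Society of Image and Graphics; ACM SIGCHI China Chapter; Department of Computer Science and Technology, Tsinghua University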