A hierarchical speech retrieval pyramid algorithm (Chinese title: 多级语音检索的金字塔算法)

【Author】 YUAN Xu-hai, WANG Rang-ding (CKC Software Lab, Ningbo University, Ningbo 315211, China)

【Institution】 CKC Intelligent Software Research Institute (纵横智能软件研究所), Ningbo University

【Abstract (translated from the Chinese)】 An improved speech retrieval algorithm in the DWT (discrete wavelet transform) domain is proposed. The method exploits the multi-resolution property of the wavelet transform and queries speech records at multiple levels, using the approximation components at different levels of the wavelet domain. Experiments show that the algorithm finds the requested records in the constructed speech database, and that it improves substantially in three respects: fewer redundant records, lower computational cost, and higher precision. It has broad application prospects.

【Abstract】 With advances in information technology, more and more digital audio, images, and video are being captured, produced, and stored. There is strong research and development interest in multimedia indexing and retrieval, so that the information stored in these media types can be used effectively and efficiently.

Human beings have an amazing ability to distinguish different types of audio. Given any audio piece, we can instantly tell its type and mood, and judge its similarity to another piece. A computer, however, sees a piece of audio only as a sequence of sample values. At present, the most common way to access audio pieces is by title or file name. Because file names and text descriptions are incomplete, it may be hard to find audio pieces that satisfy the particular requirements of an application. Content-based audio retrieval techniques are required to solve this problem.

A speech retrieval algorithm in the DWT (discrete wavelet transform) domain is presented in [6], based on [7]; it compares the wavelet coefficients directly. That algorithm has two shortcomings. First, different speech records have different lengths, so they are difficult to compare. Second, the longer the speech record, the more complex the computation, so retrieval time grows quickly.

This paper presents an improved speech retrieval algorithm in the DWT domain that finds the speech records belonging to a given person in a speech database. The algorithm exploits the multi-resolution analysis (MRA) property of the wavelet transform: speech records are searched hierarchically at the different approximation levels of the DWT domain. Compared with the earlier DWT-domain speech retrieval technique, three statistical characteristics are used in place of the wavelet coefficients, and performance improves greatly. Experimental results show that the algorithm correctly finds the speech records a user requests in the speech database, with fewer redundant records, less computation, and higher precision. The algorithm is promising for applications in speech retrieval.
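The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the wavelet, the three statistical characteristics, or the matching rule, so the Haar wavelet, the (mean, standard deviation, mean energy) triple, the Euclidean distance, the single `threshold`, and all function names below are assumptions made for illustration only.

```python
import math

def haar_approx(signal):
    """One Haar DWT level: return only the approximation (low-pass) coefficients."""
    samples = list(signal)
    if len(samples) % 2:              # pad odd-length input by repeating the last sample
        samples.append(samples[-1])
    root2 = math.sqrt(2.0)
    return [(samples[i] + samples[i + 1]) / root2 for i in range(0, len(samples), 2)]

def level_features(coeffs):
    """Three summary statistics of one approximation band (an assumed choice)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in coeffs) / n)
    energy = sum(c * c for c in coeffs) / n
    return (mean, std, energy)

def pyramid_features(signal, levels=3):
    """Feature triples for each approximation level, coarsest level first."""
    feats = []
    approx = list(signal)
    for _ in range(levels):
        approx = haar_approx(approx)
        feats.append(level_features(approx))
    return feats[::-1]

def distance(a, b):
    """Euclidean distance between two feature triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pyramid_search(query, database, levels=3, threshold=0.5):
    """Filter candidates level by level, starting at the coarsest approximation.

    `database` maps a record name to its sample list. A real system would
    precompute and store the feature triples instead of recomputing them here.
    """
    q = pyramid_features(query, levels)
    candidates = {name: pyramid_features(sig, levels) for name, sig in database.items()}
    for lvl in range(levels):
        # keep only records that are still close to the query at this level
        candidates = {name: f for name, f in candidates.items()
                      if distance(q[lvl], f[lvl]) <= threshold}
    return sorted(candidates)
```

Because every record is reduced to a fixed-size triple per level, records of different lengths can be compared directly, and each level only examines the candidates that survived the coarser one. Calling `pyramid_search(query, {"rec1": samples1, "rec2": samples2})` returns the names of the records that remain after the finest level.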

【Keywords】 wavelet transform; speech retrieval; multi-resolution analysis; feature vector

【Funding】 Natural Science Foundation of Zhejiang Province (Y104144); Ningbo Doctoral Fund (2005A610003)