矢量量化码书设计与矢量量化应用研究
Research on Vector Quantization Codebook Design and Application
【Author】 Tang Jian (唐建);
【Advisor】 Guo Li (郭立);
【Author information】 University of Science and Technology of China, Circuits and Systems, 2006, PhD
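【摘要 / Abstract】 With the rapid development of information, communications and related fields, large volumes of multimedia information such as speech and images must be stored, processed and transmitted, which requires large storage space and channel bandwidth. To improve storage efficiency and reduce storage space, redundancy in the media information should be removed as far as possible within the allowable distortion. As an effective lossy compression technique, vector quantization (VQ) offers a high compression ratio and a simple decoding algorithm, and has become one of the important techniques for speech and image compression coding. VQ is no longer used only for information compression; it has spread to speaker recognition, digital watermarking, speech recognition, image recognition, document retrieval and other fields. VQ is therefore of significant research value.
VQ involves three key techniques: codebook design, codeword search and codeword index assignment, of which codebook design is the primary problem. The main goal of codebook design is to find an optimal classification of the training vectors, that is, the best way to divide M k-dimensional training vectors into N classes. Compression coding, VQ-based speaker recognition and VQ-based digital watermarking all need a well-performing codebook. This dissertation analyzes the classic codebook design algorithms and studies improved schemes and codebook optimization algorithms. On this basis, and drawing on the theory and techniques of speech signal processing and image processing, it studies new VQ-based speaker recognition schemes and VQ-based digital watermarking techniques.
The main work and distinguishing features of this dissertation are as follows:
1. The classic codebook design algorithm LBG is widely used, but it generally yields only a locally optimal codebook. Optimizing the codebook with evolutionary algorithms has been an important research direction in recent years; the key is to find a sound optimization scheme and an effective search method. Two schemes currently exist, one based on the codebook and one based on the partition of the training vectors, but both make the search space of the evolutionary algorithm too large: the dimensionality to be optimized is N×k for the codebook-based scheme and M for the partition-based scheme. Current evolutionary algorithms still struggle with such high-dimensional optimization, so the improvement is not significant. To address the problems of both schemes, this dissertation proposes a scheme based on nearest-neighbor partition mutation / vector-space state optimization. Under this scheme the optimization search is carried out in a low-dimensional space of the same dimension as the vectors. The encoding space is comparatively small, which helps the optimization algorithm perform well and can effectively improve codebook performance.
2. To improve the efficiency of codebook optimization, this dissertation analyzes the problems that evolutionary programming (EP), the estimation of distribution algorithm (EDA), particle swarm optimization (PSO) and similar methods face when applied to vector quantization, and proposes two new hybrid optimization methods together with the corresponding codebook optimization algorithms:
1) An evolutionary programming codebook design algorithm based on the estimation of distribution algorithm. EP tends to evolve toward the globally optimal codebook, but because mutation is its main evolutionary operator it converges slowly. Introducing EDA, which predicts the best individuals through probability estimation, gives EP a search direction and speeds up convergence.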
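As an illustration of the codebook design problem stated in the abstract (dividing M k-dimensional training vectors into N classes), the following is a minimal sketch of an LBG-style iteration: nearest-neighbor partition followed by centroid update. It is not the dissertation's method. The function names, the random initialization (classic LBG typically grows the codebook by splitting) and the stopping tolerance are assumptions made only for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two k-dimensional vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(vector, codebook):
    """Index of the codeword closest to `vector` (the VQ encoding step)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def mean_distortion(training, codebook):
    """Average squared error when each vector is replaced by its codeword."""
    total = sum(squared_distance(v, codebook[nearest_codeword(v, codebook)])
                for v in training)
    return total / len(training)


def lbg(training, n_codewords, max_iters=100, tol=1e-6, seed=0):
    """LBG-style (generalized Lloyd) iteration with random initialization.

    Assumption: initial codewords are sampled from the training set.
    The result is in general only a locally optimal codebook.
    """
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(training, n_codewords)]
    k = len(training[0])
    previous = float("inf")
    for _ in range(max_iters):
        # Step 1: nearest-neighbor partition of the M training vectors.
        cells = [[] for _ in range(n_codewords)]
        for v in training:
            cells[nearest_codeword(v, codebook)].append(v)
        # Step 2: move each codeword to the centroid of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its old codeword
                codebook[i] = [sum(v[d] for v in cell) / len(cell)
                               for d in range(k)]
        # Stop when the distortion no longer improves noticeably.
        distortion = mean_distortion(training, codebook)
        if previous - distortion <= tol * max(distortion, 1e-12):
            break
        previous = distortion
    return codebook


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    book = lbg(data, n_codewords=2)
    print(sorted(book), mean_distortion(data, book))
```

Here the search variables are the N×k codeword coordinates, updated deterministically. The evolutionary schemes discussed in the abstract instead search over the codebook or the partition, which is where the dimensionality concern arises.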
【Abstract】 With the rapid development of information, communication and other fields, more and more multimedia information, such as speech and images, needs to be stored, processed and transmitted. Because of the large amount of multimedia information, huge storage space and broad channel bandwidth are required. To improve storage efficiency and reduce storage space, the redundancy in multimedia information should be eliminated as far as possible while the introduced distortion remains acceptable. Vector quantization (VQ) is an efficient lossy compression technique whose main advantages are a high compression ratio and a simple decoding process, so it has become one of the important compression techniques for speech and image coding. VQ is useful not only for information compression but also for speaker identification, digital watermarking, speech recognition, image recognition, literature retrieval and so on. VQ is therefore of great research value.
VQ involves three key techniques: codebook design, codeword search and codeword index assignment. Of these, codebook design is the most important. Codebook design mainly aims to obtain an optimal classification of the training vectors, that is, the optimal scheme for dividing the M k-dimensional training vectors into N classes. An efficient codebook is critical for compression coding, VQ-based speaker recognition and VQ-based digital watermarking. This dissertation focuses on analyzing the classic codebook design algorithms and investigates improved schemes and optimization algorithms for codebook design. On the basis of this research, and combining the relevant theory and techniques of speech signal processing and image processing, it studies novel VQ-based speaker recognition strategies and VQ-based digital watermarking techniques.
To sum up, the main work and features of this dissertation are as follows:
1. The classic codebook design algorithm LBG has been widely applied, but it usually yields only a locally optimal codebook. Optimizing the codebook with evolutionary algorithms has been an important research direction in the