Rate-Distortion Analysis and Its Application in Rate Control for Scalable Video Coding
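【Author】 Tan Yongmin (谈永敏);
【Advisor】 Yang Xiaokang (杨小康);
【Author information】 Shanghai Jiao Tong University, Communication and Information Systems, 2008, Master's thesis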
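【Abstract (translated from the Chinese)】 With the development of communication and network technologies, video transmission is no longer confined to traditional fixed-bandwidth channels. Multimedia services built on typical distributed systems, namely the Internet and wireless networks, have become highly attractive application areas; examples include video conferencing, video on demand, and mobile TV. Heterogeneous networks of this kind, whose channels vary over time, pose unprecedented challenges for video coding. In response, JVT, the joint working group of ISO and ITU, developed a new-generation video coding standard: Scalable Video Coding (SVC). SVC offers several scalability features that can be combined efficiently, so that a single compressed bitstream provides multiple sub-versions, and end users can select or extract the part of the bitstream they need for decoding. The research in this thesis starts from scalable video coding. Relying mainly on rate-distortion analysis, it addresses how rate control, a key component of video coding, applies in the new scalable setting, and it also discusses another coding constraint: constant-quality coding. The thesis first briefly introduces the main techniques of scalable video coding, describes how each of the three basic scalabilities (temporal, spatial, and quality) is implemented, and then points out how the rate-distortion optimization theory of conventional video coding applies in the scalable setting; this serves as the technical background for what follows. Next, we define the rate control problem in detail and analyze in depth the technical difficulties of rate control for H.264/SVC. We first propose a rate control algorithm for the SVC base layer, which is compatible with H.264/AVC. A pre-encoding pass resolves the well-known chicken-and-egg dilemma in H.264, which arises because rate-distortion optimized mode selection depends on the quantization parameter. A refined quantization parameter is determined separately for each candidate mode during rate-distortion optimization, so that the rate-distortion characteristics of the particular macroblock are reflected more faithfully. Target bits are allocated by constraining buffer fullness in combination with rate-distortion cost. We then propose a rate control scheme for the Hierarchical B prediction structure of SVC. It takes full account of the fact that frames at different temporal levels within a group of pictures differ in importance, and assigns them different weighting factors during target bit allocation. Subsequently, we extend the base-layer rate control algorithm to the enhancement layer and perform rate control for the base layer and the enhancement layer at the same time. Because our algorithm is strongly decoupled, the original rate control model and parameter settings can largely be reused. For the enhancement layer, and with implementation complexity in mind, we use BLSkip, a macroblock prediction mode newly introduced by SVC, as the pre-encoding mode. In addition, we propose a scheme that simplifies rate-distortion optimized mode selection: the order in which modes are checked is adjusted according to the statistical distribution of the candidate modes that have already occurred, and an early-termination threshold test avoids checking unnecessary modes. This lowers the overall complexity of the rate control algorithm in scalable coding and speeds up execution. Finally, we study the problem of constant-quality coding. We observe that with Hierarchical B coding the coding order causes quality fluctuation between frames, and that this effect is especially pronounced at scene changes. We propose a simple and effective constant-quality control algorithm based on two measures, peak signal-to-noise ratio and mean absolute error. It observes how picture quality changes during encoding and selects an appropriate quantization parameter to keep the range of quality fluctuation as small as possible.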
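The temporal-level weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function names, the dyadic (power-of-two) GOP assumption, and the example weights are all hypothetical, chosen only to show how a GOP bit budget might be split so that frames at lower temporal levels receive a larger share.

```python
from typing import List, Sequence


def temporal_level(pos: int, gop_size: int) -> int:
    """Temporal level of the frame at display position `pos` (1..gop_size).

    Assumes a dyadic hierarchical-B GOP: the key picture at position
    `gop_size` is level 0, and odd positions are at the highest level.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    if not 1 <= pos <= gop_size:
        raise ValueError("pos must be in 1..gop_size")
    max_level = gop_size.bit_length() - 1
    trailing_zeros = (pos & -pos).bit_length() - 1
    return max_level - trailing_zeros


def allocate_gop_bits(
    gop_budget: float, gop_size: int, level_weights: Sequence[float]
) -> List[float]:
    """Split `gop_budget` across frames in proportion to per-level weights.

    `level_weights[k]` is a hypothetical importance weight for temporal
    level k; the result is listed in display order.
    """
    levels = [temporal_level(p, gop_size) for p in range(1, gop_size + 1)]
    if len(level_weights) <= max(levels):
        raise ValueError("need one weight per temporal level")
    weights = [level_weights[lv] for lv in levels]
    total = sum(weights)
    return [gop_budget * w / total for w in weights]
```

For example, `allocate_gop_bits(800, 4, [4, 2, 1])` gives the key picture at position 4 half of the budget and each highest-level B frame one eighth. A real encoder would further adjust these targets using buffer fullness and rate-distortion cost, as the abstract describes.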
【Abstract】 With the rapid development of communication and network technologies, video transmission is no longer restricted to traditional channels with fixed bandwidth. Multimedia services based on large distributed systems such as the Internet and wireless networks, for example video conferencing, video on demand, and mobile TV, are now very attractive. Such heterogeneous networks with time-varying characteristics pose unprecedented challenges to reliable and efficient video transmission. In response to these requirements, JVT developed a new-generation video coding standard: scalable video coding (SVC). SVC provides several kinds of scalability, enabling a single bitstream of the original video to contain a variety of lower-quality sub-versions. Users with different capabilities can decode different parts of the bitstream according to their actual needs. The work in this thesis is based on SVC. We present a rate control algorithm for the H.264/AVC-compatible base layer and extend it to SVC. We also propose a constant quality algorithm that restricts quality fluctuations in the Hierarchical B structure.
In the first part, we briefly outline the main techniques used by SVC and their implementations, in the order of temporal, spatial, and quality scalability. We discuss potential applications of rate-distortion optimization theory in the new context of SVC.
Second, we define the rate control problem in detail and give an in-depth analysis of the difficulties of rate control for H.264/SVC. We propose an efficient rate control scheme for the H.264/AVC-compliant base layer that uses two-step quantization parameter determination but single-pass encoding. A pre-analysis stage resolves the well-known chicken-and-egg dilemma in H.264 that results from QP-dependent rate-distortion optimization (RDO). The quantization parameter is refined for each candidate mode in RDO so that the rate-distortion information of a macroblock is reflected more accurately. The frame-level target bit allocation strategy combines the encoder buffer fullness with the rate-distortion costs of previously encoded frames. We then present a rate control algorithm for the Hierarchical B structure that takes into account the different importance of pictures at different temporal levels within one GOP: different weighting factors are used when allocating target bits.
In the third part, we extend our rate control algorithm to the enhancement layer. Since our rate control model is decoupled across layers, we can simply inherit the original model and parameter settings. When implementing rate control in enhancement layers, we use the new prediction mode BLSkip as the mode in the pre-analysis stage. Furthermore, to limit computational complexity, we propose a fast RDO mode decision algorithm. We adjust the priority queue according to the probability distribution of macroblock coding modes so that the most probable mode is checked first, and the RDO process terminates as soon as the RD cost falls below a threshold. The overall computational burden is thereby reduced.
Finally, we address the problem of constant quality control. We observe that quality fluctuation between frames in the Hierarchical B structure is pronounced, particularly when a scene cut occurs. We present a simple but effective constant quality control scheme based on PSNR and MAD measures. A suitable quantization parameter is determined at frame level to make the fluctuation range as small as possible.
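The fast mode decision summarized above (check the most probable mode first, stop once the RD cost is low enough) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: `choose_mode`, the mode labels, the threshold, and the `rd_cost` callback are hypothetical, and the RD cost is assumed to be supplied by the caller (typically a Lagrangian cost of the form J = D + λR).

```python
from collections import Counter
from math import inf
from typing import Callable, Optional, Sequence, Tuple


def choose_mode(
    candidates: Sequence[str],
    mode_counts: Counter,
    rd_cost: Callable[[str], float],
    threshold: float,
) -> Tuple[str, float, int]:
    """Pick a macroblock mode, checking frequently chosen modes first.

    Returns (best_mode, best_cost, number_of_modes_checked) and updates
    `mode_counts` with the chosen mode. All names are illustrative.
    """
    if not candidates:
        raise ValueError("at least one candidate mode is required")

    # Most frequently chosen modes first; sorted() is stable, so ties
    # keep the original candidate order.
    ordered = sorted(candidates, key=lambda m: -mode_counts[m])

    best_mode: Optional[str] = None
    best_cost = inf
    checked = 0
    for mode in ordered:
        cost = rd_cost(mode)
        checked += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        # Early termination: a cost below the threshold is good enough.
        if best_cost < threshold:
            break

    assert best_mode is not None
    mode_counts[best_mode] += 1  # update the running statistics
    return best_mode, best_cost, checked
```

With a well-chosen threshold, most macroblocks evaluate only one or two modes. The trade-off is that an early exit can miss a mode with a slightly lower cost, so the threshold controls the balance between speed and coding efficiency.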
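- 【Online publication contributor】 Shanghai Jiao Tong University 【Online publication year and issue】 2008, Issue 07
- 【Classification code】 TN919.81
- 【Citation count】 12
- 【Download count】 458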