
Research on High Efficiency Video Coding (original title: 新一代高效视频编码研究)

【Author】 Wang Yong (王勇)

【Advisor】 Zhu Ce (朱策)

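【Author information】 University of Electronic Science and Technology of China, Electronics and Communication Engineering (professional degree), 2016, Master's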

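【Abstract, translated from the Chinese】 With the arrival of the big-data era, the transmission and storage of massive data, dominated by high-definition/ultra-high-definition, high-bit-rate video and high-fidelity 3D stereoscopic video, face ever higher demands. In response, the Joint Collaborative Team on Video Coding (JCT-VC) issued the new-generation international video compression standard, High Efficiency Video Coding (HEVC), and extended it to three dimensions as the international 3D video coding standard 3D-HEVC. HEVC retains the hybrid prediction-plus-transform framework of H.264 while introducing many new concepts and techniques, including flexible quadtree block partitioning, fine-grained intra prediction, and joint DST/DCT transforms. On top of HEVC, 3D-HEVC introduces depth maps, which reflect the distance between objects and the camera, and develops a 3D video codec based on multi-view video plus depth. Both standards introduce many new coding methods to improve coding efficiency, but these methods still leave room for optimization. This thesis studies new-generation high efficiency video coding for both 2D and 3D video and optimizes each; its main contributions are as follows: 1. HEVC video coding optimization. (1). Analysis and optimization of prediction and transform coding: first, the DC prediction in HEVC is simplified by streamlining the computation of the DC mean, which yields a small coding gain under the standard coding configurations and reduces coding complexity. Next, the principles and implementation of the core DCT and DST transforms used in HEVC are analyzed, the effectiveness of the DST is verified by experimental simulation, and a joint DST/DCT transform scheme based on the intra prediction mode is proposed; experiments show that the joint scheme obtains a partial gain. Finally, the orthonormality of the DCT core matrix is optimized; experiments show that the optimized transform matrix saves 0.1% of the bit rate on average and up to 0.4% for some sequences. (2). Rate-distortion optimized coding: to maximize coding efficiency, rate-distortion optimization runs through the entire coding system. Based on statistical analysis, a GOP-level Lagrange multiplier optimization algorithm for the RA configuration is proposed for the first time, further completing the levels and process of rate-distortion optimization. Experimental results show that the proposed algorithm achieves an average BD-rate saving of 0.7% under the Random Access (RA) configuration, a significant improvement in coding performance. 2. Optimization of depth modeling mode coding in 3D-HEVC. (1). Constant partition value prediction and rate-distortion optimization: the introduction of Depth Modeling Modes (DMMs) significantly improved 3D video coding performance. To make better use of DMM prediction, three methods are proposed to improve coding efficiency. First, an enhanced Constant Partition Value (CPV) prediction algorithm that incorporates the true object boundary information is proposed; second, View Synthesis Optimization (VSO) is used to select the best CPV combination; finally, zero-residual coding is introduced into the rate-distortion optimization process. Experimental results show that the three methods together give the synthesized views average BD-rate savings of 0.2% and 0.1% under the All-Intra (AI) and RA configurations, respectively, improving coding performance while reducing coding complexity. (2). Memory optimization of the Wedgelet (DMM-1) mode: to represent Wedgelet partitions as accurately and in as much detail as possible, the Wedgelet lookup table stores a large number of Wedgelet patterns, which occupies considerable buffer memory. By analyzing how the lookup table is generated, a joint down-sampling and rotation method for simplifying the lookup table is proposed for the first time; experimental results show that it saves 75.1% of the coding buffer memory with almost no effect on coding performance. (3). Optimization of the Contour (DMM-4) mode: Contour partitioning derives the depth partition from already-coded texture information, but it exploits only the structural correlation between texture and depth and ignores the edge similarity between boundary blocks, which can make the derived block partition inaccurate and thus hurt coding efficiency. To address the inaccurate partitions produced by DMM-4, a Contour partition enhancement method that incorporates the edge similarity between adjacent depth blocks is proposed. Experimental results show that the method gives an average BD-rate saving of nearly 0.1% for synthesized views under the All-Intra (AI) test condition.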
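The remark about the orthonormality of the DCT core matrix can be made concrete with a small check. The sketch below is an illustration only, not the optimization proposed in the thesis: it takes the 4×4 HEVC core DCT matrix (an integer approximation of the DCT-II basis scaled by 128) and computes its Gram matrix. The helper names `gram` and `norm_deviation` are hypothetical and chosen for this example.

```python
# HEVC 4x4 core DCT matrix: integer approximation of the DCT-II basis, scaled by 128.
DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def gram(m):
    """Return M * M^T, i.e. the inner products of every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


def norm_deviation(m, target):
    """Largest relative deviation of a squared row norm from `target`."""
    g = gram(m)
    return max(abs(g[i][i] - target) / target for i in range(len(m)))


G = gram(DCT4)
for row in G:
    print(row)

# An exactly orthonormal matrix scaled by 128 would give 128 * 128 on the diagonal.
print(norm_deviation(DCT4, 128 * 128))
```

For this 4×4 matrix the rows are mutually orthogonal, but the squared norms of the odd rows come out as 16370 instead of 16384, so the matrix is only approximately orthonormal. A deviation of this kind is what an orthonormality optimization would target.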

【Abstract】 With the arrival of the big-data era, the efficient transmission, storage, and processing of massive high-definition/ultra-high-definition (HD/UHD) and 3D video data has become an urgent problem. To substantially improve on the coding efficiency of H.264/AVC, ITU-T and ISO/IEC formed the Joint Collaborative Team on Video Coding (JCT-VC) to develop a new international standard, High Efficiency Video Coding (HEVC). A 3D video extension of HEVC, known as 3D-HEVC, has also been developed. Like H.264, HEVC adopts the conventional block-based hybrid video coding framework, and it offers more flexible and advanced techniques such as quadtree block partitioning and DST/DCT transforms. Based on HEVC, 3D-HEVC builds a multi-view plus depth video coding system by further introducing depth information, which reflects the distance between the object and the camera. HEVC and 3D-HEVC improve coding efficiency by introducing many new methods, but some of these methods can still be optimized. To further enhance the coding performance of HEVC and 3D-HEVC, we study high efficiency video coding techniques for 2D and 3D video. The primary work of the thesis is as follows: 1. HEVC video encoding optimization. (1). Analysis and optimization of intra prediction and joint transform coding for HEVC. To predict the DC value more efficiently, we first simplify the generation process of DC prediction in HEVC. Experimental results demonstrate that about 0.02% BD-rate saving can be achieved with lower complexity under the All-Intra (AI) and Low Delay (LD) configurations. We then analyze the Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) of HEVC and compare the coding performance of the DST and the DCT through experimental simulation. 
Finally, we propose a DST/DCT transform scheme based on the intra prediction mode and optimize the orthogonality of the DCT transform matrix. The experimental results confirm the effectiveness of the proposed schemes. (2). Rate Distortion Optimization (RDO). To optimize coding performance in the block-based hybrid coding architecture, RDO techniques are widely employed at different levels, ranging from the frame level and slice level to the coding tree unit (CTU), coding unit (CU), and prediction unit (PU) levels. However, the dependency among groups of pictures (GOPs) has not been addressed. Through a statistical analysis of the reference dependency among GOPs under the Random Access (RA) configuration of the HEVC standard, we develop a GOP-level Lagrange multiplier optimization scheme to further enhance the coding performance. The experimental results confirm the effectiveness of the proposed scheme: a 0.7% BD-rate saving is achieved on average under the RA configuration, with lower overall coding complexity, compared with the latest reference software of the HEVC standard, HM 16.7. 2. Optimization of Depth Modeling Modes in 3D-HEVC depth intra coding. (1). Enhanced prediction of the Constant Partition Value (CPV) and rate-distortion optimization. In the development of 3D-HEVC, Depth Modeling Modes (DMMs) were introduced in depth intra coding to represent object edges in depth maps. With the DMMs, a depth block is approximated by partitioning it into two non-rectangular regions using a Wedgelet or Contour partition, where each region is represented by a constant value referred to as the CPV. To predict the CPV more accurately and efficiently, we develop three approaches in this thesis. First, a better CPV predictor can be obtained by simply extending the actual depth map boundary, which also simplifies CPV prediction by removing comparison and averaging operations. 
Second, we propose choosing an optimal combination of delta CPVs in terms of View Synthesis Optimization (VSO) at the encoder by checking more candidates. Finally, zero residual coding is suggested for DMM coding units in the rate-distortion optimization loop. Experimental results demonstrate that about 0.2% and 0.1% BD-rate savings can be achieved on average for synthesized views, with lower complexity, under the AI and RA configurations, respectively. (2). Optimization of Wedgelet pattern segmentation (DMM-1). For a better representation of edges in depth maps, Depth Modeling Modes (DMMs) are added to the depth intra prediction modes. The explicit Wedgelet signalization mode within the DMMs partitions the depth block into two non-rectangular regions using a Wedgelet pattern selected from the Wedgelet lookup table. The lookup table is generated during both encoder and decoder initialization for each block size from 4×4 to 16×16 and stores a large number of Wedgelet patterns, which may cause cache storage problems or increase the cache burden. To reduce the number of Wedgelet patterns in the lookup table, we propose a rotation-sampling based method that removes the redundancy among different orientations and different block sizes. Specifically, a down-sampling method is employed to construct the 8×8 and 4×4 Wedgelet patterns so as to reduce the size of the lookup table. Then, we remove the generation process for all Wedgelet patterns of some orientations and obtain them by rotation instead. The experimental results show that the down-sampling method reduces storage size by 27.8% with a negligible coding loss of 0.03% / 0.05% under the Common Test Conditions (CTC) and AI configurations, respectively. Moreover, the rotation technique achieves a 0.04% / 0.03% BD-rate saving on average under AI / CTC. Overall, the proposed rotation-sampling based method reduces storage size by 75.1% with no coding loss. (3). 
Optimization of Contour pattern segmentation (DMM-4). For the Contour partition, the corresponding coded texture information of the current depth block is used to predict the final block partition. However, it utilizes only the structural similarity between texture and depth, without considering the edge similarity between adjacent boundary blocks within the same depth map, which may lead to an inaccurate block partition and thereby affect coding efficiency. To address this problem, we propose an optimized Contour partition generation method that makes full use of the edge similarity between adjacent boundary blocks. The experimental results demonstrate that an average BD-rate saving of nearly 0.1% can be achieved for synthesized views under the all-intra configuration, compared with the reference software HTM 11.0.
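The idea behind the rotation-sampling simplification of the Wedgelet lookup table (DMM-1) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the thesis's algorithm or the HTM implementation. It assumes a Wedgelet pattern is modelled by a plain half-plane test against the line from `start` to `end`, whereas the reference software rasterizes the line. All function names (`wedgelet`, `rotate_cw`, `rotate_point`, `downsample`) are hypothetical.

```python
def wedgelet(n, start, end):
    """n x n binary pattern: 1 where a sample lies on the positive side of the line start->end."""
    (sx, sy), (ex, ey) = start, end
    return [[1 if (ex - sx) * (y - sy) - (ey - sy) * (x - sx) > 0 else 0
             for x in range(n)]
            for y in range(n)]


def rotate_cw(pattern):
    """Rotate a square pattern by 90 degrees clockwise."""
    n = len(pattern)
    return [[pattern[n - 1 - x][y] for x in range(n)] for y in range(n)]


def rotate_point(n, p):
    """Map a sample position (x, y) to its position after a 90-degree clockwise rotation."""
    x, y = p
    return (n - 1 - y, x)


def downsample(pattern):
    """Keep every second sample in each direction (e.g. 8x8 -> 4x4)."""
    return [row[::2] for row in pattern[::2]]


small = wedgelet(4, (1, 0), (3, 2))   # line from the top edge to the right edge
big = wedgelet(8, (2, 0), (7, 5))     # a similar line in an 8x8 block

# A rotated pattern equals the pattern generated from rotated end points,
# so only one orientation needs to be generated or stored.
rotated = wedgelet(4, rotate_point(4, (1, 0)), rotate_point(4, (3, 2)))
print(rotate_cw(small) == rotated)    # prints True

# For this particular pair of lines, the down-sampled 8x8 pattern
# coincides with the directly generated 4x4 pattern.
print(downsample(big) == small)       # prints True
```

The rotation identity holds for every line under this half-plane model, because a 90-degree rotation preserves the sign of the cross product. The down-sampling equality is specific to the pair of lines chosen here. In general a down-sampled pattern only approximates a directly generated one, which is consistent with the small coding loss reported for the down-sampling step.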
