Predictive coding tools in multi-view video compression
Source: Mendeley Data. Updated 2024-01-31; indexed 2024-06-28.
Download link:
https://digitallibrary.usc.edu/asset-management/2A3BF1F4Z6Q8
Description:
Access: Unrestricted

Multi-view video sequences consist of a set of monoscopic video sequences captured at the same time by cameras at different locations and angles. These sequences contain 3-D information that can be used to deliver new 3-D multimedia services. Because of the amount of data involved, it is important to compress these multi-view sequences efficiently in order to deliver more accurate 3-D information.

Since frames captured by adjacent cameras have similar content, cross-view redundancy can be exploited through disparity compensation. Typically, both temporal and cross-view correlations are exploited in multi-view video coding (MVC): a frame can use as a reference the previous frame in time in the same view and/or a frame at the same time from an adjacent view, which leads to a 2-D dependency problem. The disparity of an object depends primarily on its depth in the scene, which can lead to a lack of smoothness in the disparity field. These complex disparity fields are further corrupted by brightness variations between views captured by different cameras. We propose several solutions to these problems in block-based predictive coding for MVC.

First, the 2-D dependency problem is addressed in Chapter 2. We use the monotonicity property and the correlation between anchor and non-anchor quantizers to reduce the complexity of data collection in an optimization based on the Viterbi algorithm. The proposed bit allocation achieves a coding gain of 0.5 dB compared to MVC with a fixed QP.

In Chapter 3, we propose an illumination compensation (IC) model to compensate for local illumination mismatches. With about 64% additional complexity for IC, gains of 0.3-0.8 dB are achieved in cross-view prediction. IC techniques are then extended to compensate for illumination mismatches in both temporal and cross-view prediction.

In Chapter 4, we seek to enable compensation based on arbitrarily shaped regions while preserving an essentially block-based compensation architecture. To do so, we propose tools for implicit block segmentation and predictor selection. Given two candidate block predictors, segmentation is applied to the difference between the predictors. Then, for each segment, a weighted sum of the predictors is selected for prediction. Simulation results show gains of 0.1-0.4 dB compared to the standard quad-tree approach in H.264/AVC.
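The Chapter 4 description (segment the difference of two predictors, then pick a weighted sum per segment) can be illustrated with a minimal sketch. This is not the thesis implementation: the two-class threshold segmentation, the candidate weight set `WEIGHTS`, the squared-error selection criterion, and all function names are assumptions made for illustration only.

```python
# Candidate weights (w0, w1), applied as w0*p0 + w1*p1.
# The set of candidates is an assumption for this sketch.
WEIGHTS = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]


def segment_by_difference(p0, p1, threshold):
    """Label each pixel 0 or 1 using only the predictor difference.

    Because only p0 and p1 are used, a decoder holding the same two
    predictors could recompute the labels without extra side information.
    Thresholding is a hypothetical stand-in for the actual segmentation.
    """
    return [[1 if abs(a - b) > threshold else 0 for a, b in zip(r0, r1)]
            for r0, r1 in zip(p0, p1)]


def select_weights(block, p0, p1, labels):
    """Encoder side: per segment, pick the weight pair with lowest squared error."""
    choice = {}
    for seg in (0, 1):
        best = None  # (error, index into WEIGHTS)
        for idx, (w0, w1) in enumerate(WEIGHTS):
            err = 0.0
            for y, row in enumerate(labels):
                for x, lab in enumerate(row):
                    if lab == seg:
                        pred = w0 * p0[y][x] + w1 * p1[y][x]
                        err += (block[y][x] - pred) ** 2
            if best is None or err < best[0]:
                best = (err, idx)
        choice[seg] = best[1]
    return choice


def predict(p0, p1, labels, choice):
    """Build the final prediction from the per-segment weight choices."""
    out = []
    for y, row in enumerate(labels):
        out_row = []
        for x, lab in enumerate(row):
            w0, w1 = WEIGHTS[choice[lab]]
            out_row.append(w0 * p0[y][x] + w1 * p1[y][x])
        out.append(out_row)
    return out


# Toy 2x4 block: the predictors agree on the left half and differ on the right.
p0 = [[10, 10, 50, 50], [10, 10, 50, 50]]
p1 = [[10, 10, 90, 90], [10, 10, 90, 90]]
block = [[10, 10, 90, 90], [10, 10, 90, 90]]

labels = segment_by_difference(p0, p1, threshold=8)
choice = select_weights(block, p0, p1, labels)
prediction = predict(p0, p1, labels, choice)
```

In this toy case the right half of the block forms its own segment and is predicted from `p1` alone, while the left half uses `p0`. The block itself is never split explicitly, so the prediction follows a non-rectangular region inside an otherwise block-based scheme.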
Created:
2024-01-31



