TY - GEN
T1 - View and rate scalable multiview image coding with depth-image-based rendering
AU - Velisavljević, Vladan
AU - Stankovič, Vladimir
AU - Chakareski, Jacob
AU - Cheung, Gene
PY - 2011
Y1 - 2011
N2 - Texture plus depth refers to the format where a sender encodes both texture and depth maps at multiple camera-captured viewpoints. Having received such a representation, the decoder can synthesize novel intermediate view images via depth-image-based rendering (DIBR), using as anchors the texture and depth maps of the two closest captured viewpoints. Ideally then, one would optimally allocate available source coding bits among the encoded texture and depth maps, such that the synthesized view distortion is minimized. However, in many practical application scenarios the precise rate constraint may either i) be unknown at encoding time, or ii) take on multiple values for clients with heterogeneous connectivity. In this paper, we propose a flexible codec and an associated bit allocation strategy to address both of these scenarios. In particular, we first present an edge-adaptive wavelet multiview image codec capable of producing a scalable bitstream from which proper subsets can be extracted and decoded at different bit rates. Given our scalable codec, we then propose a rate allocation algorithm that performs one of the following two actions. The algorithm will either incrementally increase the number of bits for encoding texture or depth maps of already encoded viewpoints, or it will introduce into the scalable representation new texture or depth maps of previously uncoded captured viewpoints. The incremental choice of either refining an existing view or introducing a new one is carried out one layer at a time, such that the associated rate-distortion tradeoff is locally optimized. By employing our novel bit allocation strategy, the proposed coder outperforms the state-of-the-art H.264/SVC codec as well as the same wavelet-based coder when armed with a simple suboptimal bit allocation with the same rate allocated to each map, in all coding scenarios studied in our experiments. Furthermore, our coder can achieve an arbitrarily fine granularity of encoding bit rates, while providing the additional functionality of view-embedded encoding, unlike the other related coders that we examined.
KW - Multiview imaging
KW - bit allocation
KW - depth-image-based rendering
KW - view and rate scalable encoding
UR - http://www.scopus.com/inward/record.url?scp=80053163910&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80053163910&partnerID=8YFLogxK
U2 - 10.1109/ICDSP.2011.6005019
DO - 10.1109/ICDSP.2011.6005019
M3 - Conference contribution
AN - SCOPUS:80053163910
SN - 9781457702747
T3 - 17th DSP 2011 International Conference on Digital Signal Processing, Proceedings
BT - 17th DSP 2011 International Conference on Digital Signal Processing, Proceedings
T2 - 17th International Conference on Digital Signal Processing, DSP 2011
Y2 - 6 July 2011 through 8 July 2011
ER -