TY - JOUR
T1 - Scale-Consistent Fusion
T2 - From Heterogeneous Local Sampling to Global Immersive Rendering
AU - Xing, Wenpeng
AU - Chen, Jie
AU - Yang, Zaifeng
AU - Wang, Qiang
AU - Guo, Yike
N1 - This work was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project T45-205/21-N.
PY - 2022/12
Y1 - 2022/12
AB - Image-based geometric modeling and novel view synthesis based on sparse large-baseline samplings are challenging but important tasks for emerging multimedia applications such as virtual reality and immersive telepresence. Existing methods fail to produce satisfactory results because reliable depth information is difficult to infer under such challenging reference conditions. With the popularization of commercial light field (LF) cameras, capturing LF images (LFIs) is as convenient as taking regular photos, and geometry information can be reliably inferred. This inspires us to use a sparse set of LF captures to render high-quality novel views globally. However, the fusion of LF captures from multiple angles is challenging due to the scale inconsistency caused by various capture settings. To overcome this challenge, we propose a novel scale-consistent volume rescaling algorithm that robustly aligns the disparity probability volumes (DPVs) among different captures for scale-consistent global geometry fusion. Based on the fused DPV projected to the target camera frustum, we further propose novel learning-based modules (i.e., the attention-guided multi-scale residual fusion module and the disparity field-guided deep re-regularization module) that comprehensively regularize noisy observations from heterogeneous captures for high-quality rendering of novel LFIs. Both quantitative and qualitative experiments on the Stanford Lytro Multi-view LF dataset show that the proposed method significantly outperforms state-of-the-art methods under different experiment settings for disparity inference and LF synthesis.
KW - Novel view synthesis
KW - light field
KW - disparity probability volumes rescaling
KW - spatial-angular re-regularization
KW - multi-scale residual fusion
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85138495846&origin=inward
U2 - 10.1109/TIP.2022.3205745
DO - 10.1109/TIP.2022.3205745
M3 - Journal article
SN - 1057-7149
VL - 31
SP - 6109
EP - 6123
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -