TY - GEN
T1 - SphereFusion: Efficient Panorama Depth Estimation via Gated Fusion
AU - Yan, Qingsong
AU - Wang, Qiang
AU - Zhao, Kaiyong
AU - Chen, Jie
AU - Li, Bo
AU - Chu, Xiaowen
AU - Deng, Fei
N1 - This work was supported by the Shenzhen Science and Technology Program (Grant Nos. KJZD20240903104103005, JCYJ20220818102414030, JSGGKQTD20221101115655027).
Publisher Copyright:
© 2025 IEEE.
PY - 2025/8/25
Y1 - 2025/8/25
N2 - Due to the rapid development of panorama cameras, the task of estimating panorama depth has attracted significant attention from the computer vision community, especially in applications such as robot sensing and autonomous driving. However, existing methods relying on different projection formats often encounter challenges, either struggling with distortion and discontinuity in the case of equirectangular, cubemap, and tangent projections, or experiencing a loss of texture details with the spherical projection. To tackle these concerns, we present SphereFusion, an end-to-end framework that combines the strengths of various projection methods. Specifically, SphereFusion initially employs 2D image convolution and mesh operations to extract two distinct types of features from the panorama image in both equirectangular and spherical projection domains. These features are then projected onto the spherical domain, where a gate fusion module selects the most reliable features for fusion. Finally, SphereFusion estimates panorama depth within the spherical domain. Meanwhile, SphereFusion employs a cache strategy to improve the efficiency of mesh operation. Extensive experiments on three public panorama datasets demonstrate that SphereFusion achieves competitive results with other state-of-the-art methods, while presenting the fastest inference speed at only 17 ms on a 512 × 1024 panorama image.
AB - Due to the rapid development of panorama cameras, the task of estimating panorama depth has attracted significant attention from the computer vision community, especially in applications such as robot sensing and autonomous driving. However, existing methods relying on different projection formats often encounter challenges, either struggling with distortion and discontinuity in the case of equirectangular, cubemap, and tangent projections, or experiencing a loss of texture details with the spherical projection. To tackle these concerns, we present SphereFusion, an end-to-end framework that combines the strengths of various projection methods. Specifically, SphereFusion initially employs 2D image convolution and mesh operations to extract two distinct types of features from the panorama image in both equirectangular and spherical projection domains. These features are then projected onto the spherical domain, where a gate fusion module selects the most reliable features for fusion. Finally, SphereFusion estimates panorama depth within the spherical domain. Meanwhile, SphereFusion employs a cache strategy to improve the efficiency of mesh operation. Extensive experiments on three public panorama datasets demonstrate that SphereFusion achieves competitive results with other state-of-the-art methods, while presenting the fastest inference speed at only 17 ms on a 512 × 1024 panorama image.
UR - https://www.scopus.com/pages/publications/105016124156
U2 - 10.1109/3DV66043.2025.00084
DO - 10.1109/3DV66043.2025.00084
M3 - Conference proceeding
AN - SCOPUS:105016124156
T3 - Proceedings - International Conference on 3D Vision, 3DV
SP - 855
EP - 865
BT - Proceedings - 2025 International Conference on 3D Vision, 3DV 2025
PB - IEEE
T2 - 12th International Conference on 3D Vision, 3DV 2025
Y2 - 25 March 2025 through 28 March 2025
ER -