TY - GEN
T1 - FADNet: A Fast and Accurate Network for Disparity Estimation
T2 - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
AU - Wang, Qiang
AU - Shi, Shaohuai
AU - Zheng, Shizhen
AU - Zhao, Kaiyong
AU - Chu, Xiaowen
N1 - Funding information:
This research was supported by Hong Kong RGC GRF grant HKBU 12200418. We thank the anonymous reviewers for their constructive comments and suggestions. We would also like to thank NVIDIA AI Technology Centre (NVAITC) for providing the GPU clusters for some experiments.
Publisher copyright:
© 2020 IEEE
PY - 2020/5
Y1 - 2020/5
N2 - Deep neural networks (DNNs) have achieved great success in the area of computer vision. The disparity estimation problem tends to be addressed by DNNs, which achieve much better prediction accuracy in stereo matching than traditional hand-crafted, feature-based methods. On the one hand, however, these DNNs require significant memory and computation resources to predict disparity accurately, especially the 3D-convolution-based networks, which makes them difficult to deploy in real-time applications. On the other hand, existing computation-efficient networks lack expressive capability on large-scale datasets, so they cannot make accurate predictions in many scenarios. To this end, we propose FADNet, an efficient and accurate deep network for disparity estimation, with three main features: 1) it exploits efficient 2D-based correlation layers with stacked blocks to preserve fast computation; 2) it combines residual structures to make the deeper model easier to learn; 3) it contains multi-scale predictions so as to exploit a multi-scale weight-scheduling training technique to improve accuracy. We conduct experiments to demonstrate the effectiveness of FADNet on two popular datasets, Scene Flow and KITTI 2015. Experimental results show that FADNet achieves state-of-the-art prediction accuracy and runs an order of magnitude faster than existing 3D models. The code of FADNet is available at https://github.com/HKBU-HPML/FADNet.
AB - Deep neural networks (DNNs) have achieved great success in the area of computer vision. The disparity estimation problem tends to be addressed by DNNs, which achieve much better prediction accuracy in stereo matching than traditional hand-crafted, feature-based methods. On the one hand, however, these DNNs require significant memory and computation resources to predict disparity accurately, especially the 3D-convolution-based networks, which makes them difficult to deploy in real-time applications. On the other hand, existing computation-efficient networks lack expressive capability on large-scale datasets, so they cannot make accurate predictions in many scenarios. To this end, we propose FADNet, an efficient and accurate deep network for disparity estimation, with three main features: 1) it exploits efficient 2D-based correlation layers with stacked blocks to preserve fast computation; 2) it combines residual structures to make the deeper model easier to learn; 3) it contains multi-scale predictions so as to exploit a multi-scale weight-scheduling training technique to improve accuracy. We conduct experiments to demonstrate the effectiveness of FADNet on two popular datasets, Scene Flow and KITTI 2015. Experimental results show that FADNet achieves state-of-the-art prediction accuracy and runs an order of magnitude faster than existing 3D models. The code of FADNet is available at https://github.com/HKBU-HPML/FADNet.
UR - http://www.scopus.com/inward/record.url?scp=85089242614&partnerID=8YFLogxK
U2 - 10.1109/ICRA40945.2020.9197031
DO - 10.1109/ICRA40945.2020.9197031
M3 - Conference proceeding
AN - SCOPUS:85089242614
SN - 978-1-7281-7396-2
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 101
EP - 107
BT - 2020 IEEE International Conference on Robotics and Automation, ICRA 2020
PB - IEEE
Y2 - 31 May 2020 through 31 August 2020
ER -