Abstract
Dense stereo matching estimates the depth of every pixel in the reference image. Deep learning has advanced stereo matching considerably, and state-of-the-art results are now achieved by models built on deep convolutional neural networks. These models, however, carry a considerable computational burden that slows inference. Previous works addressed this by down-sampling the input images to reduce their spatial size, but down-sampling raises both the error rate and its lower bound. In this article, we instead accelerate stereo matching by improving the network structure. Inspired by network compression, we apply sparsification and decomposition to squeeze the computationally expensive cost optimization network: it is first sparsified and then decomposed into smaller networks, which are designed and trained in a cascaded manner to approach the performance of the larger network as closely as possible. Previous methods have also relied on numerous separate refinement methods to adjust the coarse disparity. We integrate these refinement methods into a unified algorithm that exploits the parallelism of the running device to further accelerate inference. Extensive experiments on the KITTI 2015, KITTI 2012, and Middlebury datasets demonstrate the efficiency of our method.
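The idea of training smaller stages in a cascade can be illustrated with a toy example. The sketch below is not the article's network: it assumes linear least-squares regressors in place of the small networks, a synthetic one-dimensional "disparity" signal, and hypothetical names (`train_cascade`, `stage_features`). It only shows the general principle that each stage is fitted to the residual left by the stages before it, so the running estimate is the sum of the stage outputs.

```python
import numpy as np


def train_cascade(stage_features, target):
    """Fit small linear stages one after another.

    Each stage is a least-squares regressor fitted to the residual that the
    earlier stages left behind, so the running estimate is a sum of stage
    outputs. Returns the per-stage weights and the RMS error after each stage.
    """
    estimate = np.zeros_like(target)
    stage_weights, rms_errors = [], []
    for features in stage_features:
        residual = target - estimate
        weights, *_ = np.linalg.lstsq(features, residual, rcond=None)
        estimate = estimate + features @ weights
        stage_weights.append(weights)
        rms_errors.append(float(np.sqrt(np.mean((target - estimate) ** 2))))
    return stage_weights, rms_errors


# Synthetic "disparity" along one scanline (toy data, not from any dataset).
x = np.linspace(0.0, 1.0, 200)
true_disparity = 40.0 * x + 5.0 * np.sin(6.0 * x)

# Hypothetical per-stage inputs: a coarse trend first, finer detail second.
stage_features = [
    np.stack([np.ones_like(x), x], axis=1),
    np.stack([np.sin(6.0 * x), np.cos(6.0 * x)], axis=1),
]

stage_weights, rms_errors = train_cascade(stage_features, true_disparity)
for index, error in enumerate(rms_errors, start=1):
    print(f"RMS error after stage {index}: {error:.3f}")
```

Because a stage can always output zero, fitting it to the residual cannot increase the training error. This is one reason a cascade of cheap stages can move toward the accuracy of a single larger model.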
CRAR: Accelerating Stereo Matching with Cascaded Residual Regression and Adaptive Refinement