Abstract
High-resolution, low-latency computer-vision apps are ubiquitous in today’s world of mixed-reality devices. These devices provide a platform that can leverage improving depth-sensor and embedded-accelerator technology to enable higher-resolution, lower-latency processing of 3D scenes using depth-upsampling algorithms. This research demonstrates that filter-based upsampling algorithms are feasible for mixed-reality apps on low-power hardware accelerators. We parallelized and evaluated a depth-upsampling algorithm on two devices: a reconfigurable-logic FPGA embedded within a low-power SoC, and a fixed-logic embedded graphics processing unit. We demonstrate that both accelerators can meet the real-time requirement of 11 ms latency for mixed-reality apps.
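As an illustration of what a filter-based depth-upsampling step can look like, the following is a minimal, unoptimized sketch of joint bilateral upsampling, one well-known member of this family. The abstract does not state which filter was parallelized, so the choice of algorithm, the function name `joint_bilateral_upsample`, and the parameters `radius`, `sigma_s`, and `sigma_r` are all assumptions made for this example, not the paper's implementation. Each high-resolution output pixel is a weighted average of nearby low-resolution depth samples, with weights combining spatial distance and similarity in a high-resolution guide image.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, guide_hi, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Upsample depth_lo to the resolution of guide_hi.

    Illustrative sketch only (assumed algorithm and parameters).
    depth_lo: 2D low-resolution depth map.
    guide_hi: 2D high-resolution grayscale guide, values roughly in [0, 1].
    """
    H, W = guide_hi.shape
    h, w = depth_lo.shape
    sy, sx = h / H, w / W  # scale from high-res to low-res coordinates
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            cy, cx = y * sy, x * sx  # this pixel's position in low-res coordinates
            y0 = min(int(round(cy)), h - 1)  # nearest low-res sample, clamped
            x0 = min(int(round(cx)), w - 1)
            num = 0.0
            den = 0.0
            for qy in range(max(y0 - radius, 0), min(y0 + radius, h - 1) + 1):
                for qx in range(max(x0 - radius, 0), min(x0 + radius, w - 1) + 1):
                    # Guide pixel at the high-res location of low-res sample (qy, qx).
                    gy = min(int(round(qy / sy)), H - 1)
                    gx = min(int(round(qx / sx)), W - 1)
                    # Spatial weight: distance measured in low-res coordinates.
                    ws = np.exp(-((qy - cy) ** 2 + (qx - cx) ** 2) / (2 * sigma_s ** 2))
                    # Range weight: similarity of guide intensities.
                    diff = guide_hi[y, x] - guide_hi[gy, gx]
                    wr = np.exp(-(diff ** 2) / (2 * sigma_r ** 2))
                    num += ws * wr * depth_lo[qy, qx]
                    den += ws * wr
            out[y, x] = num / den
    return out
```

Every output pixel is computed independently from a small fixed-size neighborhood, which is the property that makes this kind of filter a natural candidate for parallel hardware such as FPGAs and GPUs. The nested Python loops here are for readability only and are far too slow for an 11 ms budget.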
Real-time, High-resolution Depth Upsampling on Embedded Accelerators