Abstract
Visual Simultaneous Localization and Mapping (vSLAM) is the process of using an optical sensor to map a robot’s observable surroundings while simultaneously estimating the robot’s pose relative to that map. The accuracy and speed of vSLAM computations strongly affect the performance and effectiveness of the downstream tasks the robot must execute, which makes vSLAM a key building block of current robotic designs. Applying vSLAM to humanoid robots is particularly difficult because of the robot’s unsteady locomotion. This paper introduces a pose graph optimization module based on ORB features computed on the RGB images, as an extension of the KinectFusion pipeline (a well-known vSLAM algorithm), to help recover the robot’s pose during unstable gait patterns when KinectFusion tracking fails. We develop and test a wide range of embedded MPSoC FPGA designs and investigate numerous architectural improvements, both precise and approximate, to study their impact on performance and accuracy. Extensive design space exploration reveals that properly designed approximations, which exploit domain knowledge and efficient management of CPU and FPGA fabric resources, enable real-time vSLAM at more than 30 fps on humanoid robots with high energy efficiency and without compromising robot tracking or map construction. This is the first FPGA design to achieve robust, real-time dense SLAM operation specifically targeting humanoid robots. An open-source release of our implementations and data can be found in [1].
References
- [1] Oct 2021. https://github.com/csl-uth/PG-SLAM_fpga.
- [2] 2018. Embedding SLAM algorithms: Has it come of age? Robotics and Autonomous Systems 100 (2018), 14–26.
- [3] 2015. Decentralized active information acquisition: Theory and application to multi-robot SLAM. IEEE International Conference on Robotics and Automation (ICRA) (2015), 4775–4782.
- [4] 2016. Blur image detection using Laplacian operator and Open-CV. 2016 International Conference System Modeling & Advancement in Research Trends (SMART) (2016), 63–67.
- [5] 1992. A method for registration of 3-D shapes. IEEE Trans. Pattern Analysis and Machine Intelligence 14, 2 (1992).
- [6] 2018. SLAMBench2: Multi-objective head-to-head benchmarking for visual SLAM. CoRR abs/1808.06820.
- [7] 2016. Semi-dense SLAM on an FPGA SoC. In 26th International Conference on Field Programmable Logic and Applications, (FPL), Lausanne, Switzerland, August 29–September 2, 2016. IEEE, 1–4.
- [8] 2017. A high-performance system-on-chip architecture for direct tracking for SLAM. In 27th International Conference on Field Programmable Logic and Applications, (FPL), Ghent, Belgium, September 4–8. 1–7.
- [9] 2019. A scalable FPGA-based architecture for depth estimation in SLAM. In 15th International Symposium on Applied Reconfigurable Computing, (ARC) Darmstadt, Germany, April 9–11. 181–196.
- [10] 2009. A floating-point extended Kalman filter implementation for autonomous mobile robots. J. Signal Process. Syst. 56, 1 (2009), 41–50.
- [11] 2017. Simultaneous localization and mapping: A survey of current trends in autonomous driving. IEEE Transactions on Intelligent Vehicles 2 (2017), 194–220.
- [12] 2003. Humanoid robots. In Encyclopedia of Physical Science and Technology (Third Edition). Academic Press, New York, 401–425.
- [13] 2019. SLAMBench 3.0: Systematic automated reproducible evaluation of SLAM systems for robot vision challenges and scene understanding. In International Conference on Robotics and Automation, ICRA 2019, Montreal, QC, Canada, May 20–24, 2019. IEEE, 6351–6358.
- [14] 2016. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics 32, 6 (2016), 1309–1332.
- [15] 1996. A volumetric method for building complex models from range images. In 23rd Annual Conference on Computer Graphics and Interactive Techniques, (SIGGRAPH), New Orleans, LA, USA, August 4–9, 1996. 303–312.
- [16] 2018. SOFT-SLAM: Computationally efficient stereo visual simultaneous localization and mapping for autonomous unmanned aerial vehicles. J. Field Robotics 35 (2018), 578–595.
- [17] 2007. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007), 1052–1067.
- [18] 2018. SuperPoint: Self-supervised interest point detection and description. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2018, Salt Lake City, UT, USA, June 18–22, 2018. 224–236.
- [19] 2014. LSD-SLAM: Large-scale direct monocular SLAM. In Computer Vision - ECCV 2014–13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 8690. Springer, 834–849.
- [20] 2017. FPGA-based ORB feature extraction for real-time visual SLAM. In International Conference on Field Programmable Technology, (FPT), Melbourne, Australia, December 11–13, 2017. 275–278.
- [21] 2021. Energy-efficient FPGA-accelerated LiDAR-based SLAM for embedded robotics. In International Conference on Field-Programmable Technology, (FPT) Auckland, New Zealand, December 6–10, 2021. IEEE, 1–6.
- [22] 2013. Collaborative monocular SLAM with multiple Micro Aerial Vehicles. 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (2013), 3962–3970.
- [23] 2019. FPGA architectures for real-time dense SLAM. In 30th IEEE International Conference on Application-specific Systems, Architectures and Processors, (ASAP), New York, NY, USA, July 15–17, 2019. 83–90.
- [24] 2014. Real-time 3D reconstruction for FPGAs: A case study for evaluating the performance, area, and programmability trade-offs of the Altera OpenCL SDK. In 2014 International Conference on Field-Programmable Technology, FPT Shanghai, China, December 10–12, 2014. 326–329.
- [25] 2010. Multi-robot visual SLAM using a Rao-Blackwellized particle filter. Robotics Auton. Syst. 58 (2010), 68–80.
- [26] 2021. FPGA architectures for approximate dense SLAM computing. In 24th Conference on Design, Automation and Test in Europe (DATE) Virtual Conference, February 1–3, 2021.
- [27] 2015. An FPGA-based real-time simultaneous localization and mapping system. In International Conference on Field Programmable Technology, (FPT) Queenstown, New Zealand, December 7–9, 2015. 200–203.
- [28] 2014. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In ICRA Hong Kong, China, May.
- [29] 2010. Humanoid robot localization in complex indoor environments. IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings, 1690–1695.
- [30] 2011. KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. UIST’11 - Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, 559–568.
- [31] 2015. Very high frame rate volumetric integration of depth images on mobile devices. IEEE Trans. Vis. Comput. Graph. 21, 11 (2015).
- [32] 2013. Dense visual SLAM for RGB-D cameras. 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (2013), 2100–2106.
- [33] 2011. G2o: A general framework for graph optimization. Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), 3607–3613.
- [34] 2019. Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality. Virtual Real. Intell. Hardw. 1 (2019), 386–410.
- [35] 2018. ICE-BA: Incremental, consistent and efficient bundle adjustment for visual-inertial SLAM. In IEEE Conference on Computer Vision and Pattern Recognition, (CVPR), Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation / IEEE Computer Society, 1974–1982.
- [36] 2019. eSLAM: An energy-efficient accelerator for real-time ORB-SLAM on FPGA platform. In 56th Annual Design Automation Conference, (DAC), Las Vegas, NV, USA, June 02–06, 2019. ACM, 193.
- [37] 2014. Characterizations of noise in Kinect depth images: A review. IEEE Sensors Journal 14, 6 (2014), 1731–1740.
- [38] 2016. A survey of techniques for approximate computing. ACM Comput. Surv. 48, 4 (2016).
- [39] 2003. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In IJCAI.
- [40] 2009. Fast approximate nearest neighbors with automatic algorithm configuration. VISAPP 2009 - Proceedings of the 4th International Conference on Computer Vision Theory and Applications 1, 331–340.
- [41] 2015. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans. Robotics 31, 5 (2015).
- [42] 2017. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics 33 (2017), 1255–1262.
- [43] 2015. Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM. In International Conference on Robotics and Automation, (ICRA), Seattle, WA, USA, 26–30 May. 5783–5790.
- [44] 2014. A synchronized visual-inertial sensor system with FPGA pre-processing for accurate real-time SLAM. In International Conference on Robotics and Automation, (ICRA), Hong Kong, China, May 31–June 7, 2014.
- [45] 2016. Energy-efficient simultaneous localization and mapping via compounded approximate computing. In IEEE International Workshop on Signal Processing Systems (SiPS), Dallas, TX, USA, October 26–28, 2016.
- [46] 2012. Vision-based odometric localization for humanoids using a kinematic EKF. In 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012). 153–158.
- [47] 2005. Using visual odometry to create 3D maps for online footstep planning. In 2005 IEEE International Conference on Systems, Man and Cybernetics, Vol. 3. 2643–2648.
- [48] 2019. SLAMBooster: An application-aware online controller for approximation in dense SLAM. In 28th International Conference on Parallel Architectures and Compilation Techniques, (PACT), Seattle, WA, USA, September 23–26, 2019.
- [49] 2020. A methodology for principled approximation in visual SLAM. In International Conference on Parallel Architectures and Compilation Techniques (PACT), Virtual Event, GA, USA, October 3–7, 2020. 373–386.
- [50] 2019. Humanoid Robot Dense RGB-D SLAM for Embedded Devices. (04 2019).
- [51] 2011. ORB: An efficient alternative to SIFT or SURF. Proceedings of the IEEE International Conference on Computer Vision, 2564–2571.
- [52] 2018. Navigating the landscape for real-time localization and mapping for robotics and virtual and augmented reality. Proc. IEEE 106, 11 (2018), 2020–2039.
- [53] 2013. SLAM++: Simultaneous localisation and mapping at the level of objects. 2013 IEEE Conference on Computer Vision and Pattern Recognition (2013), 1352–1359.
- [54] 2017. Multi-UAV collaborative monocular SLAM. 2017 IEEE International Conference on Robotics and Automation (ICRA) (2017), 3863–3870.
- [55] 2019. BAD SLAM: Bundle adjusted direct RGB-D SLAM. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019), 134–144.
- [56] 2011. Managing performance vs. accuracy trade-offs with loop perforation. In ESEC/FSE, Szeged, Hungary, Sept. 2011.
- [57] 2015. Discriminative learning of deep convolutional feature point descriptors. In 2015 IEEE International Conference on Computer Vision, ICCV Santiago, Chile, December 7–13, 2015. 118–126.
- [58] 1987. Estimating uncertain spatial relationships in robotics. Proceedings. 1987 IEEE International Conference on Robotics and Automation 4 (1987), 850–850.
- [59] 2019. Navion: A 2-mW fully integrated real-time visual-inertial odometry accelerator for autonomous navigation of nano drones. IEEE Journal of Solid-State Circuits 54, 4 (2019).
- [60] 2004. 3D map building for a humanoid robot by using visual odometry. In 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583), Vol. 5. 4444–4449.
- [61] 2018. A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. International Conference on Computing, Mathematics and Engineering Technologies (iCoMET) (2018), 1–10.
- [62] 2014. FPGA design and implementation of a matrix multiplier based accelerator for 3D EKF SLAM. In International Conference on ReConFigurable Computing and FPGAs, ReConFig14, Cancun, Mexico, December 8–10, 2014. IEEE, 1–6.
- [63] 2016. FPGA design of EKF block accelerator for 3D visual SLAM. Comput. Electr. Eng. 55 (2016), 123–137.
- [64] 2006. Localisation for autonomous humanoid navigation. In 2006 6th IEEE-RAS International Conference on Humanoid Robots. 13–19.
- [65] 1996. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society (Series B) 58 (1996).
- [66] 1998. Bilateral filtering for gray and color images. In 6th International Conference on Computer Vision (ICCV), Bombay, India, January 4–7, 1998.
- [67] 2012. Real time simultaneous localization and mapping: Towards low-cost multiprocessor embedded systems. EURASIP J. Embed. Syst. 2012 (2012), 5.
- [68] 2015. ElasticFusion: Dense SLAM without a pose graph. In Robotics: Science and Systems.
- [69] 2013. Humanoid robot navigation: From a visual SLAM to a visual compass. In 10th IEEE International Conference on Networking, Sensing and Control, ICNSC 2013, Evry, France, April 10–12, 2013. 678–683.
- [70] 2020. CNN-based feature-point extraction for real-time visual SLAM on embedded FPGA. In 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, (FCCM), Fayetteville, AR, USA, May 3–6, 2020. 33–37.
- [71] 2018. Dense RGB-D SLAM for humanoid robots in the dynamic humans environment. In IEEE-RAS International Conference on Humanoid Robots, Humanoids. Beijing, China, November 6–9, 2018. 270–276.
Index Terms
Reconfigurable System-on-Chip Architectures for Robust Visual SLAM on Humanoid Robots