Research Article
Open Access

Learning to fly: computational controller design for hybrid UAVs with reinforcement learning

Published: 12 July 2019

Abstract

Hybrid unmanned aerial vehicles (UAVs) combine the advantages of multicopters and fixed-wing planes: vertical take-off and landing with low energy use. However, hybrid UAVs are rarely used because controller design is challenging due to their complex, mixed dynamics. In this paper, we propose a method to automate this design process by training a mode-free, model-agnostic neural network controller for hybrid UAVs. The controller takes a novel error-convolution input and is trained by reinforcement learning. Our controller exhibits two key features. First, it does not distinguish among flying modes, and the same controller structure can be used for copters with various dynamics. Second, it works on real models without any additional parameter tuning, closing the gap between virtual simulation and real fabrication. We demonstrate the efficacy of the proposed controller both in simulation and on our custom-built hybrid UAVs (Figures 1 and 8). Flight tests show that the controller is robust and exploits the complex dynamics when both rotors and wings are active.
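To make the error-convolution input concrete, the sketch below shows one plausible shape for such a policy network: a 1-D convolution over a short history of tracking errors, followed by fully connected layers that map the extracted features to actuator commands. This is a minimal illustration, not the authors' implementation; the class name ErrorConvPolicy, the PyTorch framework, the layer sizes, the history length, and the four-actuator output are all assumptions. A network of this shape would typically be trained with a policy-gradient method such as PPO.

```python
# Hypothetical sketch of a policy with an error-convolution input.
# All dimensions are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class ErrorConvPolicy(nn.Module):
    def __init__(self, error_dim=6, history_len=10, num_actuators=4):
        super().__init__()
        # Convolve over the time axis of the error history so the policy
        # sees error trends, not just the instantaneous error.
        self.conv = nn.Sequential(
            nn.Conv1d(error_dim, 16, kernel_size=3),
            nn.Tanh(),
            nn.Conv1d(16, 16, kernel_size=3),
            nn.Tanh(),
        )
        feat_len = history_len - 2 * (3 - 1)  # length after two valid convs
        self.head = nn.Sequential(
            nn.Linear(16 * feat_len, 64),
            nn.Tanh(),
            nn.Linear(64, num_actuators),  # e.g. per-rotor thrust commands
        )

    def forward(self, error_history):
        # error_history: (batch, error_dim, history_len), most recent step last
        z = self.conv(error_history)
        return self.head(z.flatten(start_dim=1))

# One control step on dummy data: zero tracking error in all channels.
policy = ErrorConvPolicy()
errors = torch.zeros(1, 6, 10)  # stand-in for pose/velocity tracking errors
action = policy(errors)         # shape (1, 4): raw actuator commands
```

Because the input is a fixed-length error history rather than mode-specific state, a single network of this form can, in principle, be applied across hover, transition, and fixed-wing flight without switching controllers.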

Published in

ACM Transactions on Graphics, Volume 38, Issue 4 (August 2019), 1480 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3306346

Copyright © 2019 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
