Abstract
Hybrid unmanned aerial vehicles (UAVs) combine the advantages of multicopters and fixed-wing planes: vertical take-off and landing together with energy-efficient flight. However, hybrid UAVs are rarely used because controller design is challenging due to their complex, mixed dynamics. In this paper, we propose a method to automate this design process by training a mode-free, model-agnostic neural network controller for hybrid UAVs. We present a neural network controller design with a novel error-convolution input, trained by reinforcement learning. Our controller exhibits two key features. First, it does not distinguish among flying modes, and the same controller structure can be used for vehicles with various dynamics. Second, our controller works on real models without any additional parameter-tuning process, closing the gap between virtual simulation and real fabrication. We demonstrate the efficacy of the proposed controller both in simulation and on our custom-built hybrid UAVs (Figures 1 and 8). The experiments show that the controller is robust enough to exploit the complex dynamics that arise when both rotors and wings are active in flight tests.
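The abstract's "error-convolution input" can be read as feeding the policy a short history of tracking errors passed through a 1-D convolution over time. The paper's exact architecture is not given here, so the sketch below is a minimal illustration under assumed dimensions: the window length, error components, actuator count, and layer sizes are all hypothetical, and the randomly initialised weights stand in for parameters that reinforcement learning would train.

```python
import numpy as np

WINDOW = 8    # recent time steps of tracking error kept (assumed)
ERR_DIM = 6   # position (3) + attitude (3) error components (assumed)
N_ACT = 5     # e.g. 4 rotors + 1 control surface on a hybrid UAV (assumed)

rng = np.random.default_rng(0)

# Randomly initialised weights stand in for RL-trained parameters.
conv_kernel = rng.standard_normal((ERR_DIM, 3)) * 0.1  # per-channel 1-D kernel
W_hidden = rng.standard_normal((64, ERR_DIM * (WINDOW - 2))) * 0.1
W_out = rng.standard_normal((N_ACT, 64)) * 0.1

def policy(error_history: np.ndarray) -> np.ndarray:
    """Map a (ERR_DIM, WINDOW) history of tracking errors to actuator commands."""
    assert error_history.shape == (ERR_DIM, WINDOW)
    # 'valid' 1-D convolution over time, one kernel per error channel.
    conv = np.stack([
        np.convolve(error_history[c], conv_kernel[c], mode="valid")
        for c in range(ERR_DIM)
    ])                                    # shape (ERR_DIM, WINDOW - 2)
    h = np.tanh(W_hidden @ conv.ravel())  # hidden layer
    return np.tanh(W_out @ h)             # commands squashed to [-1, 1]

cmd = policy(rng.standard_normal((ERR_DIM, WINDOW)))
print(cmd.shape)  # (5,)
```

Because the same error channels and convolution apply regardless of whether the vehicle is hovering or in forward flight, such an input carries no explicit flight-mode flag, which is consistent with the mode-free claim above.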
Learning to fly: computational controller design for hybrid UAVs with reinforcement learning