Abstract
Deep learning (DL) presents new opportunities for enabling spacecraft autonomy, onboard analysis, and intelligent applications for space missions. However, DL applications are computationally intensive and often infeasible to deploy on radiation-hardened (rad-hard) processors, which traditionally harness a fraction of the computational capability of their commercial-off-the-shelf counterparts. Commercial FPGAs and system-on-chips present numerous architectural advantages and provide the computation capabilities to enable onboard DL applications; however, these devices are highly susceptible to radiation-induced single-event effects (SEEs) that can degrade the dependability of DL applications. In this article, we propose Reconfigurable ConvNet (RECON), a reconfigurable acceleration framework for dependable, high-performance semantic segmentation for space applications. In RECON, we propose both selective and adaptive approaches to enable efficient SEE mitigation. In our selective approach, control-flow parts are selectively protected by triple-modular redundancy to minimize SEE-induced hangs, and in our adaptive approach, partial reconfiguration is used to adapt the mitigation of dataflow parts in response to a dynamic radiation environment. Combined, both approaches enable RECON to maximize system performability subject to mission availability constraints. We perform fault injection and neutron irradiation to observe the susceptibility of RECON and use dependability modeling to evaluate RECON in various orbital case studies to demonstrate a 1.5–3.0× performability improvement in both performance and energy efficiency compared to static approaches.
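As a minimal sketch of the triple-modular redundancy idea named in the abstract, the following Python models a bitwise 2-of-3 majority voter over three redundant copies of a word. It is an illustration only, not RECON's implementation, which targets FPGA logic. The names `tmr_vote` and `flip_bit` and the example values are hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of a word."""
    return word ^ (1 << bit)


# Hypothetical example: one of three replicas suffers a single bit flip.
golden = 0b1011_0010
upset = flip_bit(golden, 5)
voted = tmr_vote(golden, upset, golden)

# The voter masks the upset because the other two replicas still agree.
print(voted == golden)
```

A voter of this kind masks an upset in any single replica. If two replicas are corrupted in the same bit position, the vote is wrong, which is one reason redundancy is typically paired with repair mechanisms such as reconfiguration.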
Reconfigurable Framework for Resilient Semantic Segmentation for Space Applications