Feel Free to Interrupt: Safe Task Stopping to Enable FPGA Checkpointing and Context Switching

Published: 28 January 2020

Abstract

Saving and restoring an FPGA task state in an orderly manner is essential to enable hardware checkpointing, which is highly desirable to improve the ability to debug cloud-scale hardware services, and context switching, which allows multiple users to share FPGA resources. However, these features require task interruption, and stopping a task at an arbitrary time can cause several hazards including deadlock and data loss. In this article, we build a context saving and restoring simulator to simulate and identify these hazards. In addition, we derive design rules that should be followed to achieve safe task interruption. Finally, we propose task wrappers that can be placed around an FPGA task to implement these rules. The timing and area overheads added by these wrappers are very small; they add 1.8% area and no timing overhead to a full Memcached system. Taken together, these design rules and wrappers enable safe checkpointing and context switching in a wide variety of FPGA tasks, including those with multiple clocks, multi-cycle I/O transactions, and interface dependencies.

    • Published in

      ACM Transactions on Reconfigurable Technology and Systems  Volume 13, Issue 1
      March 2020
      135 pages
      ISSN:1936-7406
      EISSN:1936-7414
      DOI:10.1145/3377289
      • Editor: Deming Chen

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 28 January 2020
      • Accepted: 1 November 2019
      • Revised: 1 October 2019
      • Received: 1 July 2019

      Qualifiers

      • research-article
      • Research
      • Refereed
