Abstract
Saving and restoring an FPGA task state in an orderly manner is essential to enable hardware checkpointing, which is highly desirable to improve the ability to debug cloud-scale hardware services, and context switching, which allows multiple users to share FPGA resources. However, these features require task interruption, and stopping a task at an arbitrary time can cause several hazards including deadlock and data loss. In this article, we build a context saving and restoring simulator to simulate and identify these hazards. In addition, we derive design rules that should be followed to achieve safe task interruption. Finally, we propose task wrappers that can be placed around an FPGA task to implement these rules. The timing and area overheads added by these wrappers are very small; they add 1.8% area and no timing overhead to a full Memcached system. Taken together, these design rules and wrappers enable safe checkpointing and context switching in a wide variety of FPGA tasks, including those with multiple clocks, multi-cycle I/O transactions, and interface dependencies.
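The data-loss hazard named in the abstract can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions, not the article's simulator or its wrappers: the `BurstTask` class, the four-beat burst length, and the "defer the stop until no transaction is in flight" policy are all hypothetical choices made for illustration. It compares stopping a task the moment a request arrives with stopping it through a wrapper that holds the request until the current multi-cycle transaction has finished.

```python
from dataclasses import dataclass, field


@dataclass
class BurstTask:
    """Toy task that emits fixed-length, multi-cycle write bursts (hypothetical model)."""
    burst_len: int = 4
    beats_left: int = 0                       # beats remaining in the current burst
    next_burst: int = 0                       # id of the most recently started burst
    sent: list = field(default_factory=list)  # (burst_id, beat_index) pairs already sent

    def step(self):
        """Advance one clock cycle: start a new burst if idle, then send one beat."""
        if self.beats_left == 0:
            self.beats_left = self.burst_len
            self.next_burst += 1
        beat = self.burst_len - self.beats_left
        self.sent.append((self.next_burst, beat))
        self.beats_left -= 1

    def mid_transaction(self):
        """True while a burst has started but not yet sent all of its beats."""
        return self.beats_left > 0


def run(stop_request_cycle, use_wrapper, max_cycles=100):
    """Run until the task is stopped; return (stop_cycle, sent_beats)."""
    task = BurstTask()
    for cycle in range(max_cycles):
        stop_requested = cycle >= stop_request_cycle
        # Naive stop: freeze the task as soon as the request arrives.
        # Wrapped stop (assumed policy): hold the request while a burst is in flight.
        if stop_requested and not (use_wrapper and task.mid_transaction()):
            return cycle, task.sent
        task.step()
    raise RuntimeError("task never reached a safe stopping point")


def incomplete_bursts(sent, burst_len=4):
    """Return the ids of bursts that were cut off before all beats were sent."""
    counts = {}
    for burst_id, _ in sent:
        counts[burst_id] = counts.get(burst_id, 0) + 1
    return sorted(b for b, n in counts.items() if n < burst_len)


if __name__ == "__main__":
    for wrapped in (False, True):
        stop_cycle, sent = run(stop_request_cycle=6, use_wrapper=wrapped)
        print(f"wrapper={wrapped}: stopped at cycle {stop_cycle}, "
              f"truncated bursts: {incomplete_bursts(sent)}")
```

In this model the naive stop at cycle 6 leaves the second burst truncated after two beats, while the wrapped stop is delayed to cycle 8 so that no burst is cut off. The model covers only the data-loss case; the deadlock hazards and the multi-clock and interface-dependency cases mentioned in the abstract are not represented.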
Feel Free to Interrupt: Safe Task Stopping to Enable FPGA Checkpointing and Context Switching