research-article

Efficient Fine-grained Processor-logic Interactions on the Cache-coherent Zynq Platform

Published: 09 January 2019

Abstract

The introduction of cache-coherent processor-logic interconnects in CPU-FPGA platforms promises low-latency communication between CPU and FPGA fabrics. This reduced latency improves the performance of heterogeneous systems implemented on such devices and gives rise to new software architectures that can better use the available hardware.

Via an extended study accelerating the software task scheduler of a microkernel operating system, this article reports on the potential for accelerating applications that exhibit fine-grained interactions. In doing so, we evaluate the performance of direct and cache-coherent communication methods for applications that involve frequent, low-bandwidth transactions between CPU and programmable logic.

In the specific case we studied, we found that replacing a highly optimised software implementation of the task scheduler with an FPGA-based scheduler reduces the cost of communication between two software threads by 5.5%. We also found that, while hardware acceleration reduces cache footprint, we still observe execution time variability because of other non-deterministic features of the CPU.

      • Published in

        ACM Transactions on Reconfigurable Technology and Systems, Volume 11, Issue 4
        December 2018
        93 pages
        ISSN: 1936-7406
        EISSN: 1936-7414
        DOI: 10.1145/3303942
        • Editor: Steve Wilton

        Copyright © 2019 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 9 January 2019
        • Accepted: 1 September 2018
        • Revised: 1 August 2018
        • Received: 1 January 2018

        Qualifiers

        • research-article
        • Research
        • Refereed