Abstract
Advanced computation on embedded devices is now required in virtually every application field. To meet this need, embedded system designers often rely on complex heterogeneous reconfigurable platforms, which offer high performance because some on-board computing elements can be specialized or customized, and which are usually flexible enough to be optimized at runtime. In this context, system monitoring has gained increasing interest. Ideally, a monitoring system should be non-intrusive, serve several purposes, and provide aggregated information about the behavior of the different system components. The current literature falls short of this ideal: for example, existing monitoring systems are not readily applicable to modern heterogeneous platforms. This work presents a hardware monitoring system that is intended to be minimally invasive in terms of system performance and resources, composable, and capable of giving the user homogeneous observability of, and transparent access to, the different components of a heterogeneous computing platform, so that system metrics can be easily computed by aggregating the collected information. Building on previous work, this article focuses primarily on extending an existing hardware monitoring system to also cover specialized coprocessing units; the assessment is carried out on a Xilinx FPGA-based System on Programmable Chip. Several explorations illustrate the customizability of the proposed hardware monitoring system, the tradeoffs available to the user, and its benefits with respect to the de facto standard monitoring support provided by the targeted FPGA vendor.
Index Terms
A Composable Monitoring System for Heterogeneous Embedded Platforms