Abstract
Newly emerging multiprocessor system-on-a-chip (MPSoC) platforms combine hard processing cores with programmable logic (PL) for high-performance computing applications. In this article, we take a deep look into these commercially available heterogeneous platforms and show how to design mixed-criticality applications so that different processing components can be isolated, avoiding contention on shared resources such as the last-level cache and main memory.
Our approach uses software/hardware co-design to achieve isolation between the different criticality domains. At the hardware level, we use a scratchpad memory (SPM) with dedicated interfaces inside the PL to avoid conflicts in the main memory. At the software level, we employ a hypervisor that supports cache coloring so that conflicts in the shared L2 cache can be avoided. To move tasks into and out of the SPM, we rely on a DMA engine and propose a new CPU-DMA co-scheduling policy, called Lazy Load, for which we also derive the response time analysis. The results of a case study on image processing demonstrate that contention on the shared memory subsystem can be avoided when running with our proposed architecture. Moreover, comprehensive schedulability evaluations show that the newly proposed Lazy Load policy outperforms the existing CPU-DMA scheduling approaches and is effective in mitigating main memory interference in our proposed architecture.
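The cache-coloring idea mentioned above can be illustrated with a minimal sketch. It is not the hypervisor's implementation: it only shows the generic technique of deriving a page "color" from a physical address and giving each criticality domain a disjoint set of colors. The cache geometry (1 MiB, 16-way, 4 KiB pages), the domain names, and all function and class names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Hypothetical shared-cache description (values are assumptions)."""
    size_bytes: int
    ways: int
    page_bytes: int = 4096

    @property
    def num_colors(self) -> int:
        # One way spans size/ways bytes; each page-sized slice of a way
        # maps to a distinct group of cache sets, i.e. one color.
        return (self.size_bytes // self.ways) // self.page_bytes

    def color_of(self, phys_addr: int) -> int:
        # The color is given by the low bits of the physical page number.
        return (phys_addr // self.page_bytes) % self.num_colors


def partition_colors(geometry: CacheGeometry, domains: list[str]) -> dict[str, range]:
    """Give each domain an equal, contiguous, non-overlapping color range."""
    per_domain = geometry.num_colors // len(domains)
    if per_domain == 0:
        raise ValueError("more domains than available colors")
    return {
        name: range(i * per_domain, (i + 1) * per_domain)
        for i, name in enumerate(domains)
    }


def pages_for(geometry: CacheGeometry, colors: range, count: int) -> list[int]:
    """Return the first `count` physical page addresses whose color is allowed."""
    pages = []
    addr = 0
    while len(pages) < count:
        if geometry.color_of(addr) in colors:
            pages.append(addr)
        addr += geometry.page_bytes
    return pages


# Assumed geometry: 1 MiB, 16-way shared L2 with 4 KiB pages -> 16 colors.
l2 = CacheGeometry(size_bytes=1 << 20, ways=16)
assignment = partition_colors(l2, ["critical", "best_effort"])
critical_pages = pages_for(l2, assignment["critical"], 10)
best_effort_pages = pages_for(l2, assignment["best_effort"], 10)
```

Because pages handed to the two domains never share a color, they cannot evict each other's lines from the shared cache, which is the isolation property the abstract relies on. The cost is that each domain sees only a fraction of the cache capacity.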
Lazy Load Scheduling for Mixed-criticality Applications in Heterogeneous MPSoCs