Abstract
Driven by the demand for higher performance, hard real-time systems have increasingly adopted multicore processors. Parallel processing increases computational bandwidth, but the unavoidable interaction between multiple cores and the hierarchical memory subsystem further complicates the already difficult worst-case execution time (WCET) analysis of tasks. Caches have the largest influence on task execution time, and shared caches make the system even less predictable. Cache partitioning techniques have been proposed as a countermeasure to decouple shared-cache latency from the WCET. However, existing energy-efficient scheduling algorithms account for neither the unpredictable nature of shared caches nor cache partitioning techniques, which diminishes their applicability to real-world systems. If inter-task cache contention is not considered, directly using existing algorithms, or attempting to allocate and schedule a taskset with cache-partition assignments, can result in cache violations. To overcome this problem, we propose a novel approach that models inter-task cache contention as a dependency graph, which well-established algorithms can then use to minimize energy consumption. Extensive simulations demonstrate the effectiveness of our approach in minimizing energy consumption while avoiding cache violations.
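The abstract does not spell out how the dependency graph is built, so the following is only an illustrative sketch under stated assumptions, not the authors' construction. It assumes that each task holds a fixed set of cache partitions, that two tasks contend when their partition sets intersect, and that a cache violation is two contending tasks running at overlapping times. The names `Task`, `contention_edges`, and `cache_violations` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Task:
    name: str
    partitions: frozenset  # assumption: IDs of the cache partitions assigned to this task


def contention_edges(tasks):
    """Return a precedence edge (a, b) for every pair of tasks sharing a partition.

    Each edge points from the earlier to the later position in `tasks`,
    which keeps the resulting graph acyclic (an illustrative choice).
    """
    return [
        (a.name, b.name)
        for a, b in combinations(tasks, 2)
        if a.partitions & b.partitions
    ]


def cache_violations(tasks, schedule):
    """List task pairs that share a partition and overlap in time.

    `schedule` maps a task name to a half-open (start, finish) interval.
    """
    violations = []
    for a, b in combinations(tasks, 2):
        if not (a.partitions & b.partitions):
            continue
        (s1, f1), (s2, f2) = schedule[a.name], schedule[b.name]
        if s1 < f2 and s2 < f1:  # the two intervals intersect
            violations.append((a.name, b.name))
    return violations


tasks = [
    Task("t1", frozenset({0, 1})),
    Task("t2", frozenset({1, 2})),
    Task("t3", frozenset({3})),
]
edges = contention_edges(tasks)

# A schedule that ignores contention, and one that respects the edge t1 -> t2.
naive = {"t1": (0, 4), "t2": (2, 6), "t3": (0, 3)}
serialized = {"t1": (0, 4), "t2": (4, 8), "t3": (0, 3)}
```

In this sketch, a scheduler that treats `edges` as precedence constraints cannot run `t1` and `t2` concurrently, so any schedule it produces passes the `cache_violations` check, while `t3` remains free to run in parallel.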
Energy-efficient Real-time Scheduling on Multicores: A Novel Approach to Model Cache Contention