research-article

Holistic Resource Allocation Under Federated Scheduling for Parallel Real-time Tasks

Published: 14 January 2022

Abstract

With the trend toward hardware and workload consolidation in embedded systems and the rapid development of edge computing, there has been increasing interest in supporting parallel real-time tasks to better utilize multi-core platforms while meeting stringent real-time constraints. For parallel real-time tasks, the federated scheduling paradigm, which assigns each parallel task a set of dedicated cores, achieves good theoretical bounds by ensuring exclusive use of processing resources to reduce interference. However, because cores share the last-level cache and memory bandwidth, in practice tasks may still interfere with each other despite executing on dedicated cores. Such resource interference due to concurrent accesses can be even more severe on embedded platforms or edge servers, where computing power and cache/memory space are limited. To tackle this issue, we present a holistic resource allocation framework for parallel real-time tasks under federated scheduling. Under our framework, in addition to dedicated cores, each parallel task is also assigned dedicated cache and memory bandwidth resources. Further, we propose a holistic resource allocation algorithm that balances the allocation among the different resources to achieve good schedulability. Additionally, we provide a full implementation of our framework by extending the federated scheduling system with Intel’s Cache Allocation Technology and MemGuard. Finally, we demonstrate the practicality of our framework via extensive numerical evaluations and empirical experiments using real benchmark programs.
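To make the idea of "dedicated cores" concrete, the sketch below computes how many cores a single heavy parallel task would receive under the classic federated scheduling bound, ceil((C - L) / (D - L)), where C is the task's total work, L its critical-path length (span), and D its deadline. This is the standard bound from the federated scheduling literature, not the holistic algorithm of this article, which additionally allocates cache and memory bandwidth. The function names and the example task parameters are hypothetical and chosen only for illustration.

```python
import math


def dedicated_cores(work, span, deadline):
    """Cores for one heavy task under the classic federated bound.

    Assumes an implicit-deadline task whose span is strictly below its
    deadline; work, span, and deadline share the same time unit.
    """
    if span >= deadline:
        raise ValueError("span must be strictly smaller than the deadline")
    # ceil((C - L) / (D - L)), with at least one core per task.
    return max(1, math.ceil((work - span) / (deadline - span)))


def fits(tasks, total_cores):
    """Check whether all tasks can receive their dedicated cores."""
    return sum(dedicated_cores(*task) for task in tasks) <= total_cores


# Hypothetical task set: (work, span, deadline) per task.
tasks = [(40, 5, 15), (24, 4, 12)]

for work, span, deadline in tasks:
    print(work, span, deadline, dedicated_cores(work, span, deadline))
print(fits(tasks, 8))
```

In this simplified view, work and span are fixed numbers. The interference problem described in the abstract arises because, in practice, both values depend on how much shared cache and memory bandwidth a task can actually use.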

Published in

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 1
January 2022
288 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3505211

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 14 January 2022
• Accepted: 1 September 2021
• Revised: 1 August 2021
• Received: 1 February 2021

Qualifiers

• research-article
• Refereed