Abstract
The EC project parMERASA (Multicore Execution of Parallelized Hard Real-Time Applications Supporting Analyzability) investigated timing-analyzable parallel hard real-time applications running on a predictable multicore processor. A pattern-supported parallelization approach was developed to ease the transformation of sequential programs into parallel ones, based on parallel design patterns that are timing analyzable. The approach was applied to the following industrial hard real-time programs: 3D path planning and stereo navigation algorithms (Honeywell International s.r.o.), a control algorithm for a dynamic compaction machine (BAUER Maschinen GmbH), and a diesel engine management system (DENSO AUTOMOTIVE Deutschland GmbH). This article focuses on the parallelization approach, the experience gained while parallelizing the applications, and the quantitative results obtained by simulation, by static worst-case execution time (WCET) analysis with the OTAWA tool, and by measurement-based WCET analysis with the RapiTime tool.
Parallelizing Industrial Hard Real-Time Applications for the parMERASA Multicore