Abstract
Many real-time applications are now highly complex, and as their complexity and requirements grow, they demand more hardware processing capacity. However, the correct functioning of a real-time system depends not only on the logically correct response but also on the time at which it is produced. General-purpose processor designs fail to deliver analyzability because of the nondeterministic behavior caused by cache memories, dynamic branch prediction, speculative execution, and out-of-order pipelines. In this article, we investigate the pipeline performance of Very Long Instruction Word (VLIW) architectures with an in-order pipeline for real-time systems, with respect to Worst-Case Execution Time (WCET). We also consider techniques for obtaining the WCET of VLIW machines, and we quantify the importance of hardware techniques such as static branch prediction, predication, and the pipeline speed of complex operations such as memory access and multiplication for high-performance real-time systems. The memory hierarchy is outside the scope of this article; we use a classic deterministic structure formed by a direct-mapped instruction cache and a data scratchpad memory. A VLIW prototype was implemented in VHDL from scratch based on the HP VLIW ST231 ISA. We also present some compiler insights and use a representative subset of the Mälardalen WCET benchmarks for validation and performance quantification.
Evaluating the Design of a VLIW Processor for Real-Time Systems