Abstract
Chip-multiprocessors are an emerging trend for embedded systems. In this article, we introduce a real-time Java multiprocessor called JopCMP. It is a symmetric shared-memory multiprocessor consisting of up to eight Java Optimized Processor (JOP) cores, an arbitration control device, and a shared memory. All components are interconnected via a system-on-chip bus. The arbiter synchronizes the access of multiple CPUs to the shared main memory. Three arbitration policies are presented, evaluated, and compared with respect to their real-time and average-case performance: a fixed priority, a fair-based, and a time-sliced arbiter.
Tasks running on different CPUs of a chip-multiprocessor (CMP) influence each other's execution times when accessing a shared memory. Therefore, the system needs an arbiter that is able to limit the worst-case execution time of a task running on a CPU, even though tasks executing simultaneously on other CPUs access the main memory. Our research shows that timing analysis is in fact possible for homogeneous multiprocessor systems with a shared memory. The timing analysis of tasks executing on the CMP with time-sliced memory arbitration leads to viable worst-case execution time bounds.
The time-sliced arbiter divides the memory access time into equal time slots, one time slot for each CPU. This memory arbitration scheme allows upper bounds on the worst-case execution times of Java applications to be calculated as a function of the number of CPUs, the time slot size, and the memory access time. Examples of worst-case execution time calculation are presented, and the analyzed results of a real-world application task are compared to measured execution time results. Finally, we evaluate the tradeoffs of using a time-predictable solution compared to average-case optimized chip-multiprocessors, applying three different benchmarks. These experiments are carried out by executing the programs on the CMP prototype.
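To make the dependence on the number of CPUs, the slot size, and the memory access time concrete, the following is a minimal sketch of a per-access latency bound under time-sliced arbitration. It is not the analysis from the article. It assumes a simplified model in which all slots are equally long, an access may start only inside the requesting CPU's own slot, and the access must finish before that slot ends. The class `TdmaBound` and its method and parameter names are hypothetical.

```java
/**
 * Sketch of a worst-case latency bound for one memory access under a
 * time-sliced arbiter. Simplifying assumptions (not taken from the article):
 * equal slot lengths, an access starts only in the CPU's own slot, and it
 * must complete before that slot ends.
 */
final class TdmaBound {

    /** Closed-form bound in cycles: (n - 1) * s + 2 * t - 1. */
    static int worstCaseLatency(int cpus, int slotCycles, int accessCycles) {
        check(cpus, slotCycles, accessCycles);
        return (cpus - 1) * slotCycles + 2 * accessCycles - 1;
    }

    /** Same bound, found by trying every arrival offset in one arbitration period. */
    static int worstCaseLatencyBruteForce(int cpus, int slotCycles, int accessCycles) {
        check(cpus, slotCycles, accessCycles);
        int period = cpus * slotCycles; // the CPU's own slot covers offsets [0, slotCycles)
        int worst = 0;
        for (int arrival = 0; arrival < period; arrival++) {
            // The access fits if it can still finish inside the own slot.
            boolean fits = arrival + accessCycles <= slotCycles;
            // Otherwise the CPU waits for the start of its next slot.
            int latency = fits ? accessCycles : (period - arrival) + accessCycles;
            worst = Math.max(worst, latency);
        }
        return worst;
    }

    private static void check(int cpus, int slotCycles, int accessCycles) {
        if (cpus < 1 || accessCycles < 1 || slotCycles < accessCycles) {
            throw new IllegalArgumentException("need cpus >= 1 and 1 <= accessCycles <= slotCycles");
        }
    }

    public static void main(String[] args) {
        // Hypothetical configuration: 4 CPUs, 6-cycle slots, 2-cycle memory access.
        System.out.println(worstCaseLatency(4, 6, 2)); // prints 21
    }
}
```

In this model the worst case occurs when a request arrives one cycle too late to complete in its own slot, so the CPU waits for the remainder of the period before its access is served. The bound grows linearly with the number of CPUs and the slot size, which is the tradeoff against average-case optimized arbitration that the abstract refers to.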