Abstract
Within the domain of embedded systems, hardware architectures are commonly characterised by application-specific heterogeneity. Systems may contain multiple dissimilar processing elements, non-standard memory architectures, and custom hardware elements. Programming such systems is a considerable challenge, not only because of the need to exploit large degrees of parallelism but also because hardware architectures change from system to system. To address this problem, this paper proposes the novel combination of a new industry standard for communication across multicore architectures (MCAPI) with a minimal-overhead technique for targeting complex architectures with standard programming languages (Compile-Time Virtualisation).
The Multicore Association have proposed MCAPI as an industry standard for on-chip communications. MCAPI abstracts the on-chip physical communication to provide the application with logical point-to-point unidirectional channels between nodes (software threads, hardware cores, etc.). Compile-Time Virtualisation is used to provide an extremely lightweight implementation of MCAPI that supports a much wider range of architectures than the MCAPI specification normally considers. Overall, this unique combination enhances programmability by abstracting on-chip communication whilst also exposing critical parts of the target architecture to the programming language.
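The idea of a logical point-to-point unidirectional channel between two nodes can be illustrated with a minimal sketch. This is not the MCAPI API and not the paper's implementation: the names `Endpoint`, `Channel`, `send`, and `receive`, and the node labels `cpu0.thread1` and `fft_core`, are hypothetical and chosen only for illustration. The sketch assumes an in-memory FIFO stands in for whatever physical on-chip transport a real system would use.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    """A logical communication endpoint owned by one node (hypothetical type)."""
    node: str  # e.g. a software thread or a hardware core
    port: int


@dataclass
class Channel:
    """A logical point-to-point, unidirectional FIFO channel (illustrative only)."""
    sender: Endpoint
    receiver: Endpoint
    _queue: deque = field(default_factory=deque)

    def send(self, source: Endpoint, payload: bytes) -> None:
        # Only the sending endpoint may write: the channel is unidirectional.
        if source != self.sender:
            raise PermissionError(f"{source} is not the sending end of this channel")
        self._queue.append(payload)

    def receive(self, destination: Endpoint) -> bytes:
        # Only the receiving endpoint may read; data arrives in FIFO order.
        if destination != self.receiver:
            raise PermissionError(f"{destination} is not the receiving end of this channel")
        if not self._queue:
            raise BlockingIOError("no data available")
        return self._queue.popleft()


# Two dissimilar nodes connected by one logical channel.
cpu_thread = Endpoint(node="cpu0.thread1", port=1)
accelerator = Endpoint(node="fft_core", port=1)
link = Channel(sender=cpu_thread, receiver=accelerator)

link.send(cpu_thread, b"sample-block-0")
link.send(cpu_thread, b"sample-block-1")
first = link.receive(accelerator)
```

The application code sees only endpoints and the channel between them. In this sketch the deque could be swapped for a different transport without changing the calling code, which is the kind of abstraction the abstract attributes to MCAPI.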
Targeting complex embedded architectures by combining the multicore communications API (MCAPI) with compile-time virtualisation. In LCTES '11: Proceedings of the 2011 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems.