research-article

Portable mapping of OpenMP to multicore embedded systems using MCA APIs

Published: 20 June 2013

Abstract

Multicore embedded systems are widely used in telecommunication systems, robotics, medical applications, and more. While they offer high performance at low power, programming them efficiently remains a challenge. To exploit the capabilities the hardware offers, software developers are expected to handle many low-level programming details, including utilizing DMA, ensuring cache coherency, and inserting synchronization primitives explicitly. State-of-the-art solutions rely on software toolchains that are too vendor-specific, tying the software to a particular piece of hardware and leaving no room for portability.

In this paper we present a runtime system that explores mapping a high-level programming model, OpenMP, onto multicore embedded systems. A key feature of our scheme is that, unlike existing approaches that largely rely on POSIX threads, it leverages the Multicore Association (MCA) APIs as an OpenMP translation layer. The MCA APIs are a set of low-level APIs that handle resource management, inter-process communication, and task scheduling for multicore embedded systems. By deploying the MCA APIs, our runtime captures the characteristics of multicore embedded systems more effectively than POSIX threads do. Furthermore, the MCA layer makes our runtime implementation portable across various architectures. Programmers therefore only need to maintain a single OpenMP code base that is compatible with various compilers and portable across different types of platforms. We have evaluated our runtime system using several embedded benchmarks. The experiments demonstrate promising performance that is competitive with the platform's native approach.
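To make the "single OpenMP code base" idea concrete, the following is a minimal sketch of the kind of application source a programmer would maintain. It is not taken from the paper: the `dot_product` kernel and its parameter names are hypothetical, chosen only for illustration. The point is that the source contains no threading or vendor-specific calls; the `#pragma omp` directive is lowered by an OpenMP compiler into calls to whatever runtime library sits underneath, whether that runtime is built on POSIX threads or, as the abstract proposes, on the MCA APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not from the paper): dot product of two vectors.
 * The application code only states WHAT is parallel; how threads are
 * created and synchronized is left to the OpenMP runtime underneath. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    long i; /* signed index, accepted by older OpenMP versions as well */

    assert(a != NULL && b != NULL);

    /* Each thread accumulates a private partial sum; the reduction
     * clause combines the partial sums when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With GCC, for example, this compiles as parallel code with `-fopenmp`; without that flag the pragma is ignored and the same source runs sequentially with the same result, which is part of what makes a directive-based code base easy to carry between toolchains.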



    • Published in

      ACM SIGPLAN Notices, Volume 48, Issue 5
      LCTES '13
      May 2013
      165 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/2499369
      • LCTES '13: Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
        June 2013
        184 pages
        ISBN:9781450320856
        DOI:10.1145/2491899

      Copyright © 2013 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
