DOI: 10.1145/2446920.2446924
research-article

Fast dynamic binary rewriting to support thread migration in shared-ISA asymmetric multicores

Published: 24 February 2013

ABSTRACT

Asymmetric multicore processors have demonstrated strong potential for improving performance and energy efficiency. Shared-ISA asymmetric multicore processors overcome the programmability problems of disjoint-ISA systems and enhance single-ISA architectures with instruction-based asymmetry. In such a design, all cores share a common baseline ISA, and performance-enhanced (PE) cores extend it with instructions that accelerate performance-critical operations. To exploit this asymmetry, the scheduler should be able to migrate threads based on their acceleration potential.

The contribution of this paper is a low-overhead binary code rewriting method for shared-ISA multicore processors that transforms a binary executable at runtime according to the PE capabilities of the processor it is scheduled on. The mutable binary code can be retargeted among heterogeneous cores at any point in execution while preserving functional equivalence and transparently using PE instructions when they are available, thus enabling migration among heterogeneous cores. We emulate a realistic shared-ISA asymmetric multicore system on actual hardware, an experimental FPGA prototype. Experimental analysis shows that dynamic binary rewriting is feasible with little overhead. Rewritten code successfully speeds up baseline code and performs close, at 70% average efficiency, to non-portable, compiler-generated code that is statically optimized to use PE instructions.
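
The retargeting idea described above can be illustrated with a toy model. The sketch below is not the paper's rewriter: the instruction tuples, the opcode names, and the fused `mac` (multiply-accumulate) PE instruction are all hypothetical, chosen only to show a rewrite toward PE code, the reverse rewrite toward baseline code, and a check that both forms compute the same result.

```python
# Toy model of retargeting code between a baseline core and a PE core.
# Assumption: every opcode here, including the PE-only "mac", is invented
# for illustration and is not the instruction set used in the paper.

NOP = ("nop",)

def to_pe(code):
    """Fuse 'mul d,a,b ; add d,d,c' into 'mac d,a,b,c ; nop'.
    The nop keeps the code length unchanged."""
    out, i = [], 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (cur[0] == "mul" and nxt is not None and nxt[0] == "add"
                and nxt[1] == cur[1] and nxt[2] == cur[1]
                and nxt[3] != cur[1]):  # c must not alias d, or fusing is unsafe
            out += [("mac", cur[1], cur[2], cur[3], nxt[3]), NOP]
            i += 2
        else:
            out.append(cur)
            i += 1
    return out

def to_baseline(code):
    """Expand each 'mac' back into 'mul ; add', reusing the nop slot if present."""
    out, i = [], 0
    while i < len(code):
        ins = code[i]
        if ins[0] == "mac":
            _, d, a, b, c = ins
            if c == d:
                raise ValueError("cannot expand mac whose addend aliases its destination")
            out += [("mul", d, a, b), ("add", d, d, c)]
            if i + 1 < len(code) and code[i + 1] == NOP:
                i += 1  # the padding slot is consumed by the second instruction
        else:
            out.append(ins)
        i += 1
    return out

def run(code, regs, has_pe):
    """Interpret straight-line code; a baseline core rejects PE instructions."""
    regs = dict(regs)
    for ins in code:
        op = ins[0]
        if op == "mul":
            regs[ins[1]] = regs[ins[2]] * regs[ins[3]]
        elif op == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "mac":
            if not has_pe:
                raise RuntimeError("illegal instruction on baseline core")
            regs[ins[1]] = regs[ins[2]] * regs[ins[3]] + regs[ins[4]]
        elif op != "nop":
            raise ValueError(f"unknown opcode {op!r}")
    return regs

baseline = [("mul", "r0", "r1", "r2"),
            ("add", "r0", "r0", "r3"),
            ("add", "r4", "r0", "r1")]
regs = {"r0": 0, "r1": 3, "r2": 4, "r3": 5, "r4": 0}

pe_code = to_pe(baseline)                    # form used when scheduled on a PE core
same = run(pe_code, regs, has_pe=True) == run(baseline, regs, has_pe=False)
restored = to_baseline(pe_code)              # form used after migrating back
```

In this sketch a migration to a PE core would call `to_pe` and a migration back would call `to_baseline`. Padding with a `nop` is one simple way to keep instruction offsets stable in the toy model; whether the paper's method uses padding is not stated in the abstract.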

Published in

COSMIC '13: Proceedings of the First International Workshop on Code OptimiSation for MultI and many Cores
February 2013
34 pages
ISBN: 9781450319713
DOI: 10.1145/2446920

Copyright © 2013 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
