
DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support

Published: 03 March 2012

Abstract

Dynamic Binary Translation (DBT) and Dynamic Binary Optimization (DBO) in software are widely used for several reasons, including performance, design simplification, and virtualization. However, the software layer in such systems introduces non-negligible overheads that affect performance and user experience. Hence, reducing DBT/DBO overheads is of paramount importance. In addition, reduced overheads have useful side effects on the rest of the software layer, such as allowing optimizations to be applied earlier. A cost-effective solution to this problem is to provide hardware support that speeds up the primitives of the software layer, with particular attention to automating DBT/DBO mechanisms while leaving the heuristics to the software, which is more flexible. In this work, we characterize the overheads of a DBO system built on DynamoRIO that implements several basic optimizations. We find that computing the Data Dependence Graph (DDG) accounts for 5%-10% of the execution time. For this reason, we propose adding hardware support for this task in the form of a new functional unit, called DDGacc, which is integrated into a conventional pipelined processor and operated through new ISA instructions. Our evaluation shows that DDGacc reduces the cost of computing the DDG by 32x, which reduces overall execution time by 5%-10% on average and by up to 18% for applications where the DBO optimizes large code footprints.
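To make the accelerated task concrete, the following is a minimal sketch of how a software layer might build a register-level DDG for a straight-line code region. It is an illustration only, not the paper's algorithm, the DDGacc hardware design, or any DynamoRIO API. The `Instr` type, the `build_ddg` function, and the register names are hypothetical, and memory dependences are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: text plus registers written and read."""
    text: str
    defs: frozenset  # registers written by the instruction
    uses: frozenset  # registers read by the instruction


def build_ddg(block):
    """Return a set of (earlier_index, later_index, kind) dependence edges.

    kind is "RAW" (true dependence), "WAR" (anti-dependence),
    or "WAW" (output dependence). Only register dependences are tracked.
    """
    edges = set()
    last_def = {}  # register -> index of its most recent writer
    readers = {}   # register -> indices that read it since that write

    for i, ins in enumerate(block):
        # A read depends on the most recent write of the same register.
        for r in ins.uses:
            if r in last_def:
                edges.add((last_def[r], i, "RAW"))
        # A write must stay after the previous write and after earlier reads.
        for r in ins.defs:
            if r in last_def:
                edges.add((last_def[r], i, "WAW"))
            for j in readers.get(r, ()):
                edges.add((j, i, "WAR"))
        # Update the tracking tables only after all edges for i are emitted.
        for r in ins.uses:
            readers.setdefault(r, []).append(i)
        for r in ins.defs:
            last_def[r] = i
            readers[r] = []
    return edges


if __name__ == "__main__":
    example = [
        Instr("r1 = load [r0]", frozenset({"r1"}), frozenset({"r0"})),
        Instr("r2 = r1 + r3", frozenset({"r2"}), frozenset({"r1", "r3"})),
        Instr("r3 = r2 * 2", frozenset({"r3"}), frozenset({"r2"})),
        Instr("r1 = r4", frozenset({"r1"}), frozenset({"r4"})),
    ]
    for edge in sorted(build_ddg(example)):
        print(edge)
```

In this sketch, each instruction costs a few table lookups and updates per register operand. A dynamic optimizer repeats this work for every code region it optimizes, which is the kind of repetitive primitive the abstract proposes moving into hardware while the optimization heuristics stay in software.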

