research-article

Finding reusable data structures

Published: 19 October 2012

Abstract

A big source of run-time performance problems in large-scale, object-oriented applications is the frequent creation of data structures (by the same allocation site) whose lifetimes are disjoint, and whose shapes and data content are always the same. Constructing these data structures and computing the same data values many times is expensive; significant performance improvements can be achieved by reusing their instances, shapes, and/or data values rather than reconstructing them. This paper presents a run-time technique that can be used to help programmers find allocation sites that create such data structures to improve performance. At the heart of the technique are three reusability definitions and novel summarization approaches that compute summaries for data structures based on these definitions. The computed summaries are used subsequently to find data structures that have disjoint lifetimes, and/or that have the same shapes and content. We have implemented this technique in the Jikes RVM and performed extensive studies on large-scale, real-world programs. We describe our experience using six case studies, in which we have achieved large performance gains by fixing problems reported by our tool.
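As a rough illustration of the kind of allocation site the abstract describes, the sketch below builds an identical lookup table on every call and then shows the reuse fix. It is a minimal example under stated assumptions, not code from the paper or its tool: the class `ReuseSketch`, the Roman-numeral table, and the `tablesBuilt` counter are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: none of these names come from the paper.
public class ReuseSketch {
    // Counts table constructions so the difference between the two
    // variants is observable. Illustration only.
    public static int tablesBuilt = 0;

    // The allocation site of interest: every table it creates has the
    // same shape (five entries) and the same content.
    public static Map<Character, Integer> buildRomanTable() {
        tablesBuilt++;
        Map<Character, Integer> table = new HashMap<>();
        table.put('I', 1);
        table.put('V', 5);
        table.put('X', 10);
        table.put('L', 50);
        table.put('C', 100);
        return table;
    }

    // One instance built once; wrapped as unmodifiable because sharing
    // is only safe if no caller mutates the structure.
    private static final Map<Character, Integer> SHARED_TABLE =
            Collections.unmodifiableMap(buildRomanTable());

    // Before: each call allocates a fresh table that becomes garbage on
    // return, so the lifetimes of successive tables are disjoint.
    public static int parseRebuilding(String numeral) {
        return parse(numeral, buildRomanTable());
    }

    // After: all calls reuse the single shared instance.
    public static int parseReusing(String numeral) {
        return parse(numeral, SHARED_TABLE);
    }

    // Subtractive Roman-numeral parsing; assumes the input contains
    // only the letters I, V, X, L, and C.
    public static int parse(String numeral, Map<Character, Integer> table) {
        int total = 0;
        for (int i = 0; i < numeral.length(); i++) {
            int value = table.get(numeral.charAt(i));
            boolean smallerThanNext = i + 1 < numeral.length()
                    && value < table.get(numeral.charAt(i + 1));
            total += smallerThanNext ? -value : value;
        }
        return total;
    }

    public static void main(String[] args) {
        int before = tablesBuilt;
        for (int i = 0; i < 1000; i++) {
            parseRebuilding("XIV");
        }
        System.out.println("tables built while rebuilding: " + (tablesBuilt - before));

        before = tablesBuilt;
        for (int i = 0; i < 1000; i++) {
            parseReusing("XIV");
        }
        System.out.println("tables built while reusing: " + (tablesBuilt - before));
    }
}
```

Both variants return the same results; only the number of table constructions differs. Reuse of this kind is safe here because the table is never modified after construction, which is why the shared instance is wrapped as unmodifiable.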


Published in

ACM SIGPLAN Notices, Volume 47, Issue 10
OOPSLA '12
October 2012
1011 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2398857

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
October 2012
1052 pages
ISBN: 9781450315616
DOI: 10.1145/2384616

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
