DOI: 10.1145/1806596.1806617
research-article

Finding low-utility data structures

Published: 05 June 2010

ABSTRACT

Compilers miss many opportunities for easy, big-win program optimizations. This is especially true in highly layered Java applications. At the heart of these missed opportunities often lie computations that, at great expense, produce data values that have little impact on the program's final output. Constructing a new date formatter to format every date, or populating a large set with expensively constructed structures only to check its size: both involve costs that are out of line with the benefits gained. This disparity between the formation costs and accrued benefits of data structures is at the heart of much runtime bloat.
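The two patterns named above can be made concrete with a small Java sketch. This is an illustration only and is not taken from the paper: the class `LowUtilityPatterns`, its method names, the `"yyyy-MM-dd"` pattern, and the `describe` helper are all hypothetical. Each "costly" method returns the same result as its cheaper counterpart, which is what makes the extra construction work low-utility.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the two patterns from the abstract.
final class LowUtilityPatterns {

    // Pattern 1 (costly): a new formatter is constructed for every date.
    public static List<String> formatEach(List<Date> dates) {
        List<String> out = new ArrayList<>();
        for (Date d : dates) {
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
            out.add(fmt.format(d));
        }
        return out;
    }

    // Pattern 1 (cheaper): one formatter is constructed and reused.
    public static List<String> formatShared(List<Date> dates) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        List<String> out = new ArrayList<>();
        for (Date d : dates) {
            out.add(fmt.format(d));
        }
        return out;
    }

    // Stand-in for an expensively constructed structure (assumption).
    private static String describe(String key) {
        return "record[" + key + "]";
    }

    // Pattern 2 (costly): build a structure per key, then only ask for the size.
    public static int distinctCostly(List<String> keys) {
        Set<String> records = new HashSet<>();
        for (String k : keys) {
            records.add(describe(k));
        }
        return records.size();
    }

    // Pattern 2 (cheaper): the size depends only on the keys themselves.
    public static int distinctCheap(List<String> keys) {
        return new HashSet<>(keys).size();
    }
}
```

In both pairs the final output is identical, so the work spent in the costly variant buys no additional benefit.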

We introduce a run-time analysis to discover these low-utility data structures. The analysis employs dynamic thin slicing, which naturally associates costs with value flows rather than raw data flows. It constructs a model of the incremental, hop-to-hop costs and benefits of each data structure. The analysis then identifies suspicious structures based on imbalances between their incremental costs and benefits. To decrease the memory requirements of slicing, we introduce abstract dynamic thin slicing, which performs thin slicing over bounded abstract domains. We have modified the IBM J9 commercial JVM to implement this approach.
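To give a feel for what a cost-benefit imbalance might look like, here is a toy Java sketch. It is not the paper's algorithm: it performs no thin slicing and no abstraction, and the class `CostBenefitSketch`, its methods, and the definitions used here are assumptions made for illustration. Cost is taken as the execution count of a node plus that of everything it transitively depends on; benefit is the execution count of everything that transitively consumes it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy value-flow graph; all names and definitions are hypothetical.
final class CostBenefitSketch {
    private final Map<String, Long> freq = new HashMap<>();
    private final Map<String, List<String>> producers = new HashMap<>();
    private final Map<String, List<String>> consumers = new HashMap<>();

    // Register a node together with the number of times it executed.
    public void node(String id, long executions) {
        freq.put(id, executions);
    }

    // Record that a value produced at 'from' flows into 'to'.
    public void flow(String from, String to) {
        consumers.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        producers.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Sum the execution counts of all nodes reachable from 'start', excluding 'start'.
    private long reach(String start, Map<String, List<String>> edges) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        seen.add(start);
        work.push(start);
        long total = 0;
        while (!work.isEmpty()) {
            String n = work.pop();
            if (!n.equals(start)) {
                total += freq.getOrDefault(n, 0L);
            }
            for (String m : edges.getOrDefault(n, List.of())) {
                if (seen.add(m)) {
                    work.push(m);
                }
            }
        }
        return total;
    }

    // Assumed cost: work done at this node plus everything feeding into it.
    public long cost(String id) {
        return freq.getOrDefault(id, 0L) + reach(id, producers);
    }

    // Assumed benefit: work that consumes this node's value downstream.
    public long benefit(String id) {
        return reach(id, consumers);
    }

    // Flag a node whose cost exceeds 'ratio' times its benefit.
    public boolean suspicious(String id, double ratio) {
        return cost(id) > ratio * benefit(id);
    }
}
```

For a chain such as `parse -> build -> set.add -> set.size`, where the first three nodes execute 1000 times each and `set.size` executes once, `set.add` has a cost of 3000 and a benefit of 1 under these definitions, so it would be flagged at any reasonable ratio.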

We demonstrate two client analyses: one that finds objects that are expensive to construct but are not necessary for the forward execution, and a second that pinpoints ultimately-dead values. We have successfully applied them to large-scale and long-running Java applications. We show that these analyses are effective at detecting operations that have unbalanced costs and benefits.
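As a rough illustration of what an ultimately-dead value might look like in source code, consider the following Java sketch. The class `UltimatelyDeadSketch`, the `LineItem` type, and the `debugLabel` field are hypothetical and are not drawn from the paper or its benchmarks. The label is computed on every construction but never read, so it cannot affect the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of a value that is produced but never used.
final class UltimatelyDeadSketch {

    static final class LineItem {
        final long cents;
        // Written on every construction, never read: an ultimately-dead value.
        final String debugLabel;

        LineItem(long cents, int index) {
            this.cents = cents;
            this.debugLabel = "item#" + index + " (" + cents + " cents)";
        }
    }

    // Builds one LineItem (and one dead label) per price, then sums the prices.
    public static long total(long[] prices) {
        List<LineItem> items = new ArrayList<>();
        for (int i = 0; i < prices.length; i++) {
            items.add(new LineItem(prices[i], i));
        }
        long sum = 0;
        for (LineItem item : items) {
            sum += item.cents;
        }
        return sum;
    }

    // Same result without the intermediate objects or the dead labels.
    public static long totalLean(long[] prices) {
        long sum = 0;
        for (long p : prices) {
            sum += p;
        }
        return sum;
    }
}
```

Both methods return the same sum; the string concatenation in the constructor is pure cost with no downstream benefit.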


Published in

PLDI '10: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2010
514 pages
ISBN: 9781450300193
DOI: 10.1145/1806596

ACM SIGPLAN Notices, Volume 45, Issue 6
PLDI '10
June 2010
496 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1809028

Copyright © 2010 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 406 of 2,067 submissions, 20%
