Abstract
A major source of run-time performance problems in large-scale, object-oriented applications is the frequent creation, by the same allocation site, of data structures whose lifetimes are disjoint and whose shapes and data content are always the same. Constructing these data structures and computing the same data values many times is expensive; significant performance improvements can be achieved by reusing their instances, shapes, and/or data values rather than reconstructing them. This paper presents a run-time technique that helps programmers find the allocation sites that create such data structures, so that performance can be improved. At the heart of the technique are three reusability definitions and novel summarization approaches that compute summaries for data structures based on these definitions. The computed summaries are subsequently used to find data structures that have disjoint lifetimes and/or the same shapes and content. We have implemented this technique in the Jikes RVM and performed extensive studies on large-scale, real-world programs. We describe our experience through six case studies, in which we achieved large performance gains by fixing problems reported by our tool.
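As a rough illustration of the pattern described above, the following Java sketch shows one allocation site that repeatedly creates a list whose shape and content never change and whose lifetimes do not overlap, followed by a version that reuses a single instance. The class `ReuseSketch` and its methods are hypothetical names invented for this example; they are not taken from the paper, its tool, or its case studies.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: all names here are assumptions made for this sketch.
final class ReuseSketch {

    // Builds a small table of unit names. Every call yields a list with the
    // same shape and the same content.
    static List<String> buildUnits() {
        List<String> units = new ArrayList<>();
        units.add("B");
        units.add("KB");
        units.add("MB");
        units.add("GB");
        return units;
    }

    // Before: the call inside the loop is an allocation site that creates an
    // identical list per iteration. Each list becomes unreachable before the
    // next one is created, so their lifetimes are disjoint.
    static List<String> labelFresh(long[] sizes) {
        List<String> out = new ArrayList<>();
        for (long size : sizes) {
            List<String> units = buildUnits();
            out.add(label(size, units));
        }
        return out;
    }

    // After: the list is built once and the same instance is reused.
    static List<String> labelReused(long[] sizes) {
        List<String> units = buildUnits();
        List<String> out = new ArrayList<>();
        for (long size : sizes) {
            out.add(label(size, units));
        }
        return out;
    }

    // Scales a byte count down by factors of 1024 and appends the unit name.
    static String label(long size, List<String> units) {
        int i = 0;
        while (size >= 1024 && i < units.size() - 1) {
            size /= 1024;
            i++;
        }
        return size + " " + units.get(i);
    }

    public static void main(String[] args) {
        long[] sizes = {512L, 2048L, 5L * 1024 * 1024};
        System.out.println(labelFresh(sizes));
        System.out.println(labelReused(sizes));
    }
}
```

Both methods return the same labels; the second simply avoids rebuilding the table on every iteration. Hoisting is safe here only because the list is never mutated after construction and does not escape the loop, which is the kind of condition a programmer would need to confirm before applying such a fix.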
Finding reusable data structures
OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications