Mining hot calling contexts in small space

Published: 04 June 2011

Abstract

Calling context trees (CCTs) associate performance metrics with paths through a program's call graph, providing valuable information for program understanding and performance analysis. Although CCTs are typically much smaller than call trees, in real applications they might easily consist of tens of millions of distinct calling contexts: this sheer size makes them difficult to analyze and might hurt execution times due to poor access locality. For performance analysis, accurately collecting information about hot calling contexts may be more useful than constructing an entire CCT that includes millions of uninteresting paths. As we show for a variety of prominent Linux applications, the distribution of calling context frequencies is typically very skewed. In this paper we show how to exploit this property to reduce the CCT size considerably.
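To make the notion of a CCT concrete, the following is a minimal sketch that builds one from a trace of call and return events. The names `CCTNode`, `build_cct`, and `count_nodes` and the event encoding are assumptions made for illustration; they are not taken from the paper. Each node represents one calling context (a path from the root), and its counter is the only metric tracked here.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One calling context: the path of routines from the root to this node."""
    routine: str
    count: int = 0  # number of times this calling context was entered
    children: dict = field(default_factory=dict)


def build_cct(events):
    """Build a CCT from ("call", name) and ("return", None) events."""
    root = CCTNode("<root>")
    stack = [root]  # mirrors the program's call stack
    for kind, name in events:
        if kind == "call":
            # Reuse the child if this context was seen before; this reuse
            # is what makes a CCT smaller than a full call tree.
            node = stack[-1].children.setdefault(name, CCTNode(name))
            node.count += 1
            stack.append(node)
        elif kind == "return":
            stack.pop()
    return root


def count_nodes(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children.values())
```

In this sketch, two activations of `f` from `main` share a single node whose counter is 2, whereas a call tree would store them as two separate nodes.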

We introduce a novel run-time data structure, called Hot Calling Context Tree (HCCT), that offers an additional intermediate point in the spectrum of data structures for representing interprocedural control flow. The HCCT is a subtree of the CCT that includes only hot nodes and their ancestors. We show how to compute the HCCT without storing the exact frequency of all calling contexts, by using fast and space-efficient algorithms for mining frequent items in data streams. With this approach, we can distinguish between hot and cold contexts on the fly, while obtaining very accurate frequency counts. We show both theoretically and experimentally that the HCCT achieves precision similar to that of the CCT in much less space, roughly proportional to the number of distinct hot contexts: this is typically several orders of magnitude smaller than the total number of calling contexts encountered during a program's execution. Our space-efficient approach can be effectively combined with previous context-sensitive profiling techniques, such as sampling and bursting.
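The idea of separating hot from cold contexts with a bounded number of counters can be sketched with Space-Saving, a well-known frequent-items streaming algorithm. This is an illustrative sketch under stated assumptions, not the paper's implementation: contexts are represented as tuples of routine names, the function names and the parameters `phi` (hotness threshold) and `eps` (error bound) are hypothetical, and a real profiler would presumably keep its counters in tree nodes rather than in a dictionary keyed by full paths.

```python
import math


def space_saving(stream, k):
    """Space-Saving with at most k counters.

    Returns (counts, n). Each estimate in `counts` is at least the true
    frequency and exceeds it by at most n / k.
    """
    counts = {}
    n = 0
    for item in stream:
        n += 1
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
        else:
            # Evict an item with the smallest estimate; the newcomer
            # inherits that estimate plus one.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts, n


def hot_context_tree(stream, phi, eps):
    """Return (hot contexts, node set of the approximate HCCT).

    A context is reported hot if its estimate is at least phi * n.
    With eps < phi, no truly hot context is missed; a reported context
    has true frequency of at least (phi - eps) * n.
    """
    k = math.ceil(1 / eps)
    counts, n = space_saving(stream, k)
    hot = {ctx for ctx, c in counts.items() if c >= phi * n}
    # The HCCT keeps hot nodes and all their ancestors, i.e. every
    # non-empty prefix of a hot context.
    nodes = set()
    for ctx in hot:
        for depth in range(1, len(ctx) + 1):
            nodes.add(ctx[:depth])
    return hot, nodes
```

The memory used by the sketch depends on `1 / eps` rather than on the number of distinct contexts in the stream, which is the property the abstract relies on.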
