ABSTRACT
Calling context enhances program understanding and dynamic analyses by providing a rich representation of program location. Compared to imperative programs, object-oriented programs use more interprocedural and less intraprocedural control flow, increasing the importance of context sensitivity for analysis. However, prior online methods for computing calling context, such as stack-walking or maintaining the current location in a calling context tree, are expensive in time and space. This paper introduces a new online approach called probabilistic calling context (PCC) that continuously maintains a probabilistically unique value representing the current calling context. For millions of unique contexts, a 32-bit PCC value has few conflicts. Computing the PCC value adds 3% average overhead to a Java virtual machine. PCC is well-suited to clients that detect new or anomalous behavior since PCC values from training and production runs can be compared easily to detect new context-sensitive behavior; clients that query the PCC value at every system call, Java utility call, and Java API call add 0-9% overhead on average. PCC adds space overhead proportional to the distinct contexts stored by the client (one word per context). Our results indicate PCC is efficient and accurate enough to use in deployed software for residual testing, bug detection, and intrusion detection.
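The idea of maintaining one probabilistically unique value per calling context can be illustrated with a minimal sketch. The abstract does not state the update function, so the form used here (multiply the caller's value by 3, add a call-site identifier, keep 32 bits) is an assumption for illustration, as are all names (`pcc_update`, `pcc_of`, `NewContextDetector`, `expected_conflicting_pairs`) and the use of small integers as call-site identifiers. The conflict estimate is a standard birthday-style calculation that assumes values are spread uniformly over the 32-bit space; it is not a result taken from the paper.

```python
MASK32 = 0xFFFFFFFF  # keep values to 32 bits, like a Java int that wraps around


def pcc_update(value, call_site_id):
    """Fold one call site into the caller's value (hypothetical update function)."""
    return (3 * value + call_site_id) & MASK32


def pcc_of(call_sites):
    """Value for a context given as a sequence of call-site ids, outermost first.

    A real implementation would compute this incrementally at each call and
    leave the caller's value untouched, so nothing needs undoing on return.
    """
    value = 0
    for call_site_id in call_sites:
        value = pcc_update(value, call_site_id)
    return value


class NewContextDetector:
    """Client sketch: store one word per distinct context seen in training."""

    def __init__(self):
        self.seen = set()

    def train(self, call_sites):
        self.seen.add(pcc_of(call_sites))

    def is_new(self, call_sites):
        # True when the production context's value was never seen in training.
        return pcc_of(call_sites) not in self.seen


def expected_conflicting_pairs(n_contexts, bits=32):
    """Expected number of context pairs sharing a value, assuming uniform values."""
    return n_contexts * (n_contexts - 1) / 2 / 2**bits


if __name__ == "__main__":
    detector = NewContextDetector()
    detector.train([10, 20, 30])
    print(detector.is_new([10, 20, 30]))  # same context as training
    print(detector.is_new([10, 30, 20]))  # same call sites, different order
    print(expected_conflicting_pairs(1_000_000))
```

Because the update is not commutative, the same call sites in a different order yield a different value, which is what lets a client distinguish contexts rather than just sets of methods. Under the uniformity assumption, one million distinct contexts give roughly 116 expected conflicting pairs out of about 5 × 10^11 possible pairs, which is consistent in spirit with the claim that a 32-bit value has few conflicts.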
Probabilistic calling context. Proceedings of the 2007 OOPSLA conference.