Abstract
The CSI framework provides comprehensive static instrumentation that a compiler can insert into a program-under-test so that dynamic-analysis tools (memory checkers, race detectors, cache simulators, performance profilers, code-coverage analyzers, and the like) can observe and investigate runtime behavior. Previously, each tool based on compiler instrumentation modified the compiler separately to insert its own instrumentation. In contrast, CSI inserts a standard collection of instrumentation hooks into the program-under-test. Each CSI-tool is implemented as a library that defines the hooks relevant to it; the remaining hooks are "nulled" out and elided during either compile-time or link-time optimization, resulting in instrumented runtimes on par with custom instrumentation. CSI allows many compiler-based tools to be written as simple libraries without modifying the compiler, lowering the bar for the development of dynamic-analysis tools.
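As a rough illustration of the "tool as a library" idea, the C sketch below defines one hook that does work and one hook with an empty body. The names `tool_func_entry`, `tool_before_load`, and `obj_id_t` are hypothetical and invented for this sketch; they are not the CSI API. The function `instrumented_sum` stands in for code into which a compiler would have inserted the hook calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ID type and hook names, invented for this sketch.
 * They are NOT the actual CSI API. */
typedef int64_t obj_id_t;

static uint64_t func_entries = 0;

/* A hook this sketch tool cares about: count function entries. */
void tool_func_entry(obj_id_t func_id) {
    (void)func_id;
    ++func_entries;
}

/* A "null" hook: the body is empty, so an optimizer that can see
 * this definition (for example during link-time optimization) is
 * free to remove every call to it. */
void tool_before_load(obj_id_t load_id, const void *addr) {
    (void)load_id;
    (void)addr;
}

uint64_t tool_func_entry_count(void) {
    return func_entries;
}

/* Stand-in for an instrumented function: the hook calls are written
 * by hand here, where a compiler pass would normally insert them. */
int instrumented_sum(const int *a, int n) {
    tool_func_entry(0);              /* function ID 0 */
    int s = 0;
    for (int i = 0; i < n; ++i) {
        tool_before_load(0, &a[i]);  /* one static load, so one load ID */
        s += a[i];
    }
    return s;
}
```

Calling `instrumented_sum` once on a three-element array leaves the entry counter at 1, while the empty load hook contributes nothing beyond a call that the optimizer may delete.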
We have defined a standard API for CSI and modified LLVM to insert CSI hooks into the compiler's internal representation (IR) of the program. The API organizes IR objects (such as functions, basic blocks, and memory accesses) into flat and compact ID spaces, which not only simplifies the building of tools but also, surprisingly, enables faster maintenance of IR-object data than traditional hash tables do. CSI hooks contain a "property" parameter that allows tools to customize behavior based on static information without introducing overhead. CSI provides "forensic" tables that tools can use to associate IR objects with source-code locations and to relate IR objects to each other.
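A minimal sketch of how a tool might exploit a flat, compact ID space follows: because IDs are dense integers, per-object data can live in a plain array indexed by ID instead of a hash table keyed by address. Every name here (`tool_init`, `load_prop_t`, its `is_stack_local` field, `source_loc_t`, and the hard-coded location table) is an assumption made for illustration and is not taken from the CSI API; in particular, a real forensic table would be supplied by the framework rather than written by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t obj_id_t;

/* Hypothetical property: a static fact about a load, known at compile time. */
typedef struct { bool is_stack_local; } load_prop_t;

/* Hypothetical forensic-table entry: maps a load ID to a source location. */
typedef struct { const char *file; int32_t line; } source_loc_t;

static uint64_t *load_counts = NULL;  /* one slot per load ID */
static int64_t num_loads = 0;

/* Made-up table for this sketch only. */
static const source_loc_t load_locs[] = {
    { "example.c", 10 },
    { "example.c", 12 },
};

/* Allocate per-load data for IDs in the dense range [0, n). */
int tool_init(int64_t n) {
    if (n <= 0) return -1;
    load_counts = calloc((size_t)n, sizeof *load_counts);
    if (load_counts == NULL) return -1;
    num_loads = n;
    return 0;
}

/* The property lets the tool skip uninteresting loads with a cheap test;
 * everything else is a direct array update, with no hashing. */
void tool_before_load(obj_id_t id, load_prop_t prop) {
    if (prop.is_stack_local) return;
    assert(id >= 0 && id < num_loads);
    ++load_counts[id];
}

uint64_t tool_load_count(obj_id_t id) {
    assert(id >= 0 && id < num_loads);
    return load_counts[id];
}

/* Look up the source location of a load, or NULL if the ID is unknown. */
const source_loc_t *tool_load_source_loc(obj_id_t id) {
    int64_t n = (int64_t)(sizeof load_locs / sizeof load_locs[0]);
    return (id >= 0 && id < n) ? &load_locs[id] : NULL;
}

void tool_cleanup(void) {
    free(load_counts);
    load_counts = NULL;
    num_loads = 0;
}
```

When the property argument is a compile-time constant at each call site, an optimizer that inlines the hook can resolve the `is_stack_local` test statically, which is one plausible reading of how static information can be used without runtime cost.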
To evaluate the efficacy of CSI, we implemented six demonstration CSI-tools. One of our studies shows that compiling with CSI and linking with the "null" CSI-tool produces a tool-instrumented executable that is as fast as the original uninstrumented code. Another study, using a CSI port of Google's ThreadSanitizer, shows that the CSI-tool rivals the performance of Google's custom compiler-based implementation. All other demonstration CSI-tools slow down the execution of the program-under-test by less than 70%.