ABSTRACT
Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing platforms, previous studies have proposed hardware designs to accelerate lifeguard processing. However, these accelerators are either tailored to a specific class of lifeguards or suitable only for monitoring singlethreaded programs.
We present ParaLog, the first design of a system enabling fast online parallel monitoring of multithreaded parallel applications. ParaLog supports a broad class of software-defined lifeguards. We show how three existing accelerators can be enhanced to support online multithreaded monitoring, dramatically reducing lifeguard overheads. We identify and solve several challenges in monitoring parallel applications and/or parallelizing these accelerators, including (i) enforcing inter-thread data dependences, (ii) dealing with inter-thread effects that are not reflected in coherence traffic, (iii) dealing with unmonitored operating system activity, and (iv) ensuring lifeguards can access shared metadata with negligible synchronization overheads. We present our system design for both Sequentially Consistent and Total Store Ordering processors. We implement and evaluate our design on a 16 core simulated CMP, using benchmarks from SPLASH-2 and PARSEC and two lifeguards: a data-flow tracking lifeguard and a memory-access checker lifeguard. Our results show that (i) our parallel accelerators improve performance by 2-9X and 1.13-3.4X for our two lifeguards, respectively, (ii) we are 5-126X faster than the time-slicing approach required by existing techniques, and (iii) our average overheads for applications with eight threads are 51% and 28% for the two lifeguards, respectively.
- C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. In PACT, 2008. Google Scholar
Digital Library
- D. Bruening. Efficient, Transparent, and Comprehensive Runtime Code Manipulation. PhD thesis, MIT, 2004. Google Scholar
Digital Library
- W. R. Bush, J. D. Pincus, and D. J. Sielaff. A static analyzer for finding dynamic programming errors. Software -- Practice and Experience, 30(7), 2000. Google Scholar
Digital Library
- S. Chen, B. Falsafi, P. B. Gibbons, M. Kozuch, T. C. Mowry, R. Teodorescu, A. Ailamaki, L. Fix, G. R. Ganger, B. Lin, and S. W. Schlosser. Log-based architectures for general-purpose monitoring of deployed code. In ASID Workshop at ASPLOS, 2006. Google Scholar
Digital Library
- S. Chen, M. Kozuch, P. B. Gibbons, M. Ryan, T. Strigkos, T. C. Mowry, O. Ruwase, E. Vlachos, B. Falsafi, and V. Ramachandran. Flexible hardware acceleration for instruction-grain lifeguards. IEEE Micro, 29(1):62--72, 2009. Top Picks from the 2008 Computer Architecture Conferences. Google Scholar
Digital Library
- S. Chen, M. Kozuch, T. Strigkos, B. Falsafi, P. B. Gibbons, T. C. Mowry, V. Ramachandran, O. Ruwase, M. Ryan, and E. Vlachos. Flexible hardware acceleration for instruction-grain program monitoring. In ISCA, 2008. Google Scholar
Digital Library
- J. Chung, M. Dalton, H. Kannan, and C. Kozyrakis. Thread-safe dynamic binary translation using transactional memory. In HPCA, 2008.Google Scholar
- M. L. Corliss, E. C. Lewis, and A. Roth. DISE: A programmable macro engine for customizing applications. In ISCA, 2003. Google Scholar
Digital Library
- J. R. Crandall and F. T. Chong. Minos: Control data attack prevention orthogonal to memory model. In MICRO, 2004. Google Scholar
Digital Library
- M. Dalton, H. Kannan, and C. Kozyrakis. Raksha: A flexible information flow architecture for software security. In ISCA, 2007. Google Scholar
Digital Library
- D. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system--specific, programmer-written compiler extensions. In OSDI, 2000. Google Scholar
Digital Library
- M. D. Ernst, J. Cockrell,W. G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Trans. Software Engineering, 27(2), 2001. Google Scholar
Digital Library
- C. Flanagan, K. R. M. Leino, M. Lillibridge, G. Nelson, J. B. Saxe, and R. Stata. Extended static checking for Java. In PLDI, 2002. Google Scholar
Digital Library
- D. Geels, G. Altekar, S. Shenker, and I. Stoica. Replay debugging for distributed applications. In USENIX ATEC, 2006. Google Scholar
Digital Library
- C. Gniady, B. Falsafi, and T. N. Vijaykumar. Is SC + ILP = RC? In ISCA, 1999. Google Scholar
Digital Library
- M. L. Goodstein, E. Vlachos, S. Chen, P. B. Gibbons, M. Kozuch, and T. C. Mowry. Butterfly analysis: Adapting dataflow analysis to dynamic parallel monitoring. In ASPLOS, 2010. Google Scholar
Digital Library
- M. Herlihy and J. E. B. Moss. Transactional memory: architectural support for lock-free data structures. In HPCA, 1993. Google Scholar
Digital Library
- D. R. Hower and M. D. Hill. Rerun: Exploiting episodes for lightweight memory race recording. In ISCA, 2008. Google Scholar
Digital Library
- HP Labs. Cacti 5.1 Technical Report. http://www.hpl.hp.com/research/cacti/.Google Scholar
- H. Kannan. Ordering decoupled metadata accesses in multiprocessors. In MICRO, 2009. Google Scholar
Digital Library
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. In PLDI, 2005. Google Scholar
Digital Library
- P. Montesinos, L. Ceze, and J. Torrellas. DeLorean: Recording and deterministically replaying shared-memory multiprocessor execution efficiently. In ISCA, 2008. Google Scholar
Digital Library
- P. Montesinos, M. Hicks, S. T. King, and J. Torrellas. Capo: A software-hardware interface for practical deterministic multiprocessor replay. In ASPLOS, 2009. Google Scholar
Digital Library
- S. S. Mukherjee, B. Falsafi, M. D. Hill, and D. A. Wood. Coherent network interfaces for fine-grain communication. In ISCA, 1996. Google Scholar
Digital Library
- V. Nagarajan and R. Gupta. Architectural support for shadow memory in multiprocessors. In VEE, 2009. Google Scholar
Digital Library
- S. Narayanasamy, C. Pereira, and B. Calder. Recording shared memory dependencies using strata. In ASPLOS, 2006. Google Scholar
Digital Library
- S. Narayanasamy, G. Pokam, and B. Calder. BugNet: Continuously recording program execution for deterministic replay debugging. In ISCA, 2005. Google Scholar
Digital Library
- N. Nethercote. Dynamic Binary Analysis and Instrumentation. PhD thesis, U. Cambridge, 2004. http://valgrind.org.Google Scholar
- N. Nethercote and J. Seward. Valgrind: A program supervision framework. Electronic Notes in Theoretical Computer Science, 89(2), 2003.Google Scholar
- N. Nethercote and J. Seward. How to shadow every byte of memory used by a program. In VEE, 2007. Google Scholar
Digital Library
- N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In PLDI, 2007. Google Scholar
Digital Library
- J. Newsome and D. Song. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In NDSS, 2005.Google Scholar
- E. B. Nightingale, D. Peek, P. M. Chen, and J. Flinn. Parallelizing security checks on commodity hardware. In ASPLOS, 2008. Google Scholar
Digital Library
- G. Pokam, C. Pereira, K. Danne, R. Kassa, and A.-R. Adl-Tabatabai. Architecting a chunk--based memory race recorder in modern CMPs. In MICRO, 2009. Google Scholar
Digital Library
- F. Qin, C.Wang, Z. Li, H. Kim, Y. Zhou, and Y. Wu. LIFT: A low-overhead practical information flow tracking system for detecting security attacks. In MICRO, 2006. Google Scholar
Digital Library
- O. Ruwase, P. B. Gibbons, T. C. Mowry, V. Ramachandran, S. Chen, M. Kozuch, and M. Ryan. Parallelizing Dynamic Information Flow Tracking. In SPAA, 2008. Google Scholar
Digital Library
- S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A dynamic race detector for multi-threaded programs. ACM TOCS, 15(4), 1997. Google Scholar
Digital Library
- R. Shetty, M. Kharbutli, Y. Solihin, and M. Prvulovic. Heapmon: A helper-thread approach to programmable, automatic, and lowoverhead memory bug detection. IBM J. on Research and Development, 50(2/3), 2006. Google Scholar
Digital Library
- G. E. Suh, J. W. Lee, D. Zhang, and S. Devadas. Secure program execution via dynamic information flow tracking. In ASPLOS, 2004. Google Scholar
Digital Library
- G.-R. Uh, R. Cohn, B. Yadavalli, R. Peri, and R. Ayyagari. Analyzing dynamic binary instrumentation overhead. In WBIA Workshop at ASPLOS, 2006.Google Scholar
- G. Venkataramani, I. Doudalis, Y. Solihin, and M. Prvulovic. Flexi-Taint: A programmable accelerator for dynamic taint propagation. In HPCA, 2008.Google Scholar
Cross Ref
- G. Venkataramani, B. Roemer, Y. Solihin, and M. Prvulovic. Mem-Tracker: Efficient and programmable support for memory access monitoring and debugging. In HPCA, 2007. Google Scholar
Digital Library
- Virtutech Simics. http://www.virtutech.com/.Google Scholar
- S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: Characterization and methodological considerations. In ISCA, 1995. Google Scholar
Digital Library
- M. Xu, R. Bodik, and M. D. Hill. A 'Flight Data Recorder' for enabling full-system multiprocessor deterministic replay. In ISCA, 2003. Google Scholar
Digital Library
- M. Xu, R. Bodik, and M. D. Hill. A regulated transitive reduction (RTR) for longer memory race recording. In ASPLOS, 2006. Google Scholar
Digital Library
- P. Zhou, R. Teodorescu, and Y. Zhou. HARD: Hardware-assisted lockset-based race detection. In HPCA, 2007. Google Scholar
Digital Library
- Y. Zhou, P. Zhou, F. Qin,W. Liu, and J. Torrellas. Efficient and flexible architectural support for dynamic monitoring. ACM TACO, 2(1), 2005. Google Scholar
Digital Library
Index Terms
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
Recommendations
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
ASPLOS '10Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing ...
ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications
ASPLOS '10Instruction-grain lifeguards monitor the events of a running application at the level of individual instructions in order to identify and help mitigate application bugs and security exploits. Because such lifeguards impose a 10-100X slowdown on existing ...
Efficient Java exception handling in just-in-time compilation
Research ArticlesJava uses exceptions to provide elegant error handling capabilities during program execution. However, the presence of exception handlers complicates the job of the just-in-time (JIT) compiler, while exceptions are rarely used in most programs. This ...








Comments