Abstract
Concurrency errors in multithreaded programs are difficult to find and fix. We propose Aviso, a system for avoiding schedule-dependent failures. Aviso monitors events during a program's execution and, when a failure occurs, records a history of events from the failing execution. It uses this history to generate schedule constraints that perturb the order of events in the execution and thereby avoids schedules that lead to failures in future program executions. Aviso leverages scenarios where many instances of the same software run, using a statistical model of program behavior and experimentation to determine which constraints most effectively avoid failures. After implementing Aviso, we showed that it decreased failure rates for a variety of important desktop, server, and cloud applications by orders of magnitude, with an average overhead of less than 20% and, in some cases, as low as 5%.
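The abstract does not describe Aviso's statistical model, so the following is only a rough illustration of the general idea of choosing among candidate schedule constraints by experimentation across many runs. It is not Aviso's actual method. The names `ConstraintStats`, `record_run`, and `best_constraint`, the constraint labels, and the smoothed failure-rate estimate are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ConstraintStats:
    """Outcome counts for one candidate schedule constraint (hypothetical record)."""
    runs: int = 0
    failures: int = 0

    def failure_rate(self) -> float:
        # Laplace smoothing (an assumption of this sketch): a constraint with
        # few observed runs is not treated as perfectly safe or perfectly bad.
        return (self.failures + 1) / (self.runs + 2)


def record_run(stats: dict, constraint: str, failed: bool) -> None:
    """Record the outcome of one program run that used the given constraint."""
    entry = stats.setdefault(constraint, ConstraintStats())
    entry.runs += 1
    if failed:
        entry.failures += 1


def best_constraint(stats: dict) -> str:
    """Return the constraint with the lowest estimated failure rate so far."""
    return min(stats, key=lambda c: stats[c].failure_rate())


if __name__ == "__main__":
    # Outcomes pooled from many instances of the same program (made-up data).
    stats = {}
    for failed in [True, True, False, True]:
        record_run(stats, "delay_A_before_B", failed)
    for failed in [False, False, False, False]:
        record_run(stats, "delay_C_before_D", failed)
    print(best_constraint(stats))  # prints "delay_C_before_D"
```

Pooling outcomes from many instances is what makes such an estimate usable: a single instance rarely sees enough failures to compare constraints on its own.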
Cooperative empirical failure avoidance for multithreaded programs
ASPLOS '13: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems