Abstract
Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-deterministic interleaving space. In reality, without the resources to thoroughly check the interleaving space, critical concurrency bugs can slip into production runs and cause failures in the field. Approaches to making the best use of the limited resources and exposing severe concurrency bugs before software release would be desirable.
Unlike previous work that focuses on bugs caused by specific interleavings (e.g., races and atomicity-violations), this paper targets concurrency bugs that result in one type of severe effects: program crashes. Our study of the error-propagation process of realworld concurrency bugs reveals a common pattern (50% in our non-deadlock concurrency bug set) that is highly correlated with program crashes. We call this pattern concurrency-memory bugs: buggy interleavings directly cause memory bugs (NULL-pointer-dereference, dangling-pointer, buffer-overflow, uninitialized-read) on shared memory objects.
Guided by this study, we built ConMem to monitor program execution, analyze memory accesses and synchronizations, and predicatively detect these common and severe concurrency-memory bugs. We also built a validator ConMem-v to automatically prune false positives by enforcing potential bug-triggering interleavings.
We evaluated ConMem using 7 open-source programs with 9 real-world severe concurrency bugs. ConMem detects more tested bugs (8 out of 9 bugs) than a lock-set-based race detector and an unserializable-interleaving detector that detect 4 and 5 bugs respectively, with a false positive rate about one tenth of the compared tools. ConMem-v further prunes out all the false positives. ConMem has reasonable overhead suitable for development usage.
- Apache Bugzilla. How important is the bug? http://issues.apache.org/bugwritinghelp.html.Google Scholar
- E. D. Berger, T. Yang, T. Liu, and G. Novark. Grace: safe multithreaded programming for c/c++. In OOPSLA, 2009. Google Scholar
Digital Library
- [email protected]. A bug's life cycle. https://bugzilla.mozilla.org/page.cgi?id=fields.html#severity.Google Scholar
- J. Burnim and K. Sen. Asserting and checking determinism for multithreaded programs. In FSE, 2009. Google Scholar
Digital Library
- C. Cadar, D. Dunbar, and D. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, 2008. Google Scholar
Digital Library
- F. Chen, T. F. Serbanuta, and G. Rosu. jpredictor: A predictive runtime analysis tool for java. In ICSE, 2008. Google Scholar
Digital Library
- R. Chugh, J. W. Voung, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using datarace detection. In PLDI, 2008. Google Scholar
Digital Library
- Coverity. Software quality and security analysis. http://www.coverity.com/.Google Scholar
- J. Devietti, B. Lucia, L. Ceze, and M. Oskin. Dmp: deterministic shared memory multiprocessing. In ASPLOS, 2009. Google Scholar
Digital Library
- M. Dimitrov and H. Zhou. Anomaly-based bug prediction, isolation, and validation: an automated approach for software debugging. In ASPLOS, 2009. Google Scholar
Digital Library
- O. Edelstein, E. Farchi, Y. Nir, G. Ratsaby, and S. Ur. Multi-threaded java program test generation. IBM Systems Journal, 2002. Google Scholar
Digital Library
- E. Farchi, Y. Nir, and S. Ur. Concurrent bug patterns and how to test them. In IPDPS, 2003. Google Scholar
Digital Library
- C. Flanagan and S. N. Freund. Atomizer: a dynamic atomicity checker for multithreaded programs. In POPL, 2004. Google Scholar
Digital Library
- C. Flanagan and S. N. Freund. Fasttrack: efficient and precise dynamic race detection. In PLDI, 2009. Google Scholar
Digital Library
- C. Flanagan, S. N. Freund, and J. Yi. Velodrome: a sound and complete dynamic atomicity checker for multithreaded programs. In PLDI, 2008. Google Scholar
Digital Library
- P. J. Guo and D. Engler. Linux kernel developer responses to static analysis bug reports. In USENIX, 2009. Google Scholar
Digital Library
- R. Hastings and B. Joyce. Purify: Fast detection of memory leaks and access errors. In Usenix Winter Technical Conference, 1992.Google Scholar
- R. W. M. Jones and P. H. J. Kelly. Backwards-compatible bounds checking for arrays and pointers in c programs. In Automated and Algorithmic Debugging, 1997.Google Scholar
- H. Jula, D. Tralamazza, C. Zamfir, and G. Candea. Deadlock immunity: Enabling systems to defend against deadlocks. In OSDI, 2008. Google Scholar
Digital Library
- L. Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558--565, July 1978. Google Scholar
Digital Library
- T. Li, C. Ellis, A. Lebeck, and D. Sorin. On-demand and semantic-free dynamic deadlock detection with speculative execution. In USENIX ATC, 2005.Google Scholar
- S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from mistakes -- a comprehensive study of real world concurrency bug characteristics. In ASPLOS, 2008. Google Scholar
Digital Library
- S. Lu, J. Tucek, F. Qin, and Y. Zhou. AVIO: detecting atomicity violations via access interleaving invariants. In ASPLOS, 2006. Google Scholar
Digital Library
- B. Lucia, J. Devietti, K. Strauss, and L. Ceze. Atom-aid: Detecting and surviving atomicity violations. In ISCA, 2008. Google Scholar
Digital Library
- C.-K. Luk, R. Cohn, R.Muth, H. Patil, A. Klauser, G. Lowney, S.Wallace, V. J. Reddi, and K. Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In PLDI, 2005. Google Scholar
Digital Library
- Mozilla Developers. Bug 123930 (deadlock). https://bugzilla.mozilla.org/show bug.cgi?id=123930. Let them eat races.Google Scholar
- M. Musuvathi, S. Qadeer, T. Ball, G. Basler, P. A. Nainar, and I. Neamtiu. Finding and reproducing heisenbugs in concurrent programs. In OSDI, 2008. Google Scholar
Digital Library
- S. Narayanasamy, C. Pereira, and B. Calder. Recording shared memory dependencies using strata. In ASPLOS, 2006. Google Scholar
Digital Library
- S. Narayanasamy, Z.Wang, J. Tigani, A. Edwards, and B. Calder. Automatically classifying benign and harmful data racesallusing replay analysis. In PLDI, 2007. Google Scholar
Digital Library
- N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In PLDI, 2007. Google Scholar
Digital Library
- R. H. B. Netzer and B. P. Miller. Improving the accuracy of data race detection. In PPoPP, 1991. Google Scholar
Digital Library
- M. Olszewski, J. Ansel, and S. P. Amarasinghe. Kendo: efficient deterministic multithreading in software. In ASPLOS, 2009. Google Scholar
Digital Library
- C.-S. Park and K. Sen. Randomized active atomicity violation detection in concurrent programs. In FSE, 2008. Google Scholar
Digital Library
- S. Park, S. Lu, and Y. Zhou. Ctrigger: Exposing atomicity violation bugs from their finding places. In ASPLOS, 2009. Google Scholar
Digital Library
- S. Park, Y. Zhou, W. Xiong, Z. Yin, R. Kaushik, K. H. Lee, and S. Lu. Pres: probabilistic replay with execution sketching onmultiprocessors. In SOSP, 2009. Google Scholar
Digital Library
- S. Qadeer and D. Wu. Kiss: keep it simple and sequential. In PLDI, 2004. Google Scholar
Digital Library
- C. J. Rossbach, O. S. Hofmann, and E. Witchel. Is transactional programming actually easier? In WDDD, 2009.Google Scholar
- O. Ruwase and M. Lam. Cred: A practical dynamic buffer overflow detector. In NDSS, 2004.Google Scholar
- C. Sadowski, S. N. Freund, and C. Flanagan. Singletrack: A dynamic determinism checker for multithreaded programs. In ESOP, 2009. Google Scholar
Digital Library
- S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A dynamic data race detector for multithreaded programs. ACM TOCS, 1997. Google Scholar
Digital Library
- SecurityFocus. Software bug contributed to blackout. http://www.securityfocus.com/news/8016.Google Scholar
- K. Sen. Race directed random testing of concurrent programs. In PLDI, 2008. Google Scholar
Digital Library
- K. Sen and G. Agha. Automated systematic testing of open distributed programs. In FSE, 2006. Google Scholar
Digital Library
- M. Sullivan and R. Chillarege. A comparison of software defects in database management systems and operating systems. In FTCS, 1992.Google Scholar
Cross Ref
- N. Sumner and X. Zhang. Algorithms for automatically computing the causal paths of failures. In Fundamental Approaches to Software Engineering, 2009. Google Scholar
Digital Library
- M. Vaziri, F. Tip, and J. Dolby. Associating synchronization constraints with data in an object-oriented language. In POPL, 2006. Google Scholar
Digital Library
- S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: Characterization and methodological considerations. In ISCA, 1995. Google Scholar
Digital Library
- M. Xu, R. Bodík, and M. D. Hill. A serializability violation detector for shared-memory server programs. In PLDI, 2005. Google Scholar
Digital Library
- J. Yu and S. Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. In ISCA, 2009. Google Scholar
Digital Library
- Y. Yu, T. Rodeheffer, and W. Chen. Racetrack: Efficient detection of data race conditions via adaptive tracking. In SOSP, 2005. Google Scholar
Digital Library
- Z. Li et. al. Have things changed now? -- an empirical study of bug characteristics in modern open source software. In ASID workshop in ASPLOS, 2006. Google Scholar
Digital Library
Index Terms
ConMem: detecting severe concurrency bugs through an effect-oriented approach
Recommendations
ConSeq: detecting concurrency bugs through sequential errors
ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systemsConcurrency bugs are caused by non-deterministic interleavings between shared memory accesses. Their effects propagate through data and control dependences until they cause software to crash, hang, produce incorrect output, etc. The lifecycle of a bug ...
ConMem: detecting severe concurrency bugs through an effect-oriented approach
ASPLOS '10Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-deterministic interleaving space. In reality, without the resources to ...
ConMem: Detecting Crash-Triggering Concurrency Bugs through an Effect-Oriented Approach
Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and nondeterministic interleaving space. In reality, without the resources to ...







Comments