ABSTRACT
This paper presents a randomized scheduler for finding concurrency bugs. Like current stress-testing methods, it repeatedly runs a given test program with supplied inputs. However, it improves on stress-testing by finding buggy schedules more effectively and by quantifying the probability of missing concurrency bugs. Key to its design is the characterization of the depth of a concurrency bug as the minimum number of scheduling constraints required to find it. In a single run of a program with n threads and k steps, our scheduler detects a concurrency bug of depth d with probability at least $1/(n k^{d-1})$. We hypothesize that in practice, many concurrency bugs (including well-known types such as ordering errors, atomicity violations, and deadlocks) have small bug-depths, and we confirm the efficiency of our schedule randomization by detecting previously unknown and known concurrency bugs in several production-scale concurrent programs.
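To give a feel for the guarantee, the sketch below evaluates the single-run lower bound of 1 divided by n times k to the power d-1, and estimates how many runs push the chance of missing a bug below a chosen threshold. It is an illustration only, not code from the paper: the function names are hypothetical, and the repeated-run estimate assumes that runs are independent and that each run achieves at least the stated bound.

```python
import math


def detection_lower_bound(n: int, k: int, d: int) -> float:
    """Single-run lower bound 1 / (n * k**(d - 1)) for a depth-d bug.

    n is the number of threads, k the number of steps, d the bug depth.
    """
    if n < 1 or k < 1 or d < 1:
        raise ValueError("n, k, and d must be positive integers")
    return 1.0 / (n * k ** (d - 1))


def runs_for_confidence(n: int, k: int, d: int, confidence: float) -> int:
    """Smallest r with 1 - (1 - p)**r >= confidence, where p is the bound.

    Assumption (ours, for illustration): runs are independent, so the
    probability of missing the bug in all r runs is at most (1 - p)**r.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    p = detection_lower_bound(n, k, d)
    if p >= 1.0:
        return 1  # a single run already meets the bound
    # log1p(-p) computes log(1 - p) accurately when p is small.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


if __name__ == "__main__":
    # Example: 2 threads, 1000 steps, depth-2 bug.
    p = detection_lower_bound(2, 1000, 2)
    r = runs_for_confidence(2, 1000, 2, 0.99)
    print(f"single-run bound: {p}")
    print(f"runs for 99% confidence under the bound: {r}")
```

Because the bound shrinks by a factor of k for each additional unit of depth, the run count grows quickly with d, which is why the hypothesis that real bugs have small depth matters.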
A randomized scheduler with probabilistic guarantees of finding bugs
ASPLOS '10