Abstract
Reproducing concurrency bugs is a prominent challenge. Existing techniques either record very fine-grained execution information and hence incur high runtime overhead, or strive to log as little information as possible but provide no guarantee of reproducing a bug. We present Light, a technique that has much lower overhead than techniques based on fine-grained recording and that is guaranteed to reproduce concurrency bugs. We formally prove, and then leverage, that recording flow dependences is a necessary and sufficient condition for reproducing a concurrency bug. The flow dependences, together with the thread-local orders, which can be inferred automatically and hence need not be logged, are encoded as scheduling constraints. An SMT solver is used to derive a replay schedule, which is guaranteed to exist even though it may differ from the original schedule. Our experiments show that Light has only 44% logging overhead, almost one order of magnitude lower than state-of-the-art techniques that rely on logging memory accesses. Its space overhead is only 10% of that of those techniques. Light also reproduces all the bugs we have collected, whereas existing techniques miss some of them.
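The constraint-based replay idea can be illustrated with a small sketch. This is not Light's implementation: the trace format, the event names (`w1`, `r1`, ...), and the helper functions `satisfies` and `find_schedule` are hypothetical, the exact constraint encoding is an assumption, and a brute-force search over permutations stands in for the SMT solver. It only shows how recorded flow dependences (which write each read observed) combine with thread-local order to determine a valid replay schedule.

```python
from itertools import permutations

# Hypothetical trace: per-thread event lists in program order.
# Each event is (name, kind, variable).
THREADS = {
    "T1": [("w1", "write", "x"), ("r1", "read", "x")],
    "T2": [("w2", "write", "x"), ("w3", "write", "y")],
}
# Recorded flow dependences: each read maps to the write it observed.
FLOW = {"r1": "w2"}


def satisfies(order, threads, flow):
    """Check one candidate total order against the scheduling constraints."""
    pos = {name: i for i, name in enumerate(order)}
    # Thread-local order: inferred from the program, not logged.
    for evs in threads.values():
        for (a, _, _), (b, _, _) in zip(evs, evs[1:]):
            if pos[a] >= pos[b]:
                return False
    var_of = {n: v for evs in threads.values() for n, _, v in evs}
    writes = [n for evs in threads.values() for n, k, _ in evs if k == "write"]
    for read, write in flow.items():
        # The observed write must precede the read ...
        if pos[write] >= pos[read]:
            return False
        # ... and no other write to the same variable may fall in between.
        for other in writes:
            if other != write and var_of[other] == var_of[read]:
                if pos[write] < pos[other] < pos[read]:
                    return False
    return True


def find_schedule(threads, flow):
    """Brute-force stand-in for the SMT solver; fine only for tiny traces."""
    names = [n for evs in threads.values() for n, _, _ in evs]
    for order in permutations(names):
        if satisfies(order, threads, flow):
            return list(order)
    return None


if __name__ == "__main__":
    print(find_schedule(THREADS, FLOW))
```

In this toy trace, any schedule returned places `w1` before `w2` and `w2` before `r1`, while `w3` may fall before or after `r1`. That freedom reflects the point made in the abstract: the replay schedule may differ from the original one yet still reproduce the same flow dependences.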
[Figure 7. (a) Breakdown of time overhead; (b) breakdown of space overhead, for O1, O2, and Light.]
Light: replay via tightly bounded recording
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation