DOI: 10.1145/1736020.1736031
research-article

Respec: efficient online multiprocessor replay via speculation and external determinism

Published: 13 March 2010

ABSTRACT

Deterministic replay systems record and reproduce the execution of a hardware or software system. While it is well known how to replay uniprocessor systems, replaying shared memory multiprocessor systems at low overhead on commodity hardware is still an open problem. This paper presents Respec, a new way to support deterministic replay of shared memory multithreaded programs on commodity multiprocessor hardware. Respec targets online replay in which the recorded and replayed processes execute concurrently.

Respec uses two strategies to reduce overhead while still ensuring correctness: speculative logging and externally deterministic replay. Speculative logging optimistically logs less information about shared memory dependencies than is needed to guarantee deterministic replay, then recovers and retries if the replayed process diverges from the recorded process. Externally deterministic replay relaxes the degree to which the two executions must match by requiring only that their system output and final program states match. We show that the combination of these two techniques results in low recording and replay overhead for the common case of data-race-free execution intervals and still ensures correct replay for execution intervals that have data races.
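
The check-and-retry idea described above can be illustrated with a toy model. The sketch below is not the Respec implementation; it is a minimal illustration under stated assumptions: an execution interval is modelled as a list of per-thread operations on a dictionary, the "schedule" is just the order in which those operations run, system output is omitted, and the retry simply reuses the recorded order as a stand-in for whatever recovery mechanism the real system uses. All names (`run_epoch`, `replay_epochs`, `EPOCHS`) are hypothetical.

```python
# Toy model of speculative replay with an end-of-interval state check.
# Assumption: all names and the retry policy here are illustrative only.

def inc_x(state):
    state["x"] += 1

def dbl_x(state):
    state["x"] *= 2

def inc_y(state):
    state["y"] += 1

def run_epoch(state, ops, order):
    """Run one interval: apply each thread's operation in the given order."""
    result = dict(state)  # work on a copy so the caller's state is untouched
    for thread_id in order:
        ops[thread_id](result)
    return result

def replay_epochs(initial, epochs):
    """Run recorded and replayed executions side by side.

    Only the state at the end of each interval is compared. On a mismatch,
    the replay is rolled back to its checkpoint and re-run using the recorded
    order (an assumed stand-in for the recovery step).
    """
    recorded = dict(initial)
    replayed = dict(initial)
    rollbacks = 0
    for ops, recorded_order, replay_order in epochs:
        checkpoint = dict(replayed)  # state at the start of the interval
        recorded = run_epoch(recorded, ops, recorded_order)
        replayed = run_epoch(checkpoint, ops, replay_order)
        if replayed != recorded:
            # Divergence detected: discard the speculative result and retry.
            rollbacks += 1
            replayed = run_epoch(checkpoint, ops, recorded_order)
    return recorded, replayed, rollbacks

# First interval is race-free (threads touch different variables);
# second interval has a race on "x", so the order of operations matters.
EPOCHS = [
    ([inc_x, inc_y], (0, 1), (1, 0)),
    ([inc_x, dbl_x], (0, 1), (1, 0)),
]

if __name__ == "__main__":
    print(replay_epochs({"x": 1, "y": 0}, EPOCHS))
```

In this toy model the race-free interval passes the check even though the two executions interleave differently, which is the point of comparing only the interval's final state. The racy interval fails the check and triggers one rollback.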

We modified the Linux kernel to implement our techniques. Our software system adds on average about 18% overhead to the execution time for recording and replaying programs with two threads and 55% overhead for programs with four threads.
