Analyzing multicore dumps to facilitate concurrency bug reproduction

Published: 13 March 2010

Abstract

Debugging concurrent programs is difficult, primarily because the non-determinism inherent in scheduler interleavings makes it hard to reproduce bugs that manifest only under certain interleavings. The problem is exacerbated in multi-core environments, where there are multiple schedulers, one for each core. In this paper, we propose a reproduction technique for concurrent programs that execute on multi-core platforms. Our technique performs a lightweight analysis of a failing execution that occurs in a multi-core environment, and uses the result of the analysis to enable reproduction of the bug on a single-core system, under the control of a deterministic scheduler.

More specifically, our approach automatically identifies the execution point in the re-execution that corresponds to the failure point. It does so by analyzing the failure core dump and leveraging a technique called execution indexing, which identifies a related point in the re-execution. By generating a core dump at this point and comparing the two dumps, we are able to guide a search algorithm to efficiently generate a failure-inducing schedule. Our experiments show that our technique is highly effective and has reasonable overhead.
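
The idea of using dump differences to guide a schedule search can be illustrated with a small toy sketch. It is not the paper's algorithm: it omits execution indexing entirely, and every name in it (`run`, `dump_diff`, `neighbors`, `find_failing_schedule`, the two-thread counter program) is hypothetical. It assumes a "dump" is simply the final variable state, runs two threads under a fixed single-core schedule, and uses a best-first search that prefers schedules whose dump differs least from the failure dump.

```python
import heapq

def run(schedule):
    """Run two toy threads on one 'core' under a fixed schedule.

    Each thread performs a non-atomic increment: step 0 reads the shared
    counter into a private temporary, step 1 writes temporary + 1 back.
    The returned final state stands in for a core dump (an assumption).
    """
    state = {"counter": 0, "tmp0": 0, "tmp1": 0}
    pcs = [0, 0]  # per-thread program counters
    for tid in schedule:
        if pcs[tid] == 0:
            state[f"tmp{tid}"] = state["counter"]
        else:
            state["counter"] = state[f"tmp{tid}"] + 1
        pcs[tid] += 1
    return state

def dump_diff(a, b):
    """Count the variables whose values differ between two dumps."""
    return sum(1 for key in a if a[key] != b[key])

def neighbors(schedule):
    """Yield schedules obtained by swapping adjacent steps of different threads."""
    for i in range(len(schedule) - 1):
        if schedule[i] != schedule[i + 1]:
            swapped = list(schedule)
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            yield tuple(swapped)

def find_failing_schedule(failure_dump, start):
    """Best-first search: always expand the schedule closest to the failure dump."""
    seen = {start}
    frontier = [(dump_diff(run(start), failure_dump), start)]
    while frontier:
        diff, schedule = heapq.heappop(frontier)
        if diff == 0:
            return schedule  # this schedule reproduces the failure state
        for candidate in neighbors(schedule):
            if candidate not in seen:
                seen.add(candidate)
                score = dump_diff(run(candidate), failure_dump)
                heapq.heappush(frontier, (score, candidate))
    return None

# Failure dump from a lost update: both threads read 0, so the counter ends at 1.
FAILURE_DUMP = {"counter": 1, "tmp0": 0, "tmp1": 0}
result = find_failing_schedule(FAILURE_DUMP, start=(0, 0, 1, 1))
```

Starting from the serial schedule `(0, 0, 1, 1)`, the search returns an interleaving in which both reads precede both writes. A real core dump contains far more state, so a practical difference measure would need to be much more selective than this variable-by-variable count.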


    • Published in

      ACM SIGARCH Computer Architecture News, Volume 38, Issue 1
      ASPLOS '10
      March 2010
      399 pages
      ISSN: 0163-5964
      DOI: 10.1145/1735970

      ASPLOS XV: Proceedings of the Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2010
      422 pages
      ISBN: 9781605588391
      DOI: 10.1145/1736020
      General Chair: James C. Hoe
      Program Chair: Vikram S. Adve

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery
      New York, NY, United States
