Threads cannot be implemented as a library

Published: 12 June 2005

Abstract

In many environments, multi-threaded code is written in a language that was originally designed without thread support (e.g. C), to which a library of threading primitives was subsequently added. There appears to be a general understanding that this is not the right approach. We provide specific arguments that a pure library approach, in which the compiler is designed independently of threading issues, cannot guarantee correctness of the resulting code. We first review why the approach almost works, and then examine some of the surprising behavior it may entail. We further illustrate that there are very simple cases in which a pure library-based approach seems incapable of expressing an efficient parallel algorithm. Our discussion takes place in the context of C with Pthreads, since it is commonly used, reasonably well specified, and does not attempt to ensure type-safety, which would entail even stronger constraints. The issues we raise are not specific to that context.
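As a rough illustration of the setting the abstract describes, the sketch below shows the conventional C with Pthreads style, in which every access to shared data is bracketed by library calls. All names here (`shared_count`, `count_lock`, `worker`, `run_counter_demo`) are hypothetical and chosen for illustration; none come from the paper. The code only looks correct on the assumption that the compiler does not move or invent accesses to `shared_count` across the lock and unlock calls. That is a property of the compiler, not something the threading library can enforce by itself, and it is the kind of dependency the abstract argues a pure library approach leaves unspecified.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state, protected by a Pthreads mutex. */
static long shared_count = 0;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker_arg {
    int iters; /* number of increments each thread performs */
};

/* Each thread increments the shared counter only while holding the lock. */
static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (int i = 0; i < arg->iters; i++) {
        pthread_mutex_lock(&count_lock);
        shared_count++; /* assumed to stay between lock and unlock */
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value. */
long run_counter_demo(int nthreads, int iters)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    struct worker_arg arg = { iters };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    /* Reset under the lock so repeated calls start from zero. */
    pthread_mutex_lock(&count_lock);
    shared_count = 0;
    pthread_mutex_unlock(&count_lock);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* All workers have been joined, so this read is not concurrent. */
    return shared_count;
}
```

Calling `run_counter_demo(4, 10000)` from a `main` function, in a program linked against the Pthreads library, would be expected to return 40000 on a conforming implementation. The point of the sketch is that this expectation rests on cooperation between compiler and library, not on the library alone.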

Published in

ACM SIGPLAN Notices, Volume 40, Issue 6
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
June 2005
325 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1064978

PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
June 2005
338 pages
ISBN: 1595930566
DOI: 10.1145/1065010

Copyright © 2005 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
