Abstract
In many environments, multi-threaded code is written in a language that was originally designed without thread support (e.g. C), to which a library of threading primitives was subsequently added. There appears to be a general understanding that this is not the right approach. We provide specific arguments that a pure library approach, in which the compiler is designed independently of threading issues, cannot guarantee correctness of the resulting code. We first review why the approach almost works, and then examine some of the surprising behavior it may entail. We further illustrate that there are very simple cases in which a pure library-based approach seems incapable of expressing an efficient parallel algorithm. Our discussion takes place in the context of C with Pthreads, since it is commonly used, reasonably well specified, and does not attempt to ensure type-safety, which would entail even stronger constraints. The issues we raise are not specific to that context.
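For readers unfamiliar with the setting, the following is a minimal sketch of the library style the abstract describes: plain C, with all synchronization supplied by Pthreads calls. The names `shared_count`, `count_lock`, `add_to_count`, `run_counter_demo`, and the workload size are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch behaves as expected in practice because compilers typically treat calls such as `pthread_mutex_lock` as opaque and therefore do not move the guarded access across them. That is a property of common implementations, not something a thread-unaware language definition promises.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define INCREMENTS_PER_THREAD 100000L  /* hypothetical workload size */

static long shared_count = 0;  /* shared between threads, guarded by count_lock */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread increments the shared counter, taking the lock around every update. */
static void *add_to_count(void *unused)
{
    (void)unused;
    for (long i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&count_lock);
        shared_count++;  /* relies on the compiler keeping this access inside the locked region */
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs two incrementing threads to completion and returns the final counter value. */
long run_counter_demo(void)
{
    pthread_t a, b;
    int rc;

    shared_count = 0;
    rc = pthread_create(&a, NULL, add_to_count, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_to_count, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_count;
}
```

Calling `run_counter_demo()` from a program linked with the Pthreads library should return twice `INCREMENTS_PER_THREAD`, since every update happens while the lock is held. Note that the source contains nothing that tells the compiler `shared_count` is shared; only the library calls carry that intent.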
Index Terms
Threads cannot be implemented as a library





