Abstract
In multithreaded programming, locks are frequently used as a mechanism for synchronization. Because today's operating systems do not consider lock usage as a scheduling criterion, scheduling decisions can be unfavorable to multithreaded applications, leading to performance issues such as convoying and heavy lock contention in systems with multiple processors. Previous efforts to address these issues (e.g., transactional memory, lock-free data structures) often treat scheduling decisions as "a fact of life," and therefore try to cope with the consequences of undesirable scheduling instead of dealing with the problem directly.
In this paper, we introduce the Contention-Aware Scheduler (CA-Scheduler), which is designed to support efficient execution of large multithreaded Java applications in multiprocessor systems. Our proposed scheduler employs a scheduling policy that reduces lock contention. As will be shown in this paper, our prototype implementation of the CA-Scheduler in Linux and the Sun HotSpot virtual machine incurs only 3.5% runtime overhead, while the overall performance differences, when compared with a system with no contention awareness, range from a degradation of 3% in a small multithreaded benchmark to an improvement of 15% in a large Java application server benchmark.
Contention-aware scheduler: unlocking execution parallelism in multithreaded Java programs
OOPSLA '08: Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications