Abstract
Today, Java is regularly used to implement large multi-threaded server-class applications that use locks to protect access to shared data. However, understanding the impact of locks on the performance of a system is complex, and thus the use of locks can impede the progress of threads on configurations that were not anticipated by the developer, during specific phases of the execution.
In this paper, we propose Free Lunch, a new lock profiler for Java application servers, specifically designed to identify, in-vivo, phases where the progress of the threads is impeded by a lock. Free Lunch is designed around a new metric, critical section pressure (CSP), which directly correlates the progress of the threads to each of the locks. Using Free Lunch, we have identified phases of high CSP, which were hidden with other lock profilers, in the distributed Cassandra NoSQL database and in several applications from the DaCapo 9.12, the SPECjvm2008 and the SPECjbb2005 benchmark suites. Our evaluation of Free Lunch shows that its overhead is never greater than 6%, making it suitable for in-vivo use.
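The abstract names the CSP metric without defining it. As a rough illustration only, the sketch below assumes one plausible reading: over a measurement interval, the CSP of a lock is the total time threads spent blocked on that lock divided by the cumulative running time of those threads, expressed as a percentage. The class name, method name, and sample numbers are hypothetical and are not taken from Free Lunch itself.

```java
public final class CspSketch {

    /**
     * Hypothetical CSP computation for ONE lock over ONE interval.
     * Assumption: CSP = (sum of per-thread blocked time on the lock)
     *                 / (sum of per-thread running time) * 100.
     */
    static double criticalSectionPressure(long[] blockedNanosPerThread,
                                          long[] runningNanosPerThread) {
        long blocked = 0;
        for (long b : blockedNanosPerThread) {
            blocked += b;
        }
        long running = 0;
        for (long r : runningNanosPerThread) {
            running += r;
        }
        if (running == 0) {
            return 0.0; // no thread ran during the interval
        }
        return 100.0 * blocked / running;
    }

    public static void main(String[] args) {
        // Made-up sample: four threads observed for one second each;
        // two of them were blocked on the lock for 200 ms and 300 ms.
        long[] blocked = {200_000_000L, 300_000_000L, 0L, 0L};
        long[] running = {1_000_000_000L, 1_000_000_000L,
                          1_000_000_000L, 1_000_000_000L};
        System.out.println(criticalSectionPressure(blocked, running));
    }
}
```

Computing the value per interval rather than over the whole run is what would let a profiler of this kind surface short phases of high pressure that an overall average hides.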
Continuously measuring critical section pressure with the free-lunch profiler
OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications