research-article

ProteusTM: Abstraction Meets Performance in Transactional Memory

Published: 25 March 2016

Abstract

The Transactional Memory (TM) paradigm promises to greatly simplify the development of concurrent applications. Over the years, this promise has led to a plethora of TM implementations whose performance varies widely across workloads. Yet no single implementation fits every workload: the best TM for one workload can prove disastrous for another. This forces developers to face the complex task of tuning TM implementations, which significantly hampers their wide adoption. In this paper, we address the challenge of automatically identifying the best TM implementation for a given workload. Our proposed system, ProteusTM, hides a large library of implementations behind the TM interface. Underneath, it leverages a novel multi-dimensional online optimization scheme that combines two popular learning techniques: Collaborative Filtering and Bayesian Optimization.

We integrated ProteusTM into GCC and demonstrate its ability to switch between TMs and adapt several configuration parameters (e.g., the number of threads). In an extensive evaluation, ProteusTM achieved average performance less than 3% from optimal, with gains of up to 100x over static alternatives.

Published in

ACM SIGPLAN Notices, Volume 51, Issue 4 (ASPLOS '16)
April 2016, 774 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2954679
Editor: Andy Gill

ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
March 2016, 824 pages
ISBN: 9781450340915
DOI: 10.1145/2872362
General Chair: Tom Conte
Program Chair: Yuanyuan Zhou

Copyright © 2016 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
