
Efficiently combining parallel software using fine-grained, language-level, hierarchical resource management policies

Published: 19 October 2012

Abstract

This paper presents Poli-C, a language extension, runtime library, and system daemon enabling fine-grained, language-level, hierarchical resource management policies. Poli-C is suitable for use in applications that compose parallel libraries, frameworks, and programs. In particular, we have added a powerful new statement to C for expressing resource limits and guarantees in such a way that programmers can set resource management policies even when the source code of parallel libraries and frameworks is not available. Poli-C enables application programmers to manage any resource exposed by the underlying OS, for example, cores or I/O bandwidth. Additionally, we have developed a domain-specific language for defining high-level resource management policies, and a facility for extending the kinds of resources that can be managed with our language extension. Finally, through a number of useful variations, our design offers a high degree of composability. We evaluate Poli-C by way of three case studies: a scientific application, an image processing webserver, and a pair of parallel database join implementations. We found that using Poli-C yields efficiency gains that require the addition of only a few lines of code to applications.
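To make the idea of hierarchical limits and guarantees concrete, the following is a minimal sketch in plain C. It is not Poli-C syntax and not the Poli-C runtime API, neither of which is shown in this abstract. The `policy_node` type and the helper functions are hypothetical names introduced only for illustration, and the resource is assumed to be a count of cores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy node (an assumption for illustration, not Poli-C):
 * each scope has a parent scope, a guaranteed minimum, and a maximum. */
typedef struct policy_node {
    const struct policy_node *parent; /* NULL for the root scope */
    int guarantee;                    /* cores reserved for this scope */
    int limit;                        /* cores this scope asks to be capped at */
} policy_node;

/* The effective limit of a scope is the tightest limit found on the
 * path from that scope up to the root. */
static int effective_limit(const policy_node *n) {
    int lim = n->limit;
    for (const policy_node *p = n->parent; p != NULL; p = p->parent) {
        if (p->limit < lim) {
            lim = p->limit;
        }
    }
    return lim;
}

/* A scope is satisfiable if its guarantee is non-negative and does not
 * exceed its effective limit. */
static int node_is_valid(const policy_node *n) {
    return n->guarantee >= 0 && n->guarantee <= effective_limit(n);
}

/* Sibling guarantees can all be honored only if their sum fits within
 * the effective limit of the shared parent scope. */
static int guarantees_fit(const policy_node *parent,
                          const policy_node *children, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += children[i].guarantee;
    }
    return total <= effective_limit(parent);
}
```

Under these assumptions, a library scope that requests a limit of 16 cores inside an application scope limited to 6 cores on an 8-core root ends up with an effective limit of 6. Two sibling scopes with guarantees of 4 and 2 fit under that application scope, while guarantees of 4 and 3 do not.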

Published in

ACM SIGPLAN Notices, Volume 47, Issue 10
OOPSLA '12
October 2012
1011 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2398857

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
October 2012
1052 pages
ISBN: 9781450315616
DOI: 10.1145/2384616

Copyright © 2012 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

research-article
