research-article

Application-aware management of parallel simulation collections

Published: 14 February 2009

Abstract

This paper presents a system deployed on parallel clusters to manage a collection of parallel simulations that make up a computational study. It explores how such a system can extend traditional parallel job scheduling and resource allocation techniques to incorporate knowledge specific to the study.

Using a UINTAH-based helium gas simulation code (ARCHES) and the SimX system for multi-experiment computational studies, this paper demonstrates that, by using application-specific knowledge in resource allocation and scheduling decisions, one can reduce the run time of a computational study from over 20 hours to under 4.5 hours on a 32-processor cluster, and from almost 11 hours to just over 3.5 hours on a 64-processor cluster.

Published in

ACM SIGPLAN Notices, Volume 44, Issue 4 (PPoPP '09), April 2009, 294 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1594835

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 2009, 322 pages
ISBN: 9781605583976
DOI: 10.1145/1504176

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
