DOI: 10.1145/3437359.3465600
short-paper
Open Access

RESIF 3.0: Toward a Flexible & Automated Management of User Software Environment on HPC facility

Published: 17 July 2021

ABSTRACT

High Performance Computing (HPC) is increasingly identified as a strategic asset and enabler to accelerate the research and the business performed in all areas requiring intensive computing and large-scale Big Data analytics capabilities. The efficient exploitation of heterogeneous computing resources featuring different processor architectures and generations, possibly coupled with GPU accelerators, remains a challenge. The University of Luxembourg has operated since 2007 a large academic HPC facility, which remains one of the reference implementations within the country and offers a cutting-edge research infrastructure to Luxembourg public research. The HPC support team invests a significant amount of time (i.e., several months of effort per year) in providing a software environment optimised for hundreds of users, but the complexity of HPC software was quickly outpacing the capabilities of classical software management tools. Since 2014, our scientific software stack has been generated and deployed in an automated and consistent way through the RESIF framework, a wrapper on top of EasyBuild and Lmod [5] meant to efficiently handle user software generation. A large code refactoring was performed in 2017 to better handle different software sets and roles across multiple clusters, all piloted through a dedicated control repository. With the advent in 2020 of a new supercomputer featuring a different CPU architecture, and to mitigate the identified limitations of the existing framework, we report in this state-of-practice article RESIF 3.0, the latest iteration of our scientific software management suite, now relying on streamline EasyBuild. It reduced by around 90% the number of custom configurations previously enforced by specific Slurm and MPI settings, while sustaining coexisting optimised builds for different CPU and GPU architectures.
The workflow for contributing back to the EasyBuild community was also automated, and work in progress aims to drastically decrease the build time of a complete software set. Overall, most design choices for our wrapper have been motivated by several years of experience in addressing, in a flexible and convenient way, the heterogeneous needs inherent to an academic environment aiming for research excellence. As the code base is publicly available, and as we also wish to report transparently the pitfalls and difficulties met, this tool may help other HPC centres consolidate their own software management stack.

References

  1. O. Ben-Kiki, C. Evans, and B. Ingerson. 2009. YAML Ain’t Markup Language.
  2. R. H. Castain, J. Hursey, A. Bouteiller, and D. Solt. 2018. PMIx: Process management for exascale environments. Parallel Comput. 79 (2018), 9–29.
  3. R. Falke, R. Klein, R. Koschke, and J. Quante. 2005. The Dominance Tree in Visualizing Software Dependencies. In 3rd IEEE Intl. W. on Visualizing Software for Understanding and Analysis. IEEE, Budapest, Hungary, 1–6.
  4. T. Gamblin, M. LeGendre, M. R. Collette, G. L. Lee, A. Moody, B. R. de Supinski, and S. Futral. 2015. The Spack package manager: bringing order to HPC software chaos. In SC ’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, Austin, TX, USA, 1–12.
  5. M. Geimer, K. Hoste, and R. McLay. 2014. Modern Scientific Software Management Using EasyBuild and Lmod. In 2014 First International Workshop on HPC User Support Tools. IEEE, New Orleans, LA, USA, 41–51. https://doi.org/10.1109/HUST.2014.8
  6. S. Khuvis, Z-Q. You, H. Na, S. Brozell, E. Franz, T. Dockendorf, J. Gardiner, and K. Tomko. 2019. A Continuous Integration-Based Framework for Software Management. In Proc. of the Practice and Experience in Advanced Research Computing (PEARC’19). ACM, New York, NY, USA, 1–7.
  7. D. Matthews and W. Limberg. 2018. MkDocs: documentation with Markdown. mkdocs.org.
  8. R. McLay. 2013. LMod: A New Environment Module System. https://lmod.rtfd.io.
  9. PuppetLabs. 2015. Puppet Hiera. https://puppet.com/docs/hiera/.
  10. S. Varrette, P. Bouvry, H. Cartiaux, and F. Georgatos. 2014. Management of an Academic HPC Cluster: The UL Experience. In Proc. of the 2014 Intl. Conf. on High Performance Computing & Simulation (HPCS 2014). IEEE, Bologna, Italy, 959–967.

          • Published in

            PEARC '21: Practice and Experience in Advanced Research Computing
            July 2021
            310 pages
            ISBN:9781450382922
            DOI:10.1145/3437359

            Copyright © 2021 Owner/Author

            This work is licensed under a Creative Commons Attribution International 4.0 License.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • short-paper
            • Research
            • Refereed limited

            Acceptance Rates

            Overall Acceptance Rate: 133 of 202 submissions, 66%
