DOI: 10.1145/1272680.1272684
Article

A uniform job monitoring service in multiple job universes

Online: 25 June 2007

ABSTRACT

We describe ongoing work on extending the gLite Logging and Bookkeeping (L&B) service to track additional types of jobs, with the vision of uniformly following jobs on the Grid even when they pass between different middleware domains. Details are given on the simpler case of PBS jobs, which proves the capability of L&B to deal with additional job types, as well as on the more complex and challenging work started on Condor jobs, where the impact of eventual success is larger.

