A Formal Account of the Open Provenance Model

Published: 13 May 2015

Abstract

On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that allows users to place their trust in the resources they access. The open provenance model (OPM) is a community data model for provenance, designed to facilitate the meaningful interchange of provenance information between systems. Underpinning OPM is a notion of directed graph, where nodes represent data products and processes involved in past computations and edges represent dependencies between them; it is complemented by graphical inference rules that allow new dependencies to be derived. Until now, however, OPM was a purely syntactic endeavor. The present article extends OPM graphs with an explicit distinction between precise and imprecise edges. A formal semantics for these enriched OPM graphs is then proposed, by viewing OPM graphs as temporal theories on the temporal events represented in the graph. The original OPM inference rules are scrutinized in view of the semantics and found to be sound but incomplete. An extended set of graphical rules is provided and proved to be complete for inference. The article concludes with applications of the formal semantics to inferencing in OPM graphs, operators on OPM graphs, and a formal notion of refinement among OPM graphs.
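
As an illustration of the graph notion described in the abstract, the following is a minimal Python sketch, not the article's formalization. It assumes a toy graph whose node names (`report`, `render`, `dataset`, `clean`, `raw`) and the helper `derive_dependencies` are hypothetical, and it assumes edges point from a node to the node it depends on. Plain reachability stands in for "deriving new dependencies"; the article's actual inference rules and its precise/imprecise edge distinction are not reproduced here.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy graph: each node is either a data product or a process.
NODE_KINDS: Dict[str, str] = {
    "report": "data product",
    "render": "process",
    "dataset": "data product",
    "clean": "process",
    "raw": "data product",
}

# Each edge (a, b) reads "a depends on b" (an assumption of this sketch).
EDGES = [
    ("report", "render"),   # the report was produced by the render step
    ("render", "dataset"),  # the render step consumed the dataset
    ("dataset", "clean"),   # the dataset was produced by the clean step
    ("clean", "raw"),       # the clean step consumed the raw data
]


def derive_dependencies(edges: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Return every pair (a, b) such that b is reachable from a via one or more edges."""
    successors: Dict[str, Set[str]] = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    derived: Set[Tuple[str, str]] = set()
    for start in successors:
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if (start, node) in derived:
                continue  # already visited from this start; also guards against cycles
            derived.add((start, node))
            stack.extend(successors.get(node, ()))
    return derived


derived = derive_dependencies(EDGES)
# Dependencies that were not stated explicitly but follow from chaining edges.
new_dependencies = derived - set(EDGES)
```

In this toy graph, `new_dependencies` contains pairs such as `("report", "raw")`, a dependency that appears in no single edge. Whether such a chained dependency is actually justified is the kind of question the article's formal semantics is meant to settle.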

Published in

ACM Transactions on the Web, Volume 9, Issue 2
May 2015
150 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/2776789

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 13 May 2015
• Accepted: 1 February 2015
• Revised: 1 November 2014
• Received: 1 March 2012

Qualifiers

• research-article
• Research
• Refereed
