Abstract
On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that would allow users to place their trust in the resources they access. The open provenance model (OPM) is a community data model for provenance that is designed to facilitate the meaningful interchange of provenance information between systems. Underpinning OPM is the notion of a directed graph, where nodes represent data products and processes involved in past computations and edges represent dependencies between them; it is complemented by graphical inference rules allowing new dependencies to be derived. Until now, however, OPM was a purely syntactical endeavor. The present article extends OPM graphs with an explicit distinction between precise and imprecise edges. A formal semantics for the enriched OPM graphs is then proposed, by viewing OPM graphs as temporal theories on the temporal events represented in the graph. The original OPM inference rules are scrutinized in view of the semantics and found to be sound but incomplete. An extended set of graphical rules is provided and proved to be complete for inference. The article concludes with applications of the formal semantics to inferencing in OPM graphs, operators on OPM graphs, and a formal notion of refinement among OPM graphs.
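To make the idea of deriving new dependencies from a provenance graph concrete, the following is a minimal sketch and not the rule set studied in the article. It assumes a single kind of edge (one data product was derived from another) and treats every chain of asserted edges as implying a multi-step dependency, which it marks as imprecise. The names `Edge`, `derive_dependencies`, and the node labels are hypothetical; the article's actual rules distinguish several edge kinds and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A dependency edge: `source` depends on `target` (assumed derivation edge)."""
    source: str
    target: str
    precise: bool  # True for an asserted single-step edge (an assumption of this sketch)


def derive_dependencies(edges):
    """Return the asserted edges plus imprecise edges implied by chaining them.

    This is a plain transitive closure over one edge kind, used here only as
    an illustration of graph-based inference.
    """
    successors = {}
    for edge in edges:
        successors.setdefault(edge.source, set()).add(edge.target)

    asserted = {(edge.source, edge.target) for edge in edges}
    derived = set()
    for start in successors:
        # Depth-first search for every node reachable from `start`.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors.get(node, ()))
        for node in seen:
            # Skip self-loops and edges that were already asserted.
            if node != start and (start, node) not in asserted:
                derived.add(Edge(start, node, precise=False))
    return set(edges) | derived


# Hypothetical data products: a chart derived from a table derived from raw data.
asserted_edges = [
    Edge("chart", "table", precise=True),
    Edge("table", "raw", precise=True),
]
closure = derive_dependencies(asserted_edges)
```

Here `closure` contains the two asserted edges and one derived edge from `chart` to `raw`, flagged as imprecise because it summarizes a chain rather than a single step.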