ABSTRACT
Data in real-life databases become obsolete rapidly. One often finds that multiple values of the same entity reside in a database. While all of these values were once correct, most of them may have become stale and inaccurate. Worse still, the values often do not carry reliable timestamps. With this comes the need for studying data currency, to identify the current value of an entity in a database and to answer queries with the current values, in the absence of timestamps.
This paper investigates the currency of data. (1) We propose a model that specifies partial currency orders in terms of simple constraints. The model also allows us to express what values are copied from other data sources, bearing currency orders in those sources, in terms of copy functions defined on correlated attributes. (2) We study fundamental problems for data currency: to determine whether a specification is consistent, whether a value is more current than another, and whether a query answer is certain no matter how partial currency orders are completed. (3) Moreover, we identify several problems associated with copy functions: to decide whether a copy function imports sufficient current data to answer a query, whether such a function copies redundant data, whether a copy function can be extended to import necessary current data for a query while respecting the constraints, and whether it suffices to copy data of a bounded size. (4) We establish upper and lower bounds for these problems, all matching, for combined complexity and data complexity, and for a variety of query languages. We also identify special cases that warrant lower complexity.
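To make the notions of "completing" a partial currency order and of a "certain" answer concrete, the following is a toy sketch, not the model or algorithms of the paper. It assumes a single entity with one attribute, represents the partial currency order as explicit pairs, and enumerates completions by brute force. The names `completions`, `certainly_more_current`, and `certain_current_value`, along with the sample data, are hypothetical and introduced only for illustration.

```python
from itertools import permutations


def completions(tuples, partial_order):
    """Yield every total order of `tuples` consistent with `partial_order`.

    `partial_order` is a set of pairs (a, b) meaning "b is more current
    than a". Brute force over all permutations, so only usable on tiny
    inputs; this is an illustrative assumption, not the paper's method.
    """
    for perm in permutations(tuples):
        pos = {t: i for i, t in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in partial_order):
            yield perm


def certainly_more_current(tuples, partial_order, older, newer):
    """True if `newer` comes after `older` in every consistent completion.

    Returns False when no completion exists (inconsistent specification).
    """
    found = False
    for perm in completions(tuples, partial_order):
        found = True
        if perm.index(older) > perm.index(newer):
            return False
    return found


def certain_current_value(values, partial_order):
    """Return the value of the most current tuple if every completion
    agrees on it; otherwise return None."""
    latest = {values[perm[-1]] for perm in completions(list(values), partial_order)}
    return latest.pop() if len(latest) == 1 else None


# Hypothetical data: three tuples for one entity, with no timestamps.
values = {"t1": 50, "t2": 80, "t3": 80}
# Known partial currency order: t2 and t3 are both more current than t1.
order = {("t1", "t2"), ("t1", "t3")}

print(certainly_more_current(list(values), order, "t1", "t2"))
print(certainly_more_current(list(values), order, "t2", "t3"))
print(certain_current_value(values, order))
```

In this sketch the relative order of `t2` and `t3` is unknown, yet both carry the same value, so the current value is the same in every completion. A cyclic order such as `{("t1", "t2"), ("t2", "t1")}` admits no completion, which corresponds to an inconsistent specification in this simplified setting.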