The BigDAWG Polystore System

Published: 12 August 2015

Abstract

This paper presents a new view of federated databases to address the growing need for managing information that spans multiple data models. This trend is fueled by the proliferation of storage engines and query languages based on the observation that 'no one size fits all'. To address this shift, we propose a polystore architecture; it is designed to unify querying over multiple data models. We consider the challenges and opportunities associated with polystores. Open questions in this space revolve around query optimization and the assignment of objects to storage engines. We introduce our approach to these topics and discuss our prototype in the context of the Intel Science and Technology Center for Big Data.
