Abstract
The relatively high cost of record deserialization is increasingly becoming the bottleneck of column-based storage systems in tree-structured applications [58]. Because records are transformed in the storage layer, the processing cost incurred by fields and rows irrelevant to a query can be substantial in nested schemas, wasting computational resources in large-scale analytical workloads. This raises the question of how to reduce both the deserialization and IO costs of queries with highly selective filters that follow arbitrary paths in a nested schema.
We present CORES (Column-Oriented Regeneration Embedding Scheme), which pushes highly selective filters down into column-based storage engines, where each filter consists of several filtering conditions on a field. By applying such filters in the storage layer, we demonstrate that both deserialization and IO costs can be significantly reduced. We show how to introduce fine-grained composition of filtering results, and we generalize this technique with two pairwise operations, rollup and drilldown, so that a series of conjunctive filters can effectively deliver their payloads in a nested schema. The proposed methods are implemented on an open-source platform. For practical purposes, we highlight how to build a column storage engine and how to drive a query efficiently based on a cost model. We apply this design to the nested relational model, especially when hierarchical entities are frequently required by ad hoc queries. Experiments on a real workload and a modified TPC-H benchmark demonstrate that CORES improves performance by 0.7×--26.9× over state-of-the-art platforms in scan-intensive workloads.
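The rollup/drilldown pairing described above can be illustrated with a small sketch. The Python below is a toy model based only on this abstract: a selective filter is evaluated directly against a flat column, and the resulting hit set is moved between nesting levels so that conjunctive filters on different paths can be intersected. All names (`scan_column`, `rollup`, `drilldown`, the parent-id encoding) are illustrative assumptions, not the paper's actual API or level encoding.

```python
# Hypothetical sketch of CORES-style filter pushdown on nested data.
# A child-level row is linked to its parent record by a parent id;
# rollup lifts child hits to the parent level, drilldown pushes a
# parent-level mask back to child rows.

def scan_column(values, predicate):
    """Evaluate a filter on a flat column; return a hit bitmap."""
    return [predicate(v) for v in values]

def rollup(child_hits, parent_of):
    """Propagate child-level hits up: a parent matches if any child does."""
    n_parents = max(parent_of) + 1 if parent_of else 0
    parent_hits = [False] * n_parents
    for hit, parent in zip(child_hits, parent_of):
        if hit:
            parent_hits[parent] = True
    return parent_hits

def drilldown(parent_hits, parent_of):
    """Push parent-level hits down: keep child rows whose parent matches."""
    return [parent_hits[parent] for parent in parent_of]

# Example: documents with a repeated "author" field; two conjunctive
# filters on different paths of the nested schema.
author_names = ["smith", "lee", "chen", "smith"]   # child level
author_doc   = [0, 0, 1, 2]                        # parent (document) id per author
doc_years    = [2014, 2016, 2014]                  # document level

hits_name = scan_column(author_names, lambda s: s == "smith")   # child level
hits_year = scan_column(doc_years, lambda y: y == 2014)         # parent level

# Combine: roll the author filter up to documents, intersect, drill back down.
doc_hits = [a and b for a, b in zip(rollup(hits_name, author_doc), hits_year)]
matching_authors = drilldown(doc_hits, author_doc)
# doc_hits == [True, False, True]; matching_authors == [True, True, False, True]
```

Under this toy encoding, only the two filtered columns and the parent-id mapping are touched; irrelevant fields are never deserialized, which is the cost the paper targets.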
- Apache. 2017. Apache Hive. Retrieved June 13, 2019 from https://hive.apache.org.
- Apache. 2017. Apache Parquet. Retrieved June 13, 2019 from https://parquet.apache.org.
- Apache. 2017. Apache Spark. Retrieved June 13, 2019 from https://spark.apache.org.
- Apache. 2017. Apache Tez. Retrieved June 13, 2019 from https://tez.apache.org.
- Apache. 2018. Apache AsterixDB. Retrieved June 13, 2019 from https://asterixdb.apache.org.
- Apache. 2018. Apache Avro. Retrieved June 13, 2019 from https://avro.apache.org.
- Google. 2017. Protocol Buffers. Retrieved June 13, 2019 from http://code.google.com/p/protobuf/.
- Yang Li. 2018. Cores. Retrieved June 13, 2019 from https://github.com/lwhay/cores.
- NCBI. 2018. PubMed. Retrieved June 13, 2019 from http://www.ncbi.nlm.nih.gov.
- TPC. 2017. TPC-H benchmark. Retrieved June 13, 2019 from http://www.tpc.org/tpch.
- Foto N. Afrati, Dan Delorey, Mosha Pasumansky, and Jeffrey D. Ullman. 2014. Storing and querying tree-structured records in Dremel. PVLDB 7, 12 (2014), 1131--1142.
- Anastassia Ailamaki, David J. DeWitt, and Mark D. Hill. 2002. Data page layouts for relational databases on deep memory hierarchies. The VLDB Journal 11, 3 (2002), 198--215.
- Sattam Alsubaiee, Yasser Altowim, et al. 2014. AsterixDB: A scalable, open source BDMS. PVLDB 7, 14 (2014), 1905--1916.
- Sattam Alsubaiee, Alexander Behm, Vinayak R. Borkar, et al. 2014. Storage management in AsterixDB. PVLDB 7, 10 (2014), 841--852.
- Gopi Attaluri, Shaorong Liu, and Guy M. Lohman. 2013. DB2 with BLU acceleration: So much more than just a column store. PVLDB 6, 11 (2013), 1080--1091.
- François Bancilhon, Philippe Richard, and Michel Scholl. 1982. On line processing of compacted relations. In Proceedings of the 8th International Conference on Very Large Data Bases. 263--269.
- Babak Behzad, Huong Vu Thanh Luu, et al. 2013. Taming parallel I/O complexity with auto-tuning. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis. 68--79.
- Kevin Beyer and Raghu Ramakrishnan. 1999. Bottom-up computation of sparse and iceberg CUBEs. SIGMOD Record 28, 2 (1999), 359--370.
- Medha Bhadkamkar, Fernando Farfan, Vagelis Hristidis, and Raju Rangaswami. 2009. Storing semi-structured data on disk drives. ACM Trans. Storage 5, 2, Article 6 (June 2009), 35 pages.
- Peter Boncz, Torsten Grust, Maurice Van Keulen, Stefan Manegold, Jan Rittinger, and Jens Teubner. 2006. MonetDB/XQuery: A fast XQuery processor powered by a relational engine. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 479--490.
- Vinayak Borkar, Michael Carey, Raman Grover, et al. 2011. Hyracks: A flexible and extensible foundation for data-intensive computing. In Proceedings of the IEEE International Conference on Data Engineering. 1151--1162.
- C. Chasseur, Yinan Li, and J. M. Patel. 2013. Enabling JSON document stores in relational systems (long version). In Proceedings of the International Workshop on the Web and Databases. 1--16.
- Shuo-Han Chen, Tseng-Yi Chen, Yuan-Hao Chang, Hsin-Wen Wei, and Wei-Kuan Shih. 2018. UnistorFS: A union storage file system design for resource sharing between memory and storage on persistent RAM-based systems. ACM Trans. Storage 14, 1, Article 3 (Feb. 2018), 22 pages.
- Douglas W. Cornell and Philip S. Yu. 1987. A vertical partitioning algorithm for relational databases. In Proceedings of the IEEE International Conference on Data Engineering. 30--35.
- Graham Cormode, Minos Garofalakis, et al. 2012. Synopses for massive data: Samples, histograms, wavelets, sketches. Foundations and Trends in Databases 4, 1 (2012), 1--294.
- Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. 10.
- M. J. Egenhofer. 1994. Spatial SQL: A query and presentation language. IEEE Trans. Knowl. Data Eng. 6, 1 (1994), 86--95.
- Avrilia Floratou and Umar Farooq Minhas. 2014. SQL-on-Hadoop: Full circle back to shared-nothing database architectures. PVLDB 7, 12 (2014), 1295--1306.
- Raúl Gracia-Tinedo, Josep Sampé, et al. 2017. Crystal: Software-defined storage for multi-tenant object stores. In Proceedings of the USENIX Conference on File and Storage Technologies.
- Bin He, Hui I. Hsiao, Ziyang Liu, Yu Huang, and Yi Chen. 2012. Efficient iceberg query evaluation using compressed bitmap index. IEEE Trans. Knowl. Data Eng. 24, 9 (2012), 1570--1583.
- Jianfeng Jia, Chen Li, and Michael J. Carey. 2017. Drum: A rhythmic approach to interactive analytics on large data. In Proceedings of the IEEE International Conference on Big Data.
- Martin Kaufmann and Donald Kossmann. 2013. Storing and processing temporal data in a main memory column store. PVLDB 6, 12 (2013), 1444--1449.
- Tirthankar Lahiri, Shasank Chavan, Maria Colgan, Dinesh Das, Amit Ganesh, Mike Gleeson, Sanket Hase, Allison Holloway, Jesse Kamp, and Teck Hua Lee. 2015. Oracle database in-memory: A dual format in-memory database. In Proceedings of the IEEE International Conference on Data Engineering. 1253--1258.
- Andrew Lamb, Matt Fuller, Ramakrishna Varadarajan, et al. 2012. The Vertica analytic database: C-Store 7 years later. PVLDB 5, 12 (2012), 1790--1801.
- Eunji Lee and Hyokyung Bahn. 2014. Caching strategies for high-performance storage media. ACM Trans. Storage 10, 3, Article 11 (Aug. 2014), 22 pages.
- Daniel Lemire, Robert Godin, et al. 2016. Optimizing Druid with Roaring bitmaps. In Proceedings of the International Database Engineering & Applications Symposium. 77--86.
- Hang Liu and H. Howie Huang. 2017. Graphene: Fine-grained IO management for graph computing. In Proceedings of the USENIX Conference on File and Storage Technologies. 285--299.
- Zhen Hua Liu, Beda Hammerschmidt, and Doug Mcmahon. 2014. JSON data management: Supporting schema-less development in RDBMS. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 1247--1258.
- Peng Lu, Sai Wu, Lidan Shou, and Kian-Lee Tan. 2013. An efficient and compact indexing scheme for large-scale data store. In Proceedings of the IEEE International Conference on Data Engineering. 326--337.
- Sagar S. Mane and M. Emmanuel. 2015. Review and comparative study of bitmap indexing techniques. Data Mining Knowl. Eng. 7, 1 (2015).
- Sergey Melnik, Andrey Gubarev, Jing Jing Long, et al. 2011. Dremel: Interactive analysis of web-scale datasets. Commun. ACM 54, 6 (2011), 114--123.
- Jan Paredaens and Dirk Van Gucht. 1988. Possibilities and limitations of using flat operators in nested algebra expressions. In Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. 29--38.
- H. B. Paul, H. J. Schek, and M. H. Scholl. 1987. Architecture and implementation of the Darmstadt database kernel system. In Proceedings of the ACM SIGMOD Conference. 196--207.
- Mark A. Roth, Henry F. Korth, and Abraham Silberschatz. 1988. Extended algebra and calculus for nested relational databases. ACM Trans. Datab. Syst. 13, 4 (1988), 389--417.
- Michael Rys and Gerhard Weikum. 1994. Heuristic optimization of speedup and benefit/cost for parallel database scans on shared-memory multiprocessors. In Proceedings of the International Parallel Processing Symposium. 894--901.
- Marc H. Scholl, H.-Bernhard Paul, and Hans-Jörg Schek. 1987. Supporting flat relations by a nested relational kernel. In Proceedings of the International Conference on Very Large Data Bases. 137--146.
- Anil Shanbhag, Alekh Jindal, Yi Lu, and Samuel Madden. 2016. Amoeba: A shape-changing storage system for big data. PVLDB 9, 13 (2016), 1569--1572.
- Jeff Shute, Radek Vingralek, et al. 2013. F1: A distributed SQL database that scales. PVLDB 6, 11 (2013), 1068--1079.
- Konstantin Shvachko, Hairong Kuang, Sanjay Radia, and Robert Chansler. 2010. The Hadoop distributed file system. In Proceedings of the IEEE Symposium on Mass Storage Systems and Technologies. 1--10.
- Laure Soulier and Lynda Tamine. 2017. On the collaboration support in information retrieval. ACM Comput. Surv. 50, 4, Article 51 (2017), 34 pages.
- Kurt Stockinger. 2001. Design and implementation of bitmap indices for scientific data. In Proceedings of the International Database Engineering and Applications Symposium. 47--57.
- Mike Stonebraker, Daniel J. Abadi, Adam Batkin, et al. 2005. C-Store: A column-oriented DBMS. In Proceedings of the International Conference on Very Large Data Bases. 553--564.
- Liwen Sun, Sanjay Krishnan, Reynold S. Xin, and Michael J. Franklin. 2014. A partitioning framework for aggressive data skipping. PVLDB 7, 13 (2014), 1617--1620.
- Yuliang Sun, Yu Wang, and Huazhong Yang. 2018. Bidirectional database storage and SQL query exploiting RRAM-based process-in-memory structure. ACM Trans. Storage 14, 1, Article 8 (March 2018), 19 pages.
- Daniel Tahara, Thaddeus Diamond, and Daniel J. Abadi. 2014. Sinew: A SQL system for multi-structured data. In Proceedings of the ACM SIGMOD Conference. 815--826.
- Aubrey L. Tatarowicz, Carlo Curino, Evan P. C. Jones, and Sam Madden. 2012. Lookup tables: Fine-grained partitioning for distributed databases. In Proceedings of the IEEE International Conference on Data Engineering. 102--113.
- Sebastian Wandelt, Dong Deng, Stefan Gerdjikov, et al. 2014. State-of-the-art in string similarity search and join. SIGMOD Record 43, 1 (2014), 64--76.
- Zhiyi Wang and Shimin Chen. 2017. Exploiting common patterns for tree-structured data. In Proceedings of the ACM SIGMOD Conference. 883--896.
- Brent Welch, Marc Unangst, Zainul Abbasi, et al. 2008. Scalable performance of the Panasas parallel file system. In Proceedings of the USENIX Conference on File and Storage Technologies. 2.
- Chin-Hsien Wu and Kuo-Yi Huang. 2015. Data sorting in flash memory. ACM Trans. Storage 11, 2, Article 7 (March 2015), 25 pages.
- Pengfei Xuan, Walter B. Ligon, Pradip K. Srimani, Rong Ge, and Feng Luo. 2016. Accelerating big data analytics on HPC clusters using two-level storage. Parallel Comput. 61 (2016).
- Atsuo Yoshitaka and Tadao Ichikawa. 1999. A survey on content-based retrieval for multimedia databases. IEEE Trans. Knowl. Data Eng. 11, 1 (1999), 81--93.
- Yuan Yu, Michael Isard, Dennis Fetterly, et al. 2009. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. 1--14.
- Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, et al. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. 2.
- Yansong Zhang, Xuan Zhou, Ying Zhang, et al. 2016. Virtual denormalization via array index reference for main memory OLAP. IEEE Trans. Knowl. Data Eng. 28, 4 (2016), 1061--1074.
CORES: Towards Scan-Optimized Columnar Storage for Nested Records