
ABSTRACT

Dremel is a scalable, interactive ad hoc query system for analysis of read-only nested data. By combining multilevel execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data, and has thousands of users at Google. In this paper, we describe the architecture and implementation of Dremel, and explain how it complements MapReduce-based computing. We present a novel columnar storage representation for nested records and discuss experiments on few-thousand node instances of the system.
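
As a rough illustration of the two ideas named in the abstract, the following Python sketch stores each field as its own column and merges partial aggregates up a tree of servers. It is a minimal sketch under stated assumptions, not Dremel's implementation: the names Partial, leaf_scan, and aggregate_tree, the fanout parameter, and the sample shards are all hypothetical, and nested records are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Partial:
    """Partial aggregate (row count and sum) that can be merged associatively."""
    count: int = 0
    total: int = 0

    def merge(self, other: "Partial") -> "Partial":
        return Partial(self.count + other.count, self.total + other.total)


def leaf_scan(shard: Dict[str, List[int]], column: str) -> Partial:
    """Hypothetical leaf server: reads only the requested column of one shard."""
    values = shard[column]
    return Partial(count=len(values), total=sum(values))


def aggregate_tree(shards: List[Dict[str, List[int]]], column: str,
                   fanout: int = 2) -> Partial:
    """Merge leaf results level by level until a single root result remains."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = [leaf_scan(shard, column) for shard in shards]
    while len(level) > 1:
        merged = []
        # Each intermediate node combines the results of up to `fanout` children.
        for i in range(0, len(level), fanout):
            acc = Partial()
            for partial in level[i:i + fanout]:
                acc = acc.merge(partial)
            merged.append(acc)
        level = merged
    return level[0] if level else Partial()


# Hypothetical columnar shards: every field is kept as its own list of values.
shards = [
    {"latency_ms": [10, 20], "bytes": [100, 200]},
    {"latency_ms": [30], "bytes": [300]},
    {"latency_ms": [40, 50, 60], "bytes": [400, 500, 600]},
]
result = aggregate_tree(shards, "latency_ms")
print(result.count, result.total)
```

Because the merge step is associative, the tree depth and fanout can change without affecting the result, and a query that touches one column never reads the others.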


AUTHORS



Sergey Melnik

melnik at google.com


Andrey Gubarev



Jing Jing Long



Geoffrey Romer



Shiva Shivakumar



Matt Tolton



Theo Vassilakis




PUBLICATION

Title: Communications of the ACM
Volume 54 Issue 6, June 2011
Pages: 114-123
Publication Date: 2011-06-01 (yyyy-mm-dd)
Publisher: ACM, New York, NY, USA
ISSN: 0001-0782 EISSN: 1557-7317 doi>10.1145/1953122.1953148


Table of Contents

Communications of the ACM

Volume 54 Issue 6, June 2011

Computing and India
P. J. Narayanan, Anand Deshpande
Pages: 5-5
doi>10.1145/1953122.1953123
DEPARTMENT: Letters to the editor
Why concurrent objects are recurrently complicated
CACM Staff
Pages: 6-6
doi>10.1145/1953122.1953124
In the Virtual Extension
CACM Staff
Pages: 7-7
doi>10.1145/1953122.1953125

To ensure the timely publication of articles, Communications created the Virtual Extension (VE) to expand the page limitations of the print edition by bringing readers the same high-quality articles in an online-only format. VE articles undergo ...
DEPARTMENT: blog@CACM
Simple design; research vs. teaching; and quest to learn
Daniel Reed, Mark Guzdial, Judy Robertson
Pages: 8-9
doi>10.1145/1953122.1953126

The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the blog@CACM community. In each issue of Communications, we'll publish selected posts or excerpts. Follow us on Twitter at http://twitter.com/blogCACM. http://cacm.acm.org/blogs/blog-cacm Daniel ...
DEPARTMENT: CACM online
Say it with video
Scott E. Delman
Pages: 10-10
doi>10.1145/1953122.1953127
COLUMN: News
Biology-inspired networking
Kirk L. Kroeker
Pages: 11-13
doi>10.1145/1953122.1953128

Researchers have developed a new networking algorithm, modeled after the neurological development of the fruit fly, to help distributed networks self-organize more efficiently.
Beauty and elegance
Gary Anthes
Pages: 14-15
doi>10.1145/1953122.1953131

Leslie Valiant talks about machine learning, parallel computing, and his quest for simplicity.
The promise of flexible displays
Tom Geller
Pages: 16-18
doi>10.1145/1953122.1953130

New screen materials could lead to portable devices that are anything but rectangular, flat, and unbendable.
Unlimited possibilities
Gregory Goth
Pages: 19-19
doi>10.1145/1953122.1953132

M. Frans Kaashoek discusses systems work, "undo computing," and what he learned from Andrew S. Tanenbaum.
All the news that's fit for you
Marina Krakovsky
Pages: 20-21
doi>10.1145/1953122.1953129

Personalized news promises to make daily journalism profitable again, but technical and cultural obstacles have slowed the industry's adoption of automated personalization.
COLUMN: Privacy and security
Identity management and privacy: a rare opportunity to get it right
Ari Schwartz
Pages: 22-24
doi>10.1145/1953122.1953134

The National Strategy for Trusted Identities in Cyberspace represents a shift in the way the U.S. government is approaching identity management, privacy, and the Internet.
COLUMN: The profession of IT
Who are we---now?
Peter J. Denning, Dennis J. Frailey
Pages: 25-27
doi>10.1145/1953122.1953133

Considerable progress has been made toward the formation of a computing profession since we started tracking it in this column a decade ago.
COLUMN: The business of software
Practical application of theoretical estimation
Phillip G. Armour
Pages: 28-30
doi>10.1145/1953122.1953135

Estimation models at the extreme.
COLUMN: Inside risks
The risks of stopping too soon
David Lorge Parnas
Pages: 31-33
doi>10.1145/1953122.1953136

Good software design is never easy, but stopping too soon makes the job more difficult.
COLUMN: Kode Vicious
Think before you fork
George V. Neville-Neil
Pages: 34-35
doi>10.1145/1953122.1953137

KV's thoughts on forking, config files, and using internal wikis.
COLUMN: Viewpoint
Computer science can use more science
Clayton T. Morrison, Richard T. Snodgrass
Pages: 36-38
doi>10.1145/1953122.1953139

Software developers should use empirical methods to analyze their designs to predict how working systems will behave.
SECTION: Practice
If you have too much data, then 'good enough' is good enough
Pat Helland
Pages: 40-47
doi>10.1145/1953122.1953140

In today's humongous database systems, clarity may be relaxed, but business needs can still be met.
Scalable SQL
Michael Rys
Pages: 48-53
doi>10.1145/1953122.1953141

How do large-scale sites and applications remain SQL-based?
Does deterrence work in reducing information security policy abuse by employees?
Qing Hu, Zhengchuan Xu, Tamara Dinev, Hong Ling
Pages: 54-60
doi>10.1145/1953122.1953142

Methods for evaluating and effectively managing the security behavior of employees.
SECTION: Contributed articles
Advancing the state of home networking
W. Keith Edwards, Rebecca E. Grinter, Ratul Mahajan, David Wetherall
Pages: 62-71
doi>10.1145/1953122.1953143

Before building the network or its components, first understand the home and the behavior of its human inhabitants.
10 rules for scalable performance in 'simple operation' datastores
Michael Stonebraker, Rick Cattell
Pages: 72-80
doi>10.1145/1953122.1953144

Partition data and operations, keep administration simple, do not assume one size fits all.
Specification and verification: the Spec# experience
Mike Barnett, Manuel Fähndrich, K. Rustan M. Leino, Peter Müller, Wolfram Schulte, Herman Venter
Pages: 81-91
doi>10.1145/1953122.1953145

Can a programming language really help programmers write better programs?
SECTION: Contributed articles: Virtual extension
Viscous democracy for social networks
Paolo Boldi, Francesco Bonchi, Carlos Castillo, Sebastiano Vigna
Pages: 129-137
doi>10.1145/1953122.1953154

Decision-making procedures in online social networks should reflect participants' political influence within the network.
Wireless on the precipice: The 14th century revisited
Denise Mcmanus, Houston Carr, Benjamin Adams
Pages: 138-143
doi>10.1145/1953122.1953155

Business continuity plans for the wireless world must address solar activity.
SECTION: Review articles
PageRank: standing on the shoulders of giants
Massimo Franceschet
Pages: 92-101
doi>10.1145/1953122.1953146

The roots of Google's PageRank can be traced back to several early, and equally remarkable, ranking techniques.
SECTION: Research highlights
The quest for a logic for polynomial-time computation: technical perspective
Phokion G. Kolaitis
Pages: 103-103
doi>10.1145/1953122.1953149
From polynomial time queries to graph structure theory
Martin Grohe
Pages: 104-112
doi>10.1145/1953122.1953150

We give a logical characterization of the polynomial-time properties of graphs with excluded minors: For every class C of graphs such that some graph H is not a minor of any graph in C, a property P of graphs in C is ...
Data analysis at astonishing speed: technical perspective
Michael J. Franklin
Pages: 113-113
doi>10.1145/1953122.1953147
Dremel: interactive analysis of web-scale datasets
Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis
Pages: 114-123
doi>10.1145/1953122.1953148

Dremel is a scalable, interactive ad hoc query system for analysis of read-only nested data. By combining multilevel execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system ...
COLUMN: Last byte
Puzzled: Solutions and sources
Peter Winkler
Pages: 126-126
doi>10.1145/1953122.1953151

Last month (May 2011, p. 120) we posted a trio of brainteasers, including one as yet unsolved, concerning games and their roles and turns, with randomness either removed or inserted. Here, we offer solutions to two of them. How did you do?
Q&A: A lifelong learner
Leah Hoffmann
Pages: 128-ff
doi>10.1145/1953122.1953152

Leslie Valiant discusses machine learning, parallel computing, and computational neuroscience.
