 Volker Markl

Bibliometrics: publication history
Average citations per article: 6.66
Citation Count: 526
Publication count: 79
Publication years: 2011-2019
Available for download: 73
Average downloads per article: 310.21
Downloads (cumulative): 22,645
Downloads (12 Months): 3,417
Downloads (6 Weeks): 443

44 results found

Result 1 – 20 of 44

1 published by ACM
May 2019 ACM SIGMOD Record: Volume 47 Issue 4, December 2018
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 3,   Downloads (12 Months): 3,   Downloads (Overall): 3

Full text available: PDF
Data management systems research at TU Berlin is spearheaded by the Database Systems and Information Management (DIMA) Group, the Big Data Management (BigDaMa) Group, as well as the affiliated Intelligent Analytics for Massive Data (IAM) Research Group at the German Research Center for Artificial Intelligence (DFKI). Jointly, our research ...

2 published by ACM
January 2019 ACM Transactions on Database Systems (TODS) - Best of EDBT 2017, Best of SIGMOD 2016 and Regular Papers: Volume 44 Issue 1, January 2019
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 15,   Downloads (12 Months): 16,   Downloads (Overall): 16

Full text available: HTML, PDF
Parallel dataflow engines such as Apache Hadoop, Apache Spark, and Apache Flink are an established alternative to relational databases for modern data analysis applications. A characteristic of these systems is a scalable programming model based on distributed collections and parallel transformations expressed by means of second-order functions such as map ...
Keywords: MapReduce, Parallel dataflows, monad comprehensions

3
December 2018 The VLDB Journal — The International Journal on Very Large Data Bases: Volume 27 Issue 6, December 2018
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 14,   Downloads (12 Months): 15,   Downloads (Overall): 15

Full text available: PDF
The concept of state and its applications vary widely across big data processing systems. This is evident in both the research literature and existing systems, such as Apache Flink, Apache Heron, Apache Samza, Apache Spark, and Apache Storm. Given the pivotal role that state management plays, particularly, for iterative batch ...
Keywords: Big data processing systems, State management, Survey

4
December 2018 The VLDB Journal — The International Journal on Very Large Data Bases: Volume 27 Issue 6, December 2018
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 4,   Downloads (12 Months): 4,   Downloads (Overall): 4

Full text available: PDF
Processor manufacturers build increasingly specialized processors to mitigate the effects of the power wall in order to deliver improved performance. Currently, database engines have to be manually optimized for each processor, which is a costly and error-prone process. In this paper, we propose concepts to adapt to and to ...
Keywords: CPU, Code generation, Code variants, Database query processing, Database systems, GPU, Heterogeneous processors, MIC, Query compilation, Variant optimization

5 published by ACM
October 2018 SoCC '18: Proceedings of the ACM Symposium on Cloud Computing
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 6,   Downloads (12 Months): 79,   Downloads (Overall): 79

Full text available: PDF
To cope with today's large scale of data, parallel dataflow engines such as Hadoop, and more recently Spark and Flink, have been proposed. They offer scalability and performance, but require data scientists to develop analysis pipelines in unfamiliar programming languages and abstractions. To overcome this hurdle, dataflow engines have introduced ...
Keywords: Data Exchange, Dataflow Engines, Language Integration

6 published by ACM
June 2018 DEBS '18: Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 9,   Downloads (12 Months): 144,   Downloads (Overall): 144

Full text available: PDF
The global database research community has greatly impacted the functionality and performance of data storage and processing systems along the dimensions that define "big data", i.e., volume, velocity, variety, and veracity. Locally, over the past five years, we have also been working on varying fronts. Among our contributions are: (1) ...
Keywords: Apache Flink, big data, data science, declarative languages, federation, heterogeneous data management

7 published by ACM
June 2018 DAMON '18: Proceedings of the 14th International Workshop on Data Management on New Hardware
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 10,   Downloads (12 Months): 12,   Downloads (Overall): 12

Full text available: PDF
k-Means is a versatile clustering algorithm widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data-to-knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU. We show that this approach has two main drawbacks. First, ...

8 published by ACM
May 2018 SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 21,   Downloads (12 Months): 210,   Downloads (Overall): 210

Full text available: PDF
Query processing on GPU-style coprocessors is severely limited by the movement of data. With teraflops of compute throughput in one device, even high-bandwidth memory cannot provision enough data for a reasonable utilization. Query compilation is a proven technique to improve memory efficiency. However, its inherent tuple-at-a-time processing style does not ...
Keywords: just-in-time query compilation, massively parallel query processing, olap, operator pipelining, query-coprocessing

9 published by ACM
September 2017 SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 7,   Downloads (12 Months): 114,   Downloads (Overall): 339

Full text available: PDF
Real-time sensor data enables diverse applications such as smart metering, traffic monitoring, and sport analysis. In the Internet of Things, billions of sensor nodes form a sensor cloud and offer data streams to analysis systems. However, it is impossible to transfer all available data with maximal frequencies to all applications. ...
Keywords: adaptive sampling, on-demand streaming, oversampling, real-time analysis, sensor data, sensor sharing, user-defined sampling

10 published by ACM
May 2017 BeyondMR'17: Proceedings of the 4th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 4,   Downloads (12 Months): 74,   Downloads (Overall): 323

Full text available: PDF
Distributed data flow systems such as Apache Spark or Apache Flink are popular choices for scaling machine learning algorithms in production. Industry applications of large scale machine learning such as click-through rate prediction rely on models trained on billions of data points which are both highly sparse and high-dimensional. Existing ...

11
February 2017 The VLDB Journal — The International Journal on Very Large Data Bases: Volume 26 Issue 1, February 2017
Publisher: Springer-Verlag New York, Inc.
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 2,   Downloads (Overall): 2

Full text available: PDF

12
February 2017 The VLDB Journal — The International Journal on Very Large Data Bases: Volume 26 Issue 1, February 2017
Publisher: Springer-Verlag New York, Inc.
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 9,   Downloads (12 Months): 29,   Downloads (Overall): 29

Full text available: PDF

13 published by ACM
October 2016 CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 11,   Downloads (12 Months): 36,   Downloads (Overall): 223

Full text available: PDF
Aggregation queries on data streams are evaluated over evolving and often overlapping logical views called windows. While the aggregation of periodic windows was extensively studied in the past through the use of aggregate sharing techniques such as Panes and Pairs, little to no work has been put in optimizing the ...
Keywords: data stream aggregation, data stream optimisation, data stream processing, data stream windows, data streams, data structures, databases, functional programming, operator sharing, programming models, user-defined functions

14 published by ACM
July 2016 SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 3
Downloads (6 Weeks): 8,   Downloads (12 Months): 54,   Downloads (Overall): 262

Full text available: PDF
Mathematical formulae are essential in science, but face challenges of ambiguity, due to the use of a small number of identifiers to represent an immense number of concepts. Corresponding to word sense disambiguation in Natural Language Processing, we disambiguate mathematical identifiers. By regarding formulae and natural text as one monolithic ...
Keywords: MIR, MLP, definitions, identifiers, mathematical information retrieval, mathematical knowledge management, mathematical language processing, mathematics, mathoid, mathosphere, namespace discovery, wikipedia

15 published by ACM
June 2016 SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 3,   Downloads (12 Months): 27,   Downloads (Overall): 232

Full text available: PDF
Parallel dataflow APIs based on second-order functions were originally seen as a flexible alternative to SQL. Over time, however, their complexity increased due to the number of physical aspects that had to be exposed by the underlying engines in order to facilitate efficient execution. To retain a sufficient level of ...
Keywords: data-parallel execution, emma, large-scale data analysis, mapreduce, monad comprehensions, parallel dataflows, scala macros

16 published by ACM
June 2016 BeyondMR '16: Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond
Publisher: ACM
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 3,   Downloads (12 Months): 25,   Downloads (Overall): 225

Full text available: PDF
Advanced data analysis typically requires some form of pre-processing in order to extract and transform data before processing it with machine learning and statistical analysis techniques. Pre-processing pipelines are naturally expressed in dataflow APIs (e.g., MapReduce, Flink, etc.), while machine learning is expressed in linear algebra with iterations. Programmers therefore ...

17 published by ACM
June 2016 JCDL '16: Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 7,   Downloads (12 Months): 86,   Downloads (Overall): 86

Full text available: PDF
Literature recommender systems support users in filtering the vast and increasing number of documents in digital libraries and on the Web. For academic literature, research has proven the ability of citation-based document similarity measures, such as Co-Citation (CoCit), or Co-Citation Proximity Analysis (CPA) to improve recommendation quality. In this paper, ...
Keywords: citation analysis, co-citation, digital libraries, large-scale evaluations, literature recommendations, link-based, co-citation proximity analysis, document similarity measures, big data

18 published by ACM
June 2016 ACM SIGMOD Record: Volume 45 Issue 1, March 2016
Publisher: ACM
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 3,   Downloads (12 Months): 18,   Downloads (Overall): 127

Full text available: PDF
Parallel collection processing based on second-order functions such as map and reduce has been widely adopted for scalable data analysis. Initially popularized by Google, over the past decade this programming paradigm has found its way into the core APIs of parallel dataflow engines such as Hadoop's MapReduce, Spark's RDDs, and ...

19
February 2016 The VLDB Journal — The International Journal on Very Large Data Bases: Volume 25 Issue 1, February 2016
Publisher: Springer-Verlag New York, Inc.
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 4,   Downloads (12 Months): 46,   Downloads (Overall): 65

Full text available: PDF
Contemporary RDBMS-based systems for visualization of high-volume numerical data have difficulty coping with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the volume of large data sets disregard the spatial properties of visualizations, resulting in visualization errors. In this work, we introduce ...
Keywords: Line rasterization, Overplotting, Data aggregation, Dimensionality reduction, Data visualization, Relational databases, Visual aggregation

20 published by ACM
August 2015 SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 6
Downloads (6 Weeks): 3,   Downloads (12 Months): 34,   Downloads (Overall): 174

Full text available: PDF
Mathematical Information Retrieval concerns retrieving information related to a particular mathematical concept. The NTCIR-11 Math Task develops an evaluation test collection for document sections retrieval of scientific articles based on human generated topics. Those topics involve a combination of formula patterns and keywords. In addition, the optional Wikipedia Task provides ...
Keywords: MIR, NTCIR, benchmark, dataset, lateXML, math information retrieval, math search, mathML, mathoid, task, wikipedia



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2019 ACM, Inc.