We try to keep an up-to-date list of all our publications. If you are interested in a PDF that we have not yet uploaded, feel free to send us an email to request a copy. All recent publications are listed below; for older ones, please click the appropriate year.
“Big data” has become a major force of innovation across enterprises of all sizes. New platforms with increasingly more features for managing big datasets are being announced almost weekly. Yet, there is currently a lack of any means of comparability among such platforms. While the performance of traditional database systems is well understood and measured by long-established institutions such as the Transaction Processing Performance Council (TPC), there is neither a clear definition of the performance of big data systems nor a generally agreed-upon metric for comparing these systems. In this article, we describe a community-based effort for defining a big data benchmark. Over the past year, a Big Data Benchmarking Community has formed to fill this void. The effort focuses on defining an end-to-end application-layer benchmark for measuring the performance of big data applications, with the ability to easily adapt the benchmark specification to evolving challenges in the big data space. This article describes the efforts undertaken thus far toward the definition of a BigData Top100 List. While highlighting the major technical as well as organizational challenges, we also solicit community input into this process through this article.
Variations of the Star Schema Benchmark to Test the Effects of Data Skew on Query Performance. Rabl, Tilmann; Poess, Meikel; Jacobsen, Hans-Arno; O'Neil, Patrick E.; O'Neil, Elizabeth J. (2013). 361-372.
The Star Schema Benchmark (SSB) has been widely used to evaluate the performance of database management systems when executing star schema queries. SSB, based on the well-known industry-standard benchmark TPC-H, shares some of its drawbacks, most notably its uniform data distributions. Today's systems rely heavily on sophisticated cost-based query optimizers to generate the most efficient query execution plans. A benchmark that evaluates an optimizer's capability to generate optimal execution plans under all circumstances must provide the rich data set details on which optimizers rely (uniform and non-uniform distributions, data sparsity, etc.). This is also true for other database system parts, such as indices and operators, and ultimately holds for an end-to-end benchmark as well. SSB's data generator, based on TPC-H's dbgen, is not easy to adapt to different data distributions, as its metadata and actual data generation implementations are not separated. In this paper, we motivate the need for a new revision of SSB that includes non-uniform data distributions. We list the specific modifications required to implement non-uniform data sets in SSB and demonstrate how to implement these modifications in the Parallel Data Generator Framework to generate both the data and query sets.
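To make the idea concrete, here is a minimal sketch of what separating distribution metadata from value generation can look like. The interface and class names are illustrative simplifications, not PDGF's actual API, and the Zipf sampler merely stands in for whatever skewed distribution an SSB variation prescribes.

```java
import java.util.Random;

// Hypothetical simplification: distribution metadata lives behind an
// interface, so a column can switch from uniform to skewed data without
// touching the value-generation code.
interface Distribution {
    int nextIndex(Random rng, int cardinality);
}

// Uniform distribution, as produced by SSB's and TPC-H's dbgen.
class UniformDistribution implements Distribution {
    public int nextIndex(Random rng, int cardinality) {
        return rng.nextInt(cardinality);
    }
}

// Zipf-like skew: low indices are drawn far more often, approximating
// the non-uniform column distributions the SSB variations call for.
class ZipfDistribution implements Distribution {
    private final double skew;
    ZipfDistribution(double skew) { this.skew = skew; }
    public int nextIndex(Random rng, int cardinality) {
        // Inverse-CDF sampling over unnormalized Zipf weights 1/rank^skew.
        double total = 0;
        for (int i = 1; i <= cardinality; i++) total += 1.0 / Math.pow(i, skew);
        double u = rng.nextDouble() * total, acc = 0;
        for (int i = 1; i <= cardinality; i++) {
            acc += 1.0 / Math.pow(i, skew);
            if (acc >= u) return i - 1;
        }
        return cardinality - 1;
    }
}

public class SkewedColumnDemo {
    public static void main(String[] args) {
        String[] regions = {"AMERICA", "ASIA", "EUROPE", "AFRICA", "MIDDLE EAST"};
        Distribution dist = new ZipfDistribution(1.5); // swap in UniformDistribution to compare
        Random rng = new Random(42);                   // fixed seed for repeatable data
        for (int i = 0; i < 10; i++) {
            System.out.println(regions[dist.nextIndex(rng, regions.length)]);
        }
    }
}
```

Because the distribution is just metadata handed to the generator, regenerating a skewed variant of a table means changing one configuration entry rather than rewriting the generator itself.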
A BigBench Implementation in the Hadoop Ecosystem. Chowdhury, Badrul; Rabl, Tilmann; Saadatpanah, Pooya; Du, Jiang; Jacobsen, Hans-Arno (2013). 3-18.
BigBench is the first proposal for an end-to-end big data analytics benchmark. It features a rich query set with complex, realistic queries. BigBench was developed based on the decision support benchmark TPC-DS. The first proof-of-concept implementation was built for the Teradata Aster parallel database system, with the queries formulated in the proprietary SQL-MR query language. To test other systems, the queries have to be translated. In this paper, an alternative implementation of BigBench for the Hadoop ecosystem is presented. All 30 queries of BigBench were realized using Apache Hive, Apache Hadoop, Apache Mahout, and NLTK. We present the different design choices we made and show a proof-of-concept evaluation.
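As a rough illustration of how queries translated to HiveQL can be driven against a Hadoop cluster, the following sketch submits a single query through Hive's standard JDBC interface. The host, table name, and query text are placeholders for the BigBench schema, not the paper's actual implementation.

```java
import java.sql.*;

// Minimal sketch: run one HiveQL query over JDBC and print the result.
// The table and query are hypothetical stand-ins, e.g., the top products
// by number of submitted online reviews.
public class HiveQueryDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT product_id, COUNT(*) AS num_reviews " +
                 "FROM product_reviews GROUP BY product_id " +
                 "ORDER BY num_reviews DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```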
BigBench: Towards an Industry Standard Benchmark for Big Data Analytics. Ghazal, Ahmad; Rabl, Tilmann; Hu, Minqing; Raab, Francois; Poess, Meikel; Crolotte, Alain; Jacobsen, Hans-Arno (2013). 1197-1208.
There is tremendous interest in big data from academia, industry, and a large user base. Several commercial and open-source providers have released a variety of products to support big data storage and processing. As these products mature, there is a need to evaluate and compare their performance. In this paper, we present BigBench, an end-to-end big data benchmark proposal. The underlying business model of BigBench is a product retailer. The proposal covers a data model and a synthetic data generator that address the variety, velocity, and volume aspects of big data systems containing structured, semi-structured, and unstructured data. The structured part of the BigBench data model is adopted from the TPC-DS benchmark and enriched with semi-structured and unstructured data components. The semi-structured part captures registered and guest user clicks on the retailer's website. The unstructured data captures product reviews submitted online. The data generator designed for BigBench provides scalable volumes of raw data based on a scale factor. The BigBench workload is designed around a set of queries against the data model. From a business perspective, the queries cover the different categories of big data analytics proposed by McKinsey. From a technical perspective, the queries are designed to span three different dimensions based on data sources, query processing types, and analytic techniques. We illustrate the feasibility of BigBench by implementing it on the Teradata Aster Database. The test includes generating and loading a 200 GB BigBench data set and testing the workload by executing the BigBench queries (written using Teradata Aster SQL-MR) and reporting their response times.
Rapid Development of Data Generators Using Meta Generators in PDGF. Rabl, Tilmann; Poess, Meikel; Danisch, Manuel; Jacobsen, Hans-Arno (2013). 1-6.
Generating data sets for the performance testing of database systems on a particular hardware configuration and application domain is a very time-consuming and tedious process: time consuming because of the large amount of data that needs to be generated, and tedious because new data generators might need to be developed or existing ones adjusted. The difficulty in generating this data is amplified by constant advances in hardware and software that allow the testing of ever larger and more complicated systems. In this paper, we present an approach for rapidly developing customized data generators. Our approach, which is based on the Parallel Data Generator Framework (PDGF), deploys a new concept of so-called meta generators. Meta generators extend the concept of column-based generators in PDGF. Deploying meta generators in PDGF significantly reduces the development effort of customized data generators, facilitates their debugging, and eases their maintenance.
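The composition idea behind meta generators can be sketched in a few lines. The interfaces below are hypothetical simplifications rather than PDGF's real classes: a meta generator wraps an existing column generator and transforms its output, so a customized column is built by composition instead of new generator code.

```java
import java.util.Random;
import java.util.function.Function;

// Simplified stand-in for a column-based generator.
interface FieldGenerator<T> {
    T nextValue(Random rng);
}

// A meta generator wraps an underlying generator with a transformation,
// producing a new column generator without modifying the base one.
class MetaGenerator<T, R> implements FieldGenerator<R> {
    private final FieldGenerator<T> base;
    private final Function<T, R> transform;
    MetaGenerator(FieldGenerator<T> base, Function<T, R> transform) {
        this.base = base;
        this.transform = transform;
    }
    public R nextValue(Random rng) {
        return transform.apply(base.nextValue(rng));
    }
}

public class MetaGeneratorDemo {
    public static void main(String[] args) {
        Random rng = new Random(7);
        // Base generator: random customer ids.
        FieldGenerator<Integer> ids = r -> r.nextInt(1000);
        // Meta generator: format ids as padded keys, reusing the base as-is.
        FieldGenerator<String> keys =
            new MetaGenerator<>(ids, id -> String.format("CUST-%05d", id));
        for (int i = 0; i < 5; i++) System.out.println(keys.nextValue(rng));
    }
}
```

Because the wrapped generator is untouched, debugging narrows to the transformation alone, which is the maintenance benefit the paper highlights.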
Application performance monitoring (APM) is shifting towards capturing and analyzing every event that arises in an enterprise infrastructure. Current APM systems, for example, make it possible to monitor enterprise applications at the granularity of tracing each method invocation (i.e., an event). Naturally, there is great interest in monitoring these events in real time to react to system and application failures, and in storing the captured information for an extended period of time to enable detailed system analysis, data analytics, and future auditing of trends in the historic data. However, the high insertion rates (up to millions of events per second) and the purposely limited resources dedicated to APM (typically only 1-2% of the overall system resources) are the key challenges for applying current data management solutions in this context. Emerging distributed key-value stores, often positioned to operate at this scale, induce additional storage overhead when dealing with relatively small data points (e.g., method invocation events) inserted at a rate of millions per second. Thus, they are not a promising solution for such an important class of workloads given APM's highly constrained resource budget. In this paper, to address these shortcomings, we present the Multi-layered, Adaptive, Distributed Event Store (MADES): a massively distributed store for collecting, querying, and storing event data at a rate of millions of events per second.
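A common ingredient of stores built for this regime, sketched below, is batching many small events before each write so that per-record overhead is amortized. The event shape, batch size, and sink are illustrative assumptions, not MADES internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a writer thread drains small monitoring events into batches
// and issues one write per batch instead of one per event.
public class BatchingIngestDemo {
    record Event(long timestamp, String method, long durationNanos) {}

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
        int batchSize = 1000;

        Thread writer = new Thread(() -> {
            List<Event> batch = new ArrayList<>(batchSize);
            try {
                while (true) {
                    batch.add(queue.take());             // block for the first event
                    queue.drainTo(batch, batchSize - 1); // then drain what is ready
                    flush(batch);                        // one write per batch
                    batch.clear();
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        writer.start();

        // Simulated method-invocation events arriving at a high rate.
        for (int i = 0; i < 10_000; i++) {
            queue.put(new Event(System.nanoTime(), "Service.call", 1200));
        }
        Thread.sleep(100);
        writer.interrupt();
    }

    static void flush(List<Event> batch) {
        // Placeholder for an append to the store's write path.
        System.out.println("flushed batch of " + batch.size() + " events");
    }
}
```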
This paper presents the design and implementation of a custom-built event processing engine called BlueBay, developed for live monitoring of soccer games. We experimentally evaluated our system using a real workload and report on its performance. Our results indicate that BlueBay achieves a throughput of up to 790k events per second, processing the game's input sensor stream about 60 times faster than real time. In addition to our custom implementation, we also investigated the applicability of off-the-shelf general-purpose event processing engines to address the soccer monitoring problem. This effort resulted in two additional, fully functional implementations based on Esper and Storm.
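To give a flavor of the per-event computations such an engine performs, here is a small sliding-window sketch that estimates a player's current speed from position events. The event layout, window length, and units are assumptions for illustration, not BlueBay's actual operator pipeline.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: keep a one-second window of position events and report the
// straight-line displacement over that window as an average speed.
public class SpeedWindowDemo {
    record Position(long timestampNanos, double x, double y) {}

    private final Deque<Position> window = new ArrayDeque<>();
    private final long windowNanos = 1_000_000_000L; // 1-second sliding window

    // Returns average speed in meters per second over the current window.
    double onEvent(Position p) {
        window.addLast(p);
        // Evict events that have fallen out of the window.
        while (window.peekFirst().timestampNanos < p.timestampNanos - windowNanos) {
            window.removeFirst();
        }
        Position first = window.peekFirst();
        double dist = Math.hypot(p.x - first.x, p.y - first.y);
        double dt = (p.timestampNanos - first.timestampNanos) / 1e9;
        return dt > 0 ? dist / dt : 0.0;
    }

    public static void main(String[] args) {
        SpeedWindowDemo demo = new SpeedWindowDemo();
        long t = 0;
        for (int i = 0; i < 5; i++) {
            t += 200_000_000L; // 200 ms between sensor readings
            System.out.printf("speed: %.2f m/s%n",
                demo.onEvent(new Position(t, i * 1.0, 0.0)));
        }
    }
}
```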