Publications

We try to keep an up-to-date list of all our publications. If you are interested in a PDF that we have not uploaded yet, feel free to email us for a copy. Recent publications are listed below. For older ones, please select the appropriate year.

Publications of the years 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007

2022

Benson, L., Papke, L., Rabl, T.: PerMA-Bench: Benchmarking Persistent Memory Access. Proceedings of the VLDB Endowment. 15, 2463–2476 (2022).

Gévay, G.E., Rabl, T., Breß, S., Madai-Tahy, L., Quiané-Ruiz, J.-A., Markl, V.: Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds? ACM SIGMOD Record. 51, 1–8 (2022).

Strassenburg, N., Tolovski, I., Rabl, T.: Efficiently Managing Deep Learning Models in a Distributed Environment. 25th International Conference on Extending Database Technology (EDBT ’22) (2022).

Maltenberger, T., Lehmann, T., Benson, L., Rabl, T.: Evaluating In-Memory Hash-Joins on Persistent Memory. 25th International Conference on Extending Database Technology (EDBT ’22) (2022).

Lutz, C., Breß, S., Zeuch, S., Rabl, T., Markl, V.: Triton Join: Efficiently Scaling the Operator State on GPUs with Fast Interconnects. ACM SIGMOD International Conference on Management of Data (SIGMOD ’22) (2022).

Del Monte, B., Zeuch, S., Rabl, T., Markl, V.: Rethinking Stateful Stream Processing with RDMA. ACM SIGMOD International Conference on Management of Data (SIGMOD ’22) (2022).

Maltenberger, T., Ilic, I., Tolovski, I., Rabl, T.: Evaluating Multi-GPU Sorting with Modern Interconnects. 2022 ACM SIGMOD International Conference on Management of Data (SIGMOD ’22) (2022).

@inproceedings{maltenberger2022evaluating,
  abstract = {In recent years, GPUs have become a mainstream accelerator for database operations such as sorting. Most of the published GPU-based sorting algorithms are single-GPU approaches. Consequently, they neither harness the full computational power nor exploit the high-bandwidth P2P interconnects of modern multi-GPU platforms. In particular, the latest NVLink 2.0 and NVLink 3.0-based NVSwitch interconnects promise unparalleled multi-GPU acceleration. Regarding multi-GPU sorting, there are two types of algorithms: GPU-only approaches, utilizing P2P interconnects, and heterogeneous strategies that employ the CPU and the GPUs. So far, both types have been evaluated at a time when PCIe 3.0 was state-of-the-art. In this paper, we conduct an extensive analysis of serial, parallel, and bidirectional data transfer rates to, from, and between multiple GPUs on systems with PCIe 3.0, PCIe 4.0, NVLink 2.0, and NVLink 3.0-based NVSwitch interconnects. We measure up to 35.3× higher parallel P2P copy throughput with NVLink 3.0-powered NVSwitch over PCIe 3.0 interconnects. To study multi-GPU sorting on today’s hardware, we implement a P2P-based (P2P sort) and a heterogeneous (HET sort) multi-GPU sorting algorithm and evaluate them on three modern systems. We observe speedups over state-of-the-art parallel CPU-based radix sort of up to 14× for P2P sort and 9× for HET sort. On systems with high-speed P2P interconnects, we demonstrate that P2P sort outperforms HET sort by up to 1.65×. Finally, we show that overlapping GPU copy and compute operations to mitigate the transfer bottleneck does not yield performance improvements on modern multi-GPU platforms.},
  author = {Maltenberger, Tobias and Ilic, Ivan and Tolovski, Ilin and Rabl, Tilmann},
  booktitle = {2022 ACM SIGMOD International Conference on Management of Data (SIGMOD ’22)},
  keywords = {evaluation gpu interconnect nvlink pcie sigmod sorting},
  title = {Evaluating Multi-GPU Sorting with Modern Interconnects},
  year = 2022
}

Damme, P., Birkenbach, M., Bitsakos, C., Boehm, M., Bonnet, P., Ciorba, F., Dokter, M., Dowgiallo, P., Eleliemy, A., Faerber, C., Goumas, G., Habich, D., Hedam, N., Hofer, M., Huang, W., Innerebner, K., Karakostas, V., Kern, R., Kosar, T., Krause, A., Krems, D., Laber, A., Lehner, W., Mier, E., Rabl, T., Ratuszniak, P., Silva, P., Skuppin, N., Starzacher, A., Steinwender, B., Tolovski, I., Tözün, P., Ulatowski, W., Wang, Y., Wrosz, I., Zamuda, A., Zhang, C., Xiang Zhu, X.: DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines. 12th Annual Conference on Innovative Data Systems Research (CIDR ’22) (2022).

@inproceedings{damme2022daphne,
  abstract = {Integrated data analysis (IDA) pipelines—that combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring—become increasingly common in practice. Interestingly, systems of these areas share many compilation and runtime techniques, and the used—increasingly heterogeneous—hardware infrastructure converges as well. Yet, the programming paradigms, cluster resource management, data formats and representations, as well as execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage for increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines, describe the overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, as well as local and distributed operations. Preliminary experiments that compare DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.},
  author = {Damme, Patrick and Birkenbach, Marius and Bitsakos, Constatinos and Boehm, Matthias and Bonnet, Philippe and Ciorba, Florina and Dokter, Mark and Dowgiallo, Pawel and Eleliemy, Ahmed and Faerber, Christian and Goumas, Georgios and Habich, Dirk and Hedam, Niclas and Hofer, Marlies and Huang, Wenjun and Innerebner, Kevin and Karakostas, Vasileios and Kern, Roman and Kosar, Tomaž and Krause, Alexander and Krems, Daniel and Laber, Andreas and Lehner, Wolfgang and Mier, Eric and Rabl, Tilmann and Ratuszniak, Piotr and Silva, Pedro and Skuppin, Nikolai and Starzacher, Andreas and Steinwender, Benjamin and Tolovski, Ilin and Tözün, Pinar and Ulatowski, Wojciech and Wang, Yuanyuan and Wrosz, Izajasz and Zamuda, Aleš and Zhang, Ce and Xiang Zhu, Xiao},
  booktitle = {12th Annual Conference on Innovative Data Systems Research (CIDR ’22)},
  keywords = {cidr daphne dataanalysis pipelines},
  title = {DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines},
  year = 2022
}

Benson, L., Rabl, T.: Darwin: Scale-In Stream Processing. 12th Annual Conference on Innovative Data Systems Research (CIDR ’22) (2022).

@inproceedings{benson_darwin_2022,
  abstract = {Companies increasingly rely on stream processing engines (SPEs) to quickly analyze data and monitor infrastructure. These systems enable continuous querying of data at high rates. Current production-level systems, such as Apache Flink and Spark, rely on clusters of servers to scale out processing capacity. Yet, these scale-out systems are resource inefficient and cannot fully utilize the hardware. As a solution, hardware-optimized, single-server, scale-up SPEs were developed. To get the best performance, they neglect essential features for industry adoption, such as larger-than-memory state and recovery. This requires users to choose between high performance or system availability. While some streaming workloads can afford to lose or reprocess large amounts of data, others cannot, forcing them to accept lower performance. Users also face a large performance drop once their workloads slightly exceed a single server and force them to use scale-out SPEs. To acknowledge that real-world stream processing setups have drastically varying performance and availability requirements, we propose scale-in processing. Scale-in processing is a new paradigm that adapts to various application demands by achieving high hardware utilization on a wide range of single- and multi-node hardware setups, reducing overall infrastructure requirements. In contrast to scaling-up or -out, it focuses on fully utilizing the given hardware instead of demanding more or ever-larger servers. We present Darwin, our scale-in SPE prototype that tailors its execution towards arbitrary target environments through compiling stream processing queries while recoverable larger-than-memory state management. Early results show that Darwin achieves an order of magnitude speed-up over current scale-out systems and matches processing rates of scale-up systems.},
  author = {Benson, Lawrence and Rabl, Tilmann},
  booktitle = {12th Annual Conference on Innovative Data Systems Research (CIDR ’22)},
  keywords = {cidr myown streamprocessing},
  title = {Darwin: Scale-In Stream Processing},
  year = 2022
}
