Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Anja Gruenheid

Affiliation: Gray Systems Lab
Title: System Benchmarking And Why It’s Important - An Industry Perspective

 

Abstract

Benchmarking is essential for evaluating system performance and functionality in both academic and industrial settings. This talk presents LST-Bench, a general-purpose, open-source benchmarking framework for log-structured tables (LSTs) that supports flexible workload modeling, robust performance evaluation, and seamless integration with cloud-native systems. To demonstrate the practical value of comprehensive benchmarking with LST-Bench, we examine the issue of small file proliferation in LinkedIn's data lake. This case study illustrates why systematic benchmarking should be a foundational practice rather than an afterthought: it helps address the growing complexity of modern data systems and ensures their correctness, performance, and maintainability at scale.

Short CV

Anja Gruenheid is a research scientist at the Gray Systems Lab (GSL) at Microsoft, a research team affiliated with Azure Data. Recently, she has been working on understanding, benchmarking, and enhancing large-scale data lake systems. Before joining Microsoft, Anja worked at Google, where her implementation of a benchmarking framework earned a Best Industrial Paper Award from VLDB. She holds a PhD from ETH Zurich.