Hasso-Plattner-Institut
Prof. Dr. h.c. mult. Hasso Plattner
 

Thomas Bodner

Research Assistant, PhD Student

Email: thomas.bodner(at)hpi.de
Phone: +49 (331) 5509 - 3934
Address: August-Bebel-Str. 88, 14482 Potsdam
Office: Campus II, F-1.06
Office Hours: Just stop by or mail/call ahead for an appointment
Profiles: DBLP, Google Scholar, ResearchGate, GitHub, LinkedIn

>> I am now a member of the HPI DES group. You can find my up-to-date profile here.


Research Area: Autonomous Data Management

I am a Computer Science PhD student in the Data Engineering Systems Group at HPI, co-advised by Tilmann Rabl and Hasso Plattner. The goal of my research is to make data systems cheaper and faster by exploiting the unique capabilities of modern cloud environments. I am particularly interested in all aspects of query processing on serverless cloud infrastructure. Before joining HPI, I built database systems at SAP, TU Berlin, UC Irvine, and IBM.

Research

Elastic Query Processing on Serverless Cloud Infrastructure

Enterprises increasingly run the applications supporting their business processes in the cloud. Application data residing in the cloud increase the importance of cloud-based analytical workloads, which require provisioned infrastructure before any query processing can begin. Resource provisioning can be difficult for these workloads because they are often unpredictable and ad hoc in nature. To avoid performance disruptions caused by insufficient resources, overprovisioning is the norm, at the expense of cost-efficiency.

Recently, cloud providers have introduced means to allocate and bill fine-grained units of resources with function-as-a-service (FaaS) compute platforms and shared object storage systems. We evaluate this so-called serverless infrastructure regarding its performance elasticity and variability. Based on our findings, we build the Skyrise serverless query processor, which inherits the elastic scalability of its underlying FaaS infrastructure while dealing with its limitations and inefficiencies. Skyrise enables cost-efficient, interactive analytics on infrequently accessed data, a workload for which conventionally provisioned database systems are idle most of the time and thus not viable.

Teaching

Lectures:

  • Trends and Concepts in the Software Industry I (2021, 2020, 2019, 2018)

Seminars:

  • Joint Database Systems Seminar with TU Darmstadt (2022)
  • Research and Implementation of Database Concepts (2022, 2021, 2020)
  • Develop your own Database (2022, 2019)
  • Trends and Concepts in the Software Industry II (2020)

Bachelor's Projects:

Master's Projects:

  • Building an Elastic Query Engine on Serverless Cloud Infrastructure (2021)
  • Performance Engineering for Cloud-based Database Systems (2020)

Master's Theses:

  • Cost-aware Pruning with Filters in Serverless Data Management (Timon Millich, 2022)
  • Serverless Maintenance of Database Statistics and Cached Query Results (Pascal Schulze, 2022)
  • Cost-efficiency and Robustness in Serverless Join Processing (David Justen, 2021)
  • Query Compilation for Distributed Execution with Cloud Functions (Julian Menzler, 2021)
  • Straggler Mitigation in Distributed Query Execution on Cloud Functions (Fabian Engel, 2021)
  • Elastic Query Execution via Short-lived and Stateless Cloud Functions (Jan Mensch, Universität Potsdam, 2020)
  • Network Request Handling in Database Systems (Toni Stachewicz, 2019)
  • Data-dependent Implicit Authorizations for Fine-grained Database Access Control (Dennis Hempfing, 2018)
  • Pushing Down User-defined Functionality in Distributed Log-centric Big Data Stacks (Josephine Rückert, Technische Universität Ilmenau, 2017)

Selected Talks & Presentations

  • The Vora Big Data Management System, Lecture at Universität Augsburg, July 2015
  • A Taxonomy of Platforms for Analytics on Big Data, Talk Series at UC San Diego, UC Irvine, IBM Research, HP Labs, Stanford University, and Oracle Labs, October 2012
  • The Stratosphere Parallel Analysis Framework, Poster Talk at HPTS 2011, October 2011

Patents

Berg, G., Wenz, A., Hoeppner, B., Bodner, T., Cherepanova, O., Steffen, L., Siebert, J., Hennemann, D., Schulze, P., Dobler, K., Kahl, K., Beneke, P., Hoberg, P.:
Generation of Bots Based on Observed Behavior
US 2020

Bumbulis, P., Pound, J., Auch, N., Goel, A., Ringwald, M., Bodner, T., MacLean, S.:
An Algorithm for Consistent Replication of Log-Structured Data
EU 2017 and US 2017

Publications

  1. Bodner, T., Pietz, T., Bollmeier, L.J., Ritter, D.: Doppler: Understanding Serverless Query Execution. Proceedings of the SIGMOD Workshop on Big Data in Emergent Distributed Environments (2022)

  2. Bodner, T.: Elastic Query Processing on Function as a Service Platforms. Proceedings of the VLDB PhD Workshop (2020)

  3. Goel, A., Pound, J., Auch, N., Bumbulis, P., MacLean, S., Färber, F., Gropengiesser, F., Mathis, C., Bodner, T., Lehner, W.: Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLXP Workloads. Proceedings of the VLDB Endowment (2015)

  4. Alexandrov, A., Schiefer, B., Poelman, J., Ewen, S., Bodner, T., Markl, V.: Myriad: Parallel Data Generation on Shared-nothing Architectures. Proceedings of the PACT Workshop on Architectures and Systems for Big Data (2011)