Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
 

Immanuel Trummer

Affiliation: Cornell University
Title: CheaPT: Using Language Models without Breaking the Bank

 

Abstract

The past few years have been marked by several breakthroughs in generative AI, culminating in the rise of tools like ChatGPT that can solve a variety of language-related tasks without specialized training. However, this power comes at a cost, making it challenging to scale up LLM processing to large data sets. In this talk, I outline several techniques we have used successfully to make LLM inference more efficient. In particular, I discuss ThalamusDB, a system that performs approximate data processing, applying LLMs to carefully selected data subsets for maximal precision. I will also introduce our work on SpareLLM, which helps users select the cheapest model that satisfies user-specified quality constraints when applied to a large data set for a given task. Beyond our work on making LLM usage more efficient, I will also touch upon several new use cases for LLMs in data management.

Short CV

Immanuel Trummer is an associate professor of computer science at Cornell University. His research focuses on making data analysis more efficient and more user-friendly. In particular, he studies novel use cases for LLMs in the database area and ways to scale up processing via LLMs to large data sets. His papers were selected for "Best of VLDB", "Best of SIGMOD", and the CACM research highlight award. He received an NSF CAREER grant for his work combining LLMs with databases, as well as multiple Google Faculty Research Awards.