Weekly Seminar

The Research School "Data Science and Engineering" meets once a week in person for presentations, workshops and discussions.

Sessions will be held on Wednesdays at 3:15 p.m. from October 16th, 2024 through February 5th, 2025.

Weekly Seminar | Winter Semester 2024/25

23.10.2024 | How To: Research Poster (Ferdous Nasri)

Ferdous Nasri

Title: How To: Research Poster

Abstract: Presenting research effectively is a vital skill in science and academia, and conference posters are a great medium for sharing your work with a broad audience. In this interactive talk, we will explore the do’s and don’ts of designing and presenting research posters. We’ll cover best practices for layout, content, and visuals, as well as common pitfalls to avoid. Since standards vary across different fields, we encourage participants to share their experiences and insights.

30.10.2024 | Writing a Research Plan and Setting Goals (Prof. Dr. Bert Arnrich)

Moved to Research School Talk

06.11.2024 | The Impact of Data Quality on ML (Felix Neutatz)

Felix Neutatz

Title: The Impact of Data Quality on ML

Abstract: Data cleaning is widely acknowledged as an important yet tedious task when dealing with large amounts of data. Thus, there is always a cost-benefit trade-off to consider. In particular, it is important to assess this trade-off when not every data point and data error is equally important for a task. This is often the case when statistical analysis or machine learning (ML) models derive knowledge about data. If we only care about maximizing the utility score of the applications, such as accuracy or F1 scores, many tasks can afford some degree of data quality problems. Recent studies analyzed the impact of various data error types on vanilla ML tasks, showing that missing values and outliers significantly impact the outcome of such models. In this paper, we expand the setting to one where data cleaning is not considered in isolation but as an equal parameter among many other hyper-parameters that influence feature selection, regularization, and model selection. In particular, we use state-of-the-art AutoML frameworks to automatically learn the parameters that benefit a particular ML binary classification task. In our study, we see that specific cleaning routines still play a significant role but can also be entirely avoided if the choice of a specific model or the filtering of specific features diminishes the overall impact.

13.11.2024 | Workshop on Design Research vs Computer Science Research (Min Deng)

Min Deng (HCI Lab)

Title: Workshop on Design Research vs Computer Science Research

Abstract: During a discussion at our lab, we discovered that engineers, theoretical computer scientists, and designers approach their academic work in extremely different ways. For example, theoretical computer science papers require rigorous proofs of correctness, while design and engineering papers need to show real-world impact. In this session, we will look into how different research cultures approach their work, in the hope of learning something from other communities. During the session, we will look at the _structure_ and _contribution_ of papers from different communities (Computer Science, HCI, (actual) architecture) in small groups, present our findings, and discuss what we can learn from the differences.

20.11.2024 | Designing Experiments (Professor Dr. Bernhard Renard)

Moved to Research School Talk

27.11.2024 | Top Publication Venues in Our Field

Talk by distinguished professors or their representatives from various departments at HPI

Title: Top Publication Venues in Our Field

Abstract: Distinguished professors or their representatives from various departments at HPI will deliver five-minute presentations, offering valuable insights into the leading publication venues within their respective fields. Their presentations will cover important aspects of these publications, including core research focus, types of accepted submissions, peer-review procedures, and additional information about their location, size, programming structure, and historical background.

04.12.2024 | Overview of New Research Group Algorithmic Decision Making and Society (Prof. Dr. Niclas Boehmer)

Prof. Dr. Niclas Boehmer

Title: Overview of New Research Group Algorithmic Decision Making and Society

Abstract:

This talk will offer a mostly non-technical overview of the work of the Algorithmic Decision Making and Society group. I will discuss some of my previous work, as well as our current projects and future plans. Specifically, I will share some insights on:

the design and usage of distribution algorithms for allocating scarce support resources,
bridging theoretical insights from voting theory with practical applications to develop better participation algorithms, and
enhancing the transparency of decision making systems by analyzing the robustness of decisions and providing counterfactual explanations.

11.12.2024 | Tools and Tricks (Martin Boissier)

Martin Boissier

Title: Tools and Tricks

Abstract: In this cluster talk on Wednesday 2024-12-11, we want to collaboratively collect tools, tricks, and hacks that make our lives as researchers easier (or at least more endurable). We plan to collect one-sliders from every cluster member. Each member will then get one minute to present her/his slide. In the last iteration, members covered topics such as LaTeX hacks, ways to improve fonts in Matplotlib, and how to avoid HPI’s VPN client. We welcome a very broad set of tools and tricks.

18.12.2024 | No Cluster Talk | Moved to Lectures for PhD Students

08.01.2025 | Diffusion Models for Inverse Problems: From Complex Systems to Single Cells (Prof. Dr. Stephan Mandt)

Prof. Dr. Stephan Mandt

Title: Diffusion Models for Inverse Problems: From Complex Systems to Single Cells

Abstract: Inverse problems, which involve reconstructing underlying phenomena from incomplete or noisy observations, are central to many scientific disciplines. In this talk, I will explore how diffusion models provide a principled and flexible framework for addressing these challenges across a range of domains, from complex systems like weather modeling to image restoration tasks. I will outline the construction of diffusion models and introduce efficient sampling strategies, emphasizing the role of conjugate integrators. Specific applications include video super-resolution for climate science, where diffusion models enable high-resolution, temporally coherent predictions from coarse guidance signals. These approaches hold significant promise for advancing applications in biology, particularly in areas requiring the precise reconstruction, integration, and generation of multimodal data.

15.01.2025 | Making data more accessible with LLMs (Gerardo Vitagliano)

Gerardo Vitagliano

Title: Making data more accessible with LLMs

Abstract: LLMs and other transformer-based applications have proven effective at embedding the semantics of unstructured data objects, as shown by their success in natural language understanding, image captioning, and information extraction. However, employing these models for unstructured data management still poses open challenges: the non-deterministic nature of generative architectures, the cost and scalability of large models, and the unclear programming paradigm for building complex pipelines. This talk will introduce some of our work to overcome these challenges.

First, we will discuss Palimpzest, a declarative system to build multi-step pipelines that can process large datasets with LLMs. Palimpzest allows users to specify logical pipelines, and automatically finds physical implementations of these pipelines that optimize their cost and runtime.
Then, we will introduce the Caravaggio system, a semantic question answering system that can process multimodal datasets. Caravaggio overcomes some of the limitations of state-of-the-art RAG solutions by dynamically featurizing unstructured data at several abstraction levels.
We will conclude with an outlook into the open challenges and opportunities to make data (management) more accessible in the era of transformer-based models.

22.01.2025 | Artificial Intelligence and Planetary Boundaries - Using Machine Learning to Tackle Climate and Sustainability Challenges (Dr. Marcus Voß)

Dr. Marcus Voß

Title: Artificial Intelligence and Planetary Boundaries - Using Machine Learning to Tackle Climate and Sustainability Challenges

Abstract: Artificial intelligence (AI) and machine learning (ML) are powerful tools for advancing environmental sustainability within planetary boundaries. This talk will explore the potential of AI to drive effective climate action, with examples from industry applications (e.g. Deutsche Bahn, Berlin Recycling) and academia. Attendees will gain insight into current trends, recurring challenges, and the societal and environmental considerations of deploying AI systems. Through examples and practical resources, this session aims to inspire actionable ideas for researchers and practitioners working at the intersection of AI and sustainability.

29.01.2025 | Giving Good Research Talks: Slide Structure, Content, Preparation, Style, Q&A (Prof. Dr. Ariel Stern)

Moved to Research School Talk

05.02.2025 | General Discussion and Feedback Session for this semester