Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Michael Noll

I would like to invite you to the guest talk "Kafka in Theory and Practice" given by Michael Noll, a Kafka expert at Confluent. As Confluent's product manager for stream processing, he has been leading the development of Kafka Streams since 2015 and has supported various industry-scale Kafka projects. Having received his PhD in 2010, he is also one of HPI's first alumni, so we are very happy to have him as a guest speaker. Please find the details of his talk below.

  • Location: G-3.E.15/16
  • Date: 24.04.2019
  • Time: 11:00 AM - 12:30 PM

Abstract

Modern businesses have data at their core, and this data is changing continuously. Stream processing is what allows you to harness this torrent of information in real time, and thousands of companies use Apache Kafka as the de facto event streaming platform to transform and reshape their industries. Whether you know it or not, many of your daily activities, such as shopping online, listening to music, booking a hotel, driving a car, making payments, staying in touch with friends on social networks, and reading a newspaper, are powered by Apache Kafka behind the scenes. Example companies include Apple, Netflix, Microsoft, PayPal, Audi, Uber, CERN, the New York Times, Airbnb, Etsy, Zalando, Pinterest, and the BBC.

In this talk, we will introduce Apache Kafka, a distributed, highly scalable, fault-tolerant platform for event streaming. We will discuss use cases as well as some of the internals of Kafka, such as its data and processing model, and how it achieves elasticity, scalability, and fault tolerance. More specifically, we will cover the core of Kafka, i.e., its storage and publish/subscribe layer, as well as its data integration component Kafka Connect, and the processing technologies Kafka Streams (for Java and Scala applications) and KSQL, the streaming SQL engine for Kafka.

Bio

Michael Noll is the product manager for stream processing at Confluent, the company founded by the creators of Apache Kafka. His work is focused on Kafka's Streams API and KSQL, the streaming SQL engine for Kafka. Previously, Michael was the technical lead of the Big Data platform of .COM/.NET DNS operator Verisign, where he grew the Hadoop- and Kafka-based infrastructure from zero to petabyte-sized production clusters spanning multiple data centers, one of the largest Big Data infrastructures operated from Europe at the time. He is a contributor and committer to open source projects such as Apache Kafka and Apache Storm, and writes a well-known blog about big data and distributed systems. On the academic side, Michael has a bi-national Ph.D. in computer science from the Hasso Plattner Institute at the University of Potsdam, Germany, and the University of Luxembourg. He has been a frequent speaker at international conferences such as Kafka Summit, Strata Data Conference, and ApacheCon.