Hasso-Plattner-Institut
Prof. Dr. Tilmann Rabl
  
 

Publications

We try to keep an up-to-date list of all our publications. If you are interested in a PDF that we have not uploaded yet, feel free to email us for a copy. Recent publications are listed below. For older ones, please click the appropriate year.

Publications of the years 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007

Efficient and Scalable k-Means on GPUs

Lutz, Clemens; Breß, Sebastian; Rabl, Tilmann; Zeuch, Steffen; Markl, Volker in Datenbank-Spektrum 2018.

k-Means is a versatile clustering algorithm widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data-to-knowledge time. These implementations commonly assign points on a GPU and update centroids on a CPU. We identify two main shortcomings of this approach. First, it requires expensive data exchange between processors when switching between the two processing steps, point assignment and centroid update. Second, even when processing both steps of k-means on the same processor, points still need to be read two times within an iteration, leading to inefficient use of memory bandwidth. In this paper, we present a novel approach for centroid update that allows us to efficiently process both phases of k-means on GPUs. We fuse point assignment and centroid update to execute one iteration with a single pass over the points. Our evaluation shows that our k-means approach scales to very large data sets. Overall, we achieve up to 20× higher throughput compared to the state-of-the-art approach.
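
The fusion idea described in the abstract can be illustrated with a minimal CPU sketch. This is not the authors' GPU implementation. It only shows how point assignment and centroid update can share a single pass over the points, so that each point is read once per iteration. The function name `fused_kmeans_iteration`, the choice of squared Euclidean distance, and the handling of empty clusters are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def fused_kmeans_iteration(
    points: Sequence[Point], centroids: Sequence[Point]
) -> Tuple[List[List[float]], List[int]]:
    """Run one k-means iteration with a single pass over the points.

    Hypothetical helper for illustration; assumes at least one centroid
    and that all points share the centroids' dimensionality.
    """
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]  # per-cluster coordinate sums
    counts = [0] * k                        # per-cluster point counts
    labels: List[int] = []

    for p in points:  # each point is read exactly once
        # Point assignment: index of the nearest centroid
        # (squared Euclidean distance, an assumption of this sketch).
        best = min(
            range(k),
            key=lambda c: sum((p[d] - centroids[c][d]) ** 2 for d in range(dim)),
        )
        labels.append(best)
        # Centroid update, fused into the same pass: accumulate right away
        # instead of re-reading the point in a second loop.
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]

    # Finalize: divide sums by counts. Empty clusters keep their old
    # centroid (an assumption; other policies are possible).
    new_centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else list(centroids[c])
        for c in range(k)
    ]
    return new_centroids, labels
```

Calling `fused_kmeans_iteration(points, centroids)` repeatedly, feeding the returned centroids back in, gives a full k-means loop. On a GPU, the accumulation into `sums` and `counts` would have to be done concurrently by many threads, which is the part the paper's centroid update approach addresses and which this sketch does not attempt to reproduce.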