Performance Analysis and Automatic Tuning of Hash Aggregation on GPUs

Rosenfeld, Viktor; Breß, Sebastian; Zeuch, Steffen; Rabl, Tilmann; Markl, Volker in International Workshop on Data Management on New Hardware (DaMoN), ACM, 2019.

Hash aggregation is an important data processing primitive which can be significantly accelerated by modern graphics processors (GPUs). Previous work derived heuristics for GPU-accelerated hash aggregation from the study of a particular GPU. In this paper, we examine the influence of different execution parameters on GPU-accelerated hash aggregation on four NVIDIA and two AMD GPUs based on six different microarchitectures. While we are able to replicate some of the previous results, our main finding is that optimal execution parameters are highly GPU-dependent. Most importantly, execution parameters optimized for a specific GPU are up to 21× slower on other GPUs. Given this hardware dependency, we present an algorithm to optimize execution parameters at run-time. On average, our algorithm converges on a result in less than 1% of the time required for a full evaluation of the search space. In this time, it finds execution parameters that are at most 1% slower than the optimum in 90% of our experiments. In the worst case, our algorithm finds execution parameters that are at most 1.29× slower than the optimum.
