Our paper "Scale-Down Experiments on TPCx-HS" by Maximilian Böther and Tilmann Rabl was accepted at the Big Data in Emergent Distributed Environments (BiDEDE) Workshop at SIGMOD 2021!
The Transaction Processing Performance Council's (TPC) benchmarks are the standard for evaluating data processing performance and are extensively used in academia and industry.
Official TPC results are usually produced on high-end deployments, which makes it difficult to transfer them to commodity hardware. Recent performance improvements in low-power ARM CPUs have made low-end computers, such as the Raspberry Pi, a candidate platform for distributed, small-scale data processing.
In this paper, we conduct a feasibility study of executing scaled-down big data workloads on low-power ARM clusters. To this end, we run the TPCx-HS benchmark on two Raspberry Pi clusters. TPCx-HS is an ideal candidate for comparing hardware and understanding hardware characteristics for data processing workloads, because its results do not depend on specific software implementations and the benchmark has limited options for workload-specific tuning. Our evaluation shows that Pis exhibit similar behavior to large-scale big data systems in terms of price-performance and relative throughput to performance results. Current-generation Pi clusters are becoming a reasonable choice for GB-scale data processing due to the increasing amount of available memory, while older versions struggle to execute high-load scenarios stably.