Hasso-Plattner-Institut
  
Prof. Dr. h.c. Hasso Plattner
  
 

HANA Load Simulator

Motivation

The HANA Load Simulator creates a realistic enterprise workload of thousands of concurrent users and executes that workload on different database configurations simultaneously. A dashboard monitors several performance indicators of each database, including data footprint, transaction latencies, throughput, and overall CPU utilization. The dashboard can also be used to configure workload parameters such as the OLTP and OLAP query frequencies or the ratio of actual to historical queries. This provides a simple, interactive tool for assessing key performance characteristics of different database setups (e.g., single- vs. multi-node) side by side and in real time.

Experiment

We compare a) a single database node with b) a multi-node setup consisting of a master node (actual data only), one replica node of the master for running OLAP transactions, and a cold node for historical data. Both setups have the same total number of cores and the same amount of main memory. The use of the replica node and of aggregate caches can be switched on and off. The workload consists of three types of transactions (ratio configurable): invoice postings (sFIN-adapted), read-only transactions with OLTP queries (including BKPF-BSEG joins), and OLAP transactions with read-heavy analytical queries.
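The workload mix and the node roles described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulator's actual code: the type names, the `WorkloadConfig` class, the example weights, and in particular the routing policy in `route` are all hypothetical and chosen only to mirror the setup in the text.

```python
import random
from collections import Counter
from dataclasses import dataclass

# Hypothetical labels for the three transaction types named in the text.
POSTING, READ_ONLY, OLAP = "invoice_posting", "read_only_oltp", "olap"


@dataclass
class WorkloadConfig:
    mix: dict                 # transaction type -> relative weight (configurable ratio)
    use_replica: bool = True  # the replica node can be switched on and off


def sample_transaction(cfg: WorkloadConfig, rng: random.Random) -> str:
    """Draw one transaction type according to the configured ratio."""
    types = list(cfg.mix)
    weights = [cfg.mix[t] for t in types]
    return rng.choices(types, weights=weights, k=1)[0]


def route(tx_type: str, touches_historical: bool, cfg: WorkloadConfig) -> list:
    """Return the nodes a transaction would touch (assumed policy, not from the source)."""
    if tx_type == POSTING:
        return ["master"]  # writes always go to the master (actual data)
    # OLAP reads go to the replica when it is enabled, everything else to the master.
    nodes = ["replica"] if (tx_type == OLAP and cfg.use_replica) else ["master"]
    if touches_historical:
        nodes.append("cold")  # historical data lives on the cold node
    return nodes


# Illustrative weights only; the source does not state default ratios.
cfg = WorkloadConfig(mix={POSTING: 50, READ_ONLY: 49, OLAP: 1})
rng = random.Random(42)
counts = Counter(sample_transaction(cfg, rng) for _ in range(10_000))
print(counts)
print(route(OLAP, touches_historical=True, cfg=cfg))
```

Keeping the sampling separate from the routing makes it easy to replay the same sampled workload against both setups, which is the kind of side-by-side comparison the experiment relies on.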

With the partitioning into actual and historical data and the replication of the actual data, we see the following improvements (workload: 90% actual-only OLAP transactions, 100% actual-only OLTP transactions, and one in 100 queries being analytical):

Improved performance

  • Transactional processing improves even without a replica because the actual data set is smaller. With the replica activated, the multi-node setup is faster by a factor of ~4 for mixed workloads.

  • The more the workload is skewed towards actual-only queries, the more the new architecture outperforms the traditional setup.

  • When adding analytical users to the system, a replica of the actual master node significantly lowers the latency of OLTP transactions due to better load distribution.

Reduced costs

  • Historical data can be purged and compressed more effectively, which reduces the memory footprint and thus the amount of main memory required.

  • Overall system costs decrease because smaller servers can be deployed, avoiding the disproportionately high prices of large server systems.

Live on stage at SAPPHIRE NOW 2015 in Orlando

Thousands of SAP customers in the fully occupied Orange County Convention Center in Orlando, FL, and viewers of the live stream saw the impact of actual/historical data optimization for SAP HANA on database performance and system load.

Martin Boissier and Carsten Meyer had the chance to present the master project "HANA Load Simulator" (Daniel Kurzynski, Rui Ruhrlaender, Christopher Schmidt, Jannik Marten, Jan-Peer Rudolph, Alexander Franke, Jasper Schulz, and Pedro Flemming) live on stage during Prof. Hasso Plattner's keynote speech.

 

 

Desirable: Seeing and comparing the impact of fundamental system changes helps to understand the meaning of those changes and the true value behind them. Changing simulation parameters and getting direct feedback allows users to explore system behavior and to weigh the various options.

Feasible: HANA already features read-only replication, A/H data partitioning, and aggregate caching (in early versions). The workload and data set can be generated, adapted to closely resemble a production load, and deployed to different hardware setups.

Viable: A technical description is less convincing than a running system. If people see the positive effects of a system setup, they are more willing to test it. If people can evaluate the impact of changes in their own environment, they are more willing to buy it.

Vision

The HANA Load Simulator shall visualize the impact of A/H partitioning, read-only replication, and aggregate caching on a customer's production system. Future versions will also show the possibility of increased reliability (high availability) using hot-standby replicas. With access to a customer’s production data and a corresponding workload trace, the simulator can mimic the real production system in order to show the feasibility and benefits of the mentioned concepts on different hardware setups.