More and more data-intensive enterprise applications are deployed in shared environments, such as the data centers of large enterprises or public cloud infrastructures. Software-as-a-Service (SaaS) offerings are one example: a service provider operates a cluster of servers to host an enterprise application for many customers. A large fraction of such applications impose a so-called “mixed workload” of transactional and analytical queries on the backend database systems. Because of their read-mostly characteristics, such workloads benefit notably from the performance characteristics of in-memory column databases, making them the database of choice for enterprise SaaS.
To reduce total cost of ownership, SaaS providers try to consolidate multiple customers into each database instance, a technique referred to as multi-tenancy. There is a challenging trade-off between low operational cost for the provider and performance as perceived by customers, especially given that enterprise applications are often required to operate within stringent performance bounds. Only so much consolidation can occur without significant impact on responsiveness. To manage this trade-off, the service provider must address two fundamental challenges: (i) workload modeling and (ii) data placement. The first entails estimating (shared) resource consumption in the presence of multi-tenancy on a single in-memory database server. The second is the assignment of tenants to servers in a way that minimizes the number of required servers (and thus cost) based on the assumed workload model. This step also entails replication of tenants for performance and high availability. This dissertation contributes novel solutions to both problems. The two challenges build on each other: a solution to the former is a prerequisite for the latter.
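To make the data placement challenge concrete, the following is a minimal sketch of a generic first-fit-decreasing heuristic that assigns tenant replicas to servers. It is not one of the algorithms proposed in this dissertation. The names `Server` and `place_tenants`, the normalized server capacity of 1.0, and the even split of a tenant's load across its replicas are assumptions made purely for illustration. The sketch also ignores the extra load that surviving replicas must absorb when a server fails.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server with a normalized load capacity."""
    capacity: float
    load: float = 0.0
    tenants: set = field(default_factory=set)


def place_tenants(tenant_loads, capacity=1.0, replicas=2):
    """Place each tenant's replicas on distinct servers (first-fit decreasing).

    Assumption: a tenant's load is split evenly across its replicas.
    """
    servers = []
    # Heaviest tenants first: the classic first-fit-decreasing ordering.
    ordered = sorted(tenant_loads.items(), key=lambda kv: kv[1], reverse=True)
    for tenant, load in ordered:
        share = load / replicas
        if share > capacity:
            raise ValueError(f"replica of {tenant!r} exceeds server capacity")
        placed = 0
        # Reuse existing servers that still have room and no replica of this tenant.
        for server in servers:
            if placed == replicas:
                break
            fits = server.load + share <= capacity + 1e-9
            if fits and tenant not in server.tenants:
                server.tenants.add(tenant)
                server.load += share
                placed += 1
        # Open new servers for any replicas that could not be placed.
        while placed < replicas:
            servers.append(Server(capacity, load=share, tenants={tenant}))
            placed += 1
    return servers


servers = place_tenants({"a": 1.2, "b": 1.0, "c": 0.6})
```

In this example, the replicas of tenant `c` fit next to those of tenant `a`, while tenant `b` needs two servers of its own. A heuristic of this kind treats capacity as a fixed number; how much load a server can actually sustain is precisely what the workload model has to supply.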
The estimation of the combined resource consumption of multiple tenants on a machine provides an indicator for the “fill level” of a server. The goal is to know in advance whether adding another tenant to a server will result in violating service level objectives. This dissertation proposes an empirical model for predicting how much load an in-memory column database can sustain before query response times in the 99th percentile exceed one second. The model also captures drops in server capacity incurred when tenants are migrated between servers. The accuracy of the model in predicting response times is greater than 90 % in all practically relevant cases. The ability to accurately predict response times makes it possible to run servers at a higher utilization level, thereby decreasing costs.
Given a handle on the “fill levels” of servers, this dissertation then addresses the problem of assigning tenants to servers. It introduces the Robust Tenant Placement and Migration Problem (RTP) and makes the case for incremental tenant placement, driven by diurnal variations in user load. Individual tenant replicas are migrated while the tenant remains online. This allows for frequent, incremental changes to the assignment of tenants to servers in the cluster, with the goal of running with the minimal number of servers at each point in time. Several novel algorithms for incremental tenant placement are designed and evaluated. Experiments on production log data from one of SAP's on-demand enterprise applications show that incremental placement decreases server cost on an average business day by a factor of ten (measured in Amazon EC2 server hours). More drastic savings can be realized during longer periods of low activity, such as weekends or holidays.
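The service level objective behind the capacity model (99th-percentile response time of at most one second) can be checked against measured response times with a few lines of code. The sketch below is illustrative only: the function names and the nearest-rank percentile definition are assumptions, not the measurement method used in this dissertation.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile of samples using the nearest-rank definition."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p percent of samples at or below it.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(response_times_s, p=99, bound_s=1.0):
    """True if the p-th percentile response time does not exceed the bound (seconds)."""
    return percentile_nearest_rank(response_times_s, p) <= bound_s


one_slow_query = [0.1] * 99 + [5.0]   # 1 of 100 queries is slow
two_slow_queries = [0.1] * 98 + [5.0] * 2  # 2 of 100 queries are slow
ok = meets_slo(one_slow_query)
violated = not meets_slo(two_slow_queries)
```

With 100 samples, a single slow query still satisfies the objective, whereas two slow queries push the 99th percentile above the bound.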
The trade-off between the degree of consolidation, operational cost, and the robustness of placements against abrupt increases in tenant load is studied in detail. Practical constraints, such as the time required to compute a new incremental placement and the amount of migration that can be performed in a limited time interval, are also considered. The experimental results can serve as practical advice for administrators faced with the problem of meeting fixed performance and availability SLOs while minimizing operational cost. In summary, this dissertation provides the first end-to-end solution for workload management and data placement in hosted database environments with multi-tenant architectures.