Setting the Standard: Benchmarking Price-Performance on the Cloud
With an increased focus on exploiting a wider variety of business applications on the cloud and a broader choice of available cloud providers, enterprises need to focus on moving applications to the right cloud—not just any cloud or multi-cloud. Such a decision is often driven by factors that include the underlying cloud infrastructure’s capabilities, metrics such as availability and API reliability on the cloud, and compliance conditions including geographic location and security standards.
While these are important, a key input to this decision is the application’s price and performance across different cloud providers and cloud instance types. The motivation for adopting clouds is often increased performance, scalability and cost savings, so an application’s price and performance on different clouds are the only true measure for evaluating the cause and effect of selecting the right cloud. Benchmarking clouds therefore cannot be a simple spreadsheet exercise. Any cloud benchmarking must include key price, performance and other application-centric metrics actually derived from the application being deployed and managed to determine the “RIGHT” cloud for a given application.
Every cloud is built, sized and priced very differently, which means that application price and performance vary greatly across clouds and across configurations within each cloud. Price-performance also varies by application type, architecture, behavior and usage characteristics. The fact is, despite the market noise, until recently, the ability to easily and simultaneously benchmark price and performance of applications across disparate cloud environments did not exist.
Cloud infrastructures today do not provide application-level SLAs. Capability, performance and price data are limited to infrastructure components such as VMs, storage, and networking. These do not translate directly to application price and performance.
Different clouds have very different underlying physical infrastructure components such as CPU, network backbone, and storage types, as well as different virtualization stacks. Moreover, clouds are themselves variable environments with significant variance in load over time. Differences in virtualization management, including variations in VM placement policies, may add further differences in performance, not just between clouds but also over time within the same cloud. In the absence of transparency around VM instances and policies, it is not possible to accurately determine the differences in application performance on different clouds without migrating an application and testing its performance on each cloud option.
Moreover, cloud instances are “packaged” and priced very differently as well. Given the above lack of transparency about cloud instances and physical backbone, an apples-to-apples comparison based on infrastructure alone is not possible. For example, a “small” instance type on one cloud is rarely the same as a “small” instance type on another cloud. Will the vCPUs on both provide the same performance? Or will an equivalently priced “medium” instance on yet another cloud provide a better overall price-performance trade-off? Or maybe it is network performance, not CPU, that matters for a particular application. Also, rolling up all the different cloud costs to estimate application costs is not straightforward, as cost, performance and instance definition and configuration are inextricably linked. Understanding these dependent variables is required to understand application performance. And because of the cloud’s utility-based pricing model, better application performance may mean that fewer infrastructure resources are needed and hence lower pay-per-use costs. It is this type of empirical benchmarking that is required to make informed decisions on where to deploy an application on the cloud.
Given all this, a plain infrastructure-to-infrastructure comparison is not an effective means to benchmark clouds for application price-performance. As an example, consider a multi-tier web application with a highly transactional database component and with high I/O requirements between the application server and the database tier. Additionally, the application tier may be elastically scalable. A useful performance metric for such an application may be the number of requests it can handle per second, while a useful cost metric would be the total cost of all tiers combined, including storage, compute and network costs. Moreover, one may want to test these metrics at different load settings to see how they change as the application scales. A cloud with a high I/O network backbone, an SSD instance type for the database tier and low VM spin-up times may provide better performance for such an application but at a high cost, while a different cloud with “standard” options but lower load might provide only slightly degraded performance at lower cost, for a better overall trade-off.
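The two metrics named above, requests per second and the total cost of all tiers, can be combined in a few lines. The following is a minimal sketch, not the method used in the study: the `TierCost` class, the function names, and every price and throughput figure are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class TierCost:
    """Hourly cost of one application tier in dollars (hypothetical figures)."""
    compute: float
    storage: float
    network: float

    def total(self) -> float:
        return self.compute + self.storage + self.network


def hourly_app_cost(tiers: Mapping[str, TierCost]) -> float:
    """Roll up compute, storage and network costs across all tiers."""
    return sum(tier.total() for tier in tiers.values())


def requests_per_dollar(requests_per_second: float, hourly_cost: float) -> float:
    """Requests served per dollar spent at a given sustained throughput."""
    return requests_per_second * 3600 / hourly_cost


# Assumed, illustrative prices for a three-tier deployment on one cloud.
tiers = {
    "web": TierCost(compute=0.10, storage=0.01, network=0.02),
    "app": TierCost(compute=0.20, storage=0.01, network=0.03),
    "db": TierCost(compute=0.40, storage=0.10, network=0.03),
}
cost = hourly_app_cost(tiers)

# Assumed measured throughput (requests/second) at three load settings.
for load, rps in {"light": 100, "medium": 250, "heavy": 400}.items():
    print(f"{load}: {requests_per_dollar(rps, cost):,.0f} requests per dollar")
```

For simplicity the sketch holds the hourly cost fixed across load settings. With an elastically scalable application tier, the cost would need to be recomputed for each load level as instances are added.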
As a different example, consider a highly compute-intensive gene-sequencing application where gene-sequencing jobs may be processed by an elastic cluster. A useful performance metric for such an application may be the time to complete a gene-sequencing job while a useful cost-metric would be the total pay-per-run job cost.
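To see why time to complete and pay-per-run cost have to be measured together, consider a rough sketch of the cost arithmetic. The `job_cost` function, the configuration names, and all prices and runtimes below are assumptions for illustration, not results from the study. The sketch also assumes billing proportional to runtime with no rounding to a billing increment.

```python
def job_cost(nodes: int, price_per_node_hour: float, runtime_hours: float) -> float:
    """Pay-per-run cost of one job: every node is billed for the full runtime."""
    return nodes * price_per_node_hour * runtime_hours


# Hypothetical configurations: (nodes, $ per node-hour, measured runtime in hours).
configs = {
    "4 x small": (4, 0.10, 10.0),
    "4 x large": (4, 0.35, 2.5),
}

for name, (nodes, price, hours) in configs.items():
    print(f"{name}: {hours} h, ${job_cost(nodes, price, hours):.2f} per run")
```

With these assumed figures, the larger instance costs 3.5 times as much per hour but finishes four times faster, so the run is cheaper ($3.50 versus $4.00). The runtime can only come from running the actual job on each configuration.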
Accordingly, here are four examples of real-world applications, each with a different architecture type and different infrastructure needs. While benchmarks can be run against any public or private cloud, for this study these applications were benchmarked across the following clouds, with different configurations in terms of instance types and cluster size on each:
- HP–HPCS standard.small and standard.2xlarge configuration.
- Amazon–AWS m1.medium and m3.2xlarge configuration.
- Google–GCE n1-standard-1 and n1-standard-8 configuration.
The findings of the benchmark study are described below for each application type. The charts on the left show application price on the x-axis and performance on the y-axis. The performance criterion can be throughput (number of requests per second) or the total time to complete a workload. The charts on the right show a price-performance index, a single normalized metric to see which cloud and configuration option provides the best “bang for your buck”.
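The article does not give the formula behind the price-performance index. One plausible way to build such a single normalized metric, shown here purely as an assumption, is to divide performance by price and scale the result so that the best option scores 1.0. The function name, option names and measurements below are all hypothetical.

```python
from typing import Dict, Tuple


def price_performance_index(results: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Map each option to performance-per-dollar, scaled so the best option is 1.0.

    `results` maps an option name to (performance, price), where higher
    performance is better. For time-to-complete workloads, pass 1 / time.
    """
    raw = {name: perf / price for name, (perf, price) in results.items()}
    best = max(raw.values())
    return {name: value / best for name, value in raw.items()}


# Hypothetical measurements: (requests per second, $ per hour).
results = {
    "cloud-x medium": (200.0, 0.50),
    "cloud-x xlarge": (500.0, 2.00),
    "cloud-y medium": (125.0, 0.25),
}

index = price_performance_index(results)
for name, score in sorted(index.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

In this made-up data set the fastest option ("cloud-x xlarge") has the lowest index, which is the kind of outcome that a comparison based on raw performance alone would hide.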
Chart #1: Benchmark for three-tier Java Web Application with each tier running on a separate VM.
Chart #2: Benchmark for compute-intensive application run in parallel on a cluster.
Chart #3: Benchmark results for Hadoop job running on four nodes.
Chart #4: Benchmark results for high performance cluster computing job.
To summarize, the benchmark results for the four applications led to the following recommended clouds, based on each application’s price-performance trade-off. Clearly, there is no single cloud instance that performs best for all types of applications.
| Application Type | Medium Configuration | Extra Large Configuration | Recommended Cloud |
|---|---|---|---|
| Java Web App | Cloud C | Cloud B | Cloud C Medium Config |
| Parallel Processing Job | Cloud C | Cloud B | Cloud C Medium with More Nodes |
| Hadoop App | Cloud A | Cloud A | Cloud A Extra Large |
| High Performance Cluster Computing Job | Cloud A/Cloud B | Cloud B | Cloud B Medium with More Nodes |
As may be clear from such examples, real-world complex enterprise applications need more than a simple spreadsheet-based, back-of-the-envelope cost estimate and an infrastructure-based performance analysis.
No wonder that many enterprises today find themselves having migrated to a cloud environment only to discover that spending and performance differ significantly from what was estimated.
Let’s get back to what matters: finding the right cloud. And yes, clouds do indeed matter. For many reasons, application price and performance vary greatly across cloud environments. What’s needed is an efficient way to find the right cloud for the application and to ensure complete portability, so that the application can continue to move to the right cloud with no additional migration effort, based on the latest performance and price changes across clouds.