Choosing the right instance size for each layer of an N-tier Java webapp in the cloud, any cloud, is a difficult task, because every instance type is a different combination of memory, number of CPUs, storage size, storage performance, network performance, and price. Manually benchmarking all relevant combinations takes a lot of labor and can get expensive. CliQr CloudCenter greatly simplifies this task, or any cloud benchmarking for that matter.
For this investigation, the goal is a set of guidelines for choosing Google Compute Engine instance sizes for an N-tier Java webapp. Not hard and fast rules, mind you, because every N-tier Java webapp is a little bit different. The goal here is to pare down the possible combinations to a manageable set so that you can use the guidelines discovered here as a starting point for testing your specific application.
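To give a sense of how quickly the search space grows, here is a minimal Python sketch that enumerates every way of assigning an instance type to each of the three tiers. It uses only the six instance types named in this article; the tier names and the function name are illustrative assumptions, and the real GCE catalog contained more types than these.

```python
from itertools import product

# Instance types named in this article; the full GCE catalog was larger.
INSTANCE_TYPES = [
    "n1-standard-1", "n1-standard-2", "n1-standard-4", "n1-standard-8",
    "n1-highmem-8", "n1-highcpu-8",
]
# Illustrative tier names for the three layers of the test application.
TIERS = ("load_balancer", "app_server", "database")


def all_configurations(types=INSTANCE_TYPES, tiers=TIERS):
    """Return every assignment of one instance type to each tier."""
    return [dict(zip(tiers, combo)) for combo in product(types, repeat=len(tiers))]


configs = all_configurations()
print(len(configs))  # 6 ** 3 = 216
```

Even with six types and three tiers there are 216 candidate configurations, which is why the rounds below test only a handful of deliberately chosen ones.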
N-tier Java Webapp Test Parameters
As in previous series of these tests, the popular Spring Framework sample application, PetClinic, was used with an HAProxy load balancer distributing requests between two Tomcat 6 application servers, which then communicated with a single MySQL 5.1 database server. To impart load on each tested installation, a JMeter script by Mark Nolan was retrieved from GitHub and adapted slightly. The changes take advantage of the host IP address substitution feature of the CliQr benchmarking facility, adjust the root path expected by the script (the original assumed “/petclinic” instead of simply “/”), and increase the load (the original launches 550 transactions, the modified version 5500). The resulting .jmx file can be found here.
Again, the goal of this test was to establish guidelines, not hard and fast rules. As with all benchmark tests launched using CliQr CloudCenter, a test node is created and loaded with JMeter in the same region as the application being tested, US Central 1a in this case.
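For readers who want to run a similar load test by hand, the following is a rough sketch of a non-GUI JMeter invocation built in Python. It is not how CliQr CloudCenter performs its host substitution, which is not described here. The file names and the IP address are placeholders, and the sketch assumes the test plan reads the target host from a JMeter property named `host` (for example via `${__P(host)}`).

```python
import shlex


def jmeter_command(plan_path, results_path, host):
    """Build a non-GUI JMeter invocation that passes the target host as a property."""
    return [
        "jmeter",
        "-n",                # run in non-GUI mode
        "-t", plan_path,     # test plan (.jmx) to execute
        "-l", results_path,  # file that receives the sample results
        f"-Jhost={host}",    # JMeter property; the plan must read it itself
    ]


# Placeholder file names and a documentation-range IP address.
cmd = jmeter_command("petclinic.jmx", "results.jtl", "203.0.113.10")
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to hand to `subprocess.run` without shell quoting issues.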
GCE Instance Types Considered
From the published list of available GCE instance types when the tests were performed in late February, 2014, tests started with the standard line of instance types:
GCE Instance Size Test Round 1: Shotgun Approach
The first round of testing sought to discover where the knee in the price-performance curve might be. Load balancers tend to not need much CPU or memory, but under the circumstances network performance might impact the results. Prior N-tier Java webapp experience tells us that the application server layer tends to be fairly even with regard to memory and CPU consumption while the database layer tends to put more of a strain on memory.
With those thoughts in mind, the following table shows the combinations executed in our Round 1:
| Load balancer | App servers | Database | Notes |
|---|---|---|---|
| N1-standard-1 | N1-standard-2 | N1-standard-1 | Slightly better app layer |
| N1-standard-1 | N1-standard-2 | N1-standard-2 | Match the slightly better app layer with a bigger DB |
| N1-standard-1 | N1-standard-4 | N1-standard-2 | Increase app layer a 2nd time |
| N1-standard-1 | N1-standard-4 | N1-standard-4 | Increase DB layer a 2nd time |
Among the powerful features of CliQr CloudCenter is that the profile for this multi-tiered PetClinic was constructed once and then reused across all 6 of these tests. Once the series of tests was started, their cumulative execution took around 45 minutes.
Running these tests manually would take several orders of magnitude longer than this and be subject to potential inconsistencies in the runs. By using CliQr CloudCenter, the consistency of each of the 6 tests executed is guaranteed by the underlying automation and reuse of the same application profile.
The Round 1 table above is color coded to correspond to the graphed price-performance results, where being on the upper left is better:
This first set of tests shows that not much performance is to be gained by adding VM size to the database layer. There is slight degradation between the red and orange tests and only minor improvement between the green and purple tests, where the database layer size was increased.
The app server layer, as shown by the delta between the blue and red tests and again between the red and green tests, shows improvement with additional VM size. Surprisingly, adding a larger load balancer also increased performance.
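One way to read a price-performance chart where "upper left is better" is to keep only the configurations that no other configuration beats on both price and performance. The sketch below does that in Python. The `Result` type, its field names, and the sample numbers are all assumptions made for illustration; the numbers are placeholders and are not measurements from these tests.

```python
from typing import NamedTuple


class Result(NamedTuple):
    label: str
    hourly_cost: float  # combined price of all VMs in the configuration
    throughput: float   # performance figure, e.g. requests per second


def pareto_front(results):
    """Keep results that no other result beats on both cost and throughput."""
    front = []
    for r in results:
        dominated = any(
            o.hourly_cost <= r.hourly_cost
            and o.throughput >= r.throughput
            and (o.hourly_cost < r.hourly_cost or o.throughput > r.throughput)
            for o in results
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r.hourly_cost)


# Placeholder numbers for illustration only, NOT data from this article.
sample = [
    Result("A", 0.40, 100.0),
    Result("B", 0.50, 100.0),  # costs more than A for the same throughput
    Result("C", 0.60, 150.0),
    Result("D", 0.90, 140.0),  # costs more than C and is slower
]
print([r.label for r in pareto_front(sample)])  # ['A', 'C']
```

Configurations that drop off this front, like a larger database VM that costs more without improving throughput, are the ones worth eliminating before the next round.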
GCE Instance Size Test Round 2: Seeking Precision
With data from the first set of tests in mind, it seems prudent to push the envelope on the size of the app server and load balancing layers. It also makes sense to zero in on the memory needs of the application layer by trying the highmem (N1-highmem-8, with 8 virtual CPUs and 52 GB of memory) and highcpu (N1-highcpu-8, with 8 virtual CPUs and 7.20 GB of memory) sizes.
| Load balancer | App servers | Database | Notes |
|---|---|---|---|
| N1-standard-4 | N1-standard-4 | N1-standard-4 | Best results from Round 1 repeated |
| N1-standard-4 | N1-standard-8 | N1-standard-1 | Increase app server |
| N1-standard-4 | N1-highmem-8 | N1-standard-1 | High mem app |
| N1-standard-4 | N1-highcpu-8 | N1-standard-1 | High cpu app |
Again, all that was required to run these additional tests using CliQr was to reuse the Application Profile from the first six, just with different sized machines for the different layers. The results:
In this set of tests, the results show slightly better performance for the n1-standard-4 for application server and load balancing layers with the n1-standard-1 database (orange) than in the first run of tests. Similarly, there was a slight drop in performance for the n1-standard-4 test for all layers (turquoise) compared to the first run of tests. All other combinations saw similar performance, indicating no gain in performance for more expensive virtual machines.
The one combination worth highlighting, though, is the n1-highcpu-8 application server test (dark blue). It performed similarly to 8 CPU tests with higher memory, suggesting that the load is more CPU bound than memory bound. For even higher loads than the 5500 transactions tested here, that may be a way to get better throughput at a lower cost than alternatives for the application server layer.
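To put the gap between the two 8-vCPU variants in numbers, this small Python sketch computes memory per virtual CPU from the figures quoted earlier in this article. The dictionary layout and function name are illustrative assumptions.

```python
# vCPU and memory figures as quoted in this article for the 8-vCPU variants.
SPECS = {
    "n1-highmem-8": {"vcpus": 8, "memory_gb": 52.0},
    "n1-highcpu-8": {"vcpus": 8, "memory_gb": 7.2},
}


def memory_per_vcpu(spec):
    """Return GB of memory available per virtual CPU."""
    return spec["memory_gb"] / spec["vcpus"]


for name, spec in SPECS.items():
    print(f"{name}: {memory_per_vcpu(spec):.2f} GB per vCPU")
```

The highmem type offers 6.5 GB per vCPU against 0.9 GB for highcpu, roughly seven times as much, yet the two performed similarly here, which is consistent with the workload being CPU bound.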
The data collected here provides some useful guidance for multi-tiered Java applications on GCE, including:
- Larger VMs at the load balancing layer can make a measurable performance difference for the application as a whole.
- Application servers perform better on larger VMs, and the lower-memory, higher-CPU options available with the highcpu family are worth exploring.
- Money can be saved on the database tier, which performed just as well with smaller VM sizes as with larger ones.
As always, it is best to run your exact application on a variety of configurations to get sizing that suits your specific needs. Every application workload is different. As this exercise has shown, though, CliQr CloudCenter makes it quick and easy to test different combinations so that you can get the data you need to make the