Over the years, I have been asked countless times to offer performance tuning advice to Java programmers. Given the application they are developing, folks want to know how to ensure it meets its performance needs, rather than how to do Java performance tuning after the fact. In this article, I offer my advice on how to successfully meet an application’s performance expectations.
It is often the case that when a Java application is developed, or an existing one is enhanced with a new feature, it fails to meet the performance expectations of those who use the application or have a stake in its performance. There are cases where post-development activities such as extensive JVM tuning or application performance tuning efforts are able to meet those needs. However, these fire drills can be mitigated through proactive activities before, or even during, the implementation phase of the application or the enhancement.
In the following sections, you will learn how to avoid those last-minute performance tuning fire drills.
Importance of Performance Requirements
For every Java application, or for any enhancement to an existing Java application, there are always some up-front requirements that need to be defined and met. Most of the time these requirements are specific to the functional operation of the application or the enhancement. A good example of a functional requirement is a description of the capabilities of a newly introduced feature. Often there is no mention of performance requirements, or the performance goals are incomplete or ambiguous. Moreover, the application’s performance measurement metrics, the description of how those metrics are measured, and even the performance qualification and performance testing procedures are rarely discussed or documented. Any performance engineer will say performance requirements are very important to capture in the requirements phase of development. And the more detailed the requirements, the better.
The next several sections present questions that performance engineers commonly ask regarding the desired application performance, thus leading to better requirements and an improved chance of meeting those performance requirements.
Key Performance Goals
When capturing performance requirements, there are times when the metrics (response time, throughput, footprint) are already summarized. So, with that as a starting point, ask further questions. This section describes those questions and areas that can form better performance requirements.
First, performance of an application should be expressed in the form of a requirement for each of following performance attributes:
- Throughput performance (how quickly can the application do some well defined unit of work?)
- Latency performance (how long does it take from the time of an input stimulus until a response is received?)
- Memory footprint (how much memory does the application need?)
At a very minimum, answers to these questions should be known prior to transitioning into the implementation phase.
For a throughput performance requirement, you should expect to capture the essence of the requirements; something along the lines of “the application shall perform ‘X’ number of operations or transactions per some unit of time.” An example requirement of this form is “the application shall perform 120 transactions per second.” This is not necessarily a complete throughput requirement, but it is a good starting place.
Similar to the throughput performance requirement, you should first try to capture the essence for a latency performance requirement. It could be along the lines of “the application shall respond to some type of external stimulus, or some kind of input, and return a response within a specified unit of time.” An example of a latency performance requirement is “the application shall produce a response to an incoming request within 50 milliseconds.” As was the case with the example throughput requirement, this is not necessarily a complete latency performance requirement.
Likewise, a memory footprint requirement is one that communicates the amount of memory the application is allowed to use. An example of a memory footprint, or memory usage, requirement is “the application shall not use more than 10 GB of Java heap.” Again, for Java, this requirement leaves quite a lot of room for fine-tuning the memory usage.
Clarifying Throughput Performance
Once you have a throughput performance goal for the application or feature under development, there are additional questions to ask. These questions are targeted toward fine tuning the throughput requirement and will help improve the chances of the application meeting or exceeding its performance expectations. Some additional questions to consider asking include:
- Should the performance goal be considered the peak performance goal? Or is the performance goal a throughput goal the application shall maintain at all times?
- What is the maximum load the application is expected to take on? For example, what is the expected number of concurrent or active users, or concurrent or active transactions?
- If load taken on by the application exceeds the expected load, can the throughput fall below the performance goal?
- If it can fall below the performance goal, how long can it fall below the performance goal? Or how long is the application expected to meet performance goals at peak, or at load levels higher than expected levels?
- In terms of CPU utilization, is there an expected amount of CPU, or a limit on the amount of CPU that can be used by the application at various load levels?
- If there is a limit of CPU consumption, can that amount of CPU be exceeded, and for how long is it acceptable to exceed that amount?
- How will throughput of the application be measured? And where will the computation of throughput be done?
The last question is a very important one. Getting clarity on how and where throughput will be measured can be very crucial to meeting the throughput performance goal. There may be differences amongst those who have a stake in the performance as to how and where throughput is measured. There might also be differences in opinions on the other questions listed here, too.
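To make the “how and where” question concrete, here is a minimal sketch of client-side throughput measurement. The class name, the `measure` helper, and the placeholder `doTransaction()` are all hypothetical, not from the article; in a real test, `doTransaction()` would issue an actual request to the application.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: measuring throughput on the client side over a
// fixed wall-clock window, rather than trusting a server-reported number.
public class ThroughputProbe {

    // Placeholder unit of work; a real probe would issue a request here.
    static void doTransaction() {
        Math.sqrt(42.0);
    }

    // Run transactions for windowMillis and report completed transactions/sec.
    static double measure(long windowMillis) {
        LongAdder completed = new LongAdder();
        long deadline = System.nanoTime() + windowMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            doTransaction();
            completed.increment();
        }
        return completed.sum() / (windowMillis / 1000.0);
    }

    public static void main(String[] args) {
        System.out.printf("Throughput: %.1f transactions/sec%n", measure(1_000));
    }
}
```

Note that counting completions at the client captures queuing and network effects that a server-side counter would miss, which is exactly why agreeing on the measurement point matters.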
Clarifying Latency or Response Time Performance
Similar to the throughput performance goal, latency or response time performance goals should be documented and well understood. The first step is to define a response time goal or requirement as described earlier. A goal that simply captures an expected response time for requests is a good starting place. Once that initial performance goal is established, additional probing questions can be asked to further clarify what is expected in terms of response time and latency. Additional questions include:
- Is the response time goal a worst-case response time goal that should never be exceeded?
- Is the response time goal an average response time goal? Is it a percentile such as a 90th percentile, 95th percentile or 99th percentile response time?
- Can the response time goal ever be exceeded?
- If so, by how much can it be exceeded?
- And for how long can it be exceeded?
- How is response time going to be measured?
- Where will response time be measured?
The last two are very important questions and should be explored in detail. For example, if an external load driver program is involved, it may have built-in facilities to measure response time latency. Should you decide to use those built-in facilities, and you have access to the source code, take a look at how the response time is computed and reported. Be wary of response times reported as averages with standard deviations: response times are not normally distributed, so statistical methods that assume normally distributed data will lead to improper conclusions.
Ideally you should collect the response time data for each and every individual request and response. Then, plot the data and order it in a way that you can see percentiles of the response times including worst-case response time.
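The per-request approach above can be sketched in a few lines. This is an illustrative nearest-rank percentile calculation over recorded latencies; the class and method names are made up for this example, and the sample data is fabricated purely for demonstration.

```java
import java.util.Arrays;

// Hypothetical sketch: deriving percentile and worst-case response times
// from individually recorded request latencies, instead of averages.
public class LatencyPercentiles {

    // Nearest-rank percentile over recorded latencies (milliseconds).
    static long percentile(long[] latenciesMillis, double pct) {
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // One recorded latency per request/response pair (illustrative data).
        long[] samples = {12, 9, 48, 15, 11, 10, 250, 13, 14, 12};
        System.out.println("90th percentile: " + percentile(samples, 90.0) + " ms");
        System.out.println("worst case:      " + percentile(samples, 100.0) + " ms");
    }
}
```

Plotting the sorted data makes outliers like the 250 ms sample visible, whereas an average would hide it entirely.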
If response times are measured internally in the server application, you should be immediately suspicious of using those server-reported metrics to represent response times as observed by clients, rather than system-wide or client-side metrics. Let’s delve deeper. Suppose you are interacting with the server application and issue a request, but before the request is fully read by the server, a garbage collection event occurs that takes two seconds. Because your request has not been fully read by the application, the incoming request timestamp has not yet been recorded. As a result, the request you issued just experienced a two-second delay that will not be reported in the response time latency. Hence, when response time latency is measured within a server, you should not use that data to represent the response time latency seen by a client application interacting with the server. There can also be queuing between the client and server that is not captured in the server’s response time computation. The response time measured within a server really measures the latency from the arrival timestamp (taken after the incoming request has been read) through to the response timestamp (typically taken after the transaction completes and a response to the request is written).
Although it was not mentioned earlier when discussing throughput, much of what is said in this section with respect to how response time latency should be measured is applicable to measuring throughput as well.
Clarifying Memory Footprint or Memory Usage
Similar to the fine tuning of the throughput and latency requirements, memory footprint requirements, or the amount of memory the application can use, should also be documented and well understood. As in the cases of throughput and latency, the first step is to define a memory footprint goal. In other words, how much memory is expected to be used or consumed? A goal that simply captures an expected Java heap usage is a good starting place. Once that initial goal is established, you can ask additional probing questions to further clarify what is expected. These additional questions could include:
- Does the expected amount of memory to be used include only the Java heap? Or does that amount also include native memory used by the application or the JVM?
- Is the expected memory consumption a limit that can never be exceeded?
- If expected memory consumption can be exceeded, then by how much can it be exceeded?
- And for how long can it be exceeded?
- How is memory consumption going to be measured? Will the metric include the resident memory size of the JVM process as reported by the operating system? Will it also include the amount of live data in the Java heap?
- When will memory consumption be measured? Will it be measured when the application is idle? When the application is running at steady state? When it is under peak load?
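To illustrate why the “how will it be measured” question matters, here is a hypothetical helper (not from the article) that samples memory usage from inside the JVM via the standard `java.lang.management` API. Note that these numbers cover heap and JVM non-heap areas only; they do not capture all native memory, nor the process resident set size the operating system would report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical sketch: sampling Java heap and non-heap usage in-process.
// An OS-level view (e.g., resident set size) would give different numbers,
// which is why the measurement method must be agreed upon up front.
public class MemorySnapshot {

    static long heapUsedBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        return bean.getHeapMemoryUsage().getUsed();
    }

    static long nonHeapUsedBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        return bean.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.printf("heap used:     %,d bytes%n", heapUsedBytes());
        System.out.printf("non-heap used: %,d bytes%n", nonHeapUsedBytes());
    }
}
```

Also keep in mind that heap “used” includes garbage not yet collected; the amount of live data is only apparent immediately after a full collection, which ties back to the question of when measurement occurs.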
Asking these kinds of questions will proactively head off potential misunderstandings among the various folks who have a stake in the application.
When developing a new application or enhancing an existing one, the chances of meeting its performance goals can greatly improve by investing some additional time to refine the requirements for throughput, response time latency, and memory footprint. By pulling in the folks who have a stake in the application or the enhancement, and having discussions that probe deeper into its performance goals, you will better communicate to everyone involved the performance requirements, how performance will be measured, and how performance will be tested. In short, the more detailed the requirements for each of the three performance attributes (throughput, latency, and memory footprint), the better the clarity of the performance requirements document.
Also invest in developing a performance test plan at the same time the answers to the probing questions on throughput, latency, and footprint are discussed. Then share the test plan with the folks who have a stake in the application. Include in the plan how the tests will be executed and how each of the performance metrics will be measured. You will often find disparities in how folks interpret a performance requirement, how performance will be measured, and how the performance test will be executed. Getting clarification on these points at requirements definition time will greatly increase the chances of everyone being happy when development is complete and the application is deployed.