Analyzing Test Results
After you finish running a test, Loadster generates a test report.
The key performance indicators at the top of every Loadster report are:
- Duration - The entire duration of the test from start to finish.
- Max V-Users - The total number of virtual users allocated across all virtual user groups.
- Download - The total bytes downloaded (HTTP response headers + bodies).
- Upload - The total bytes uploaded (HTTP request headers + bodies).
- Total Pages - The total number of top-level “pages” successfully requested.
- Total Hits - The total number of top-level “pages” plus any included page resources successfully requested.
- Total Errors - The total number of HTTP and validation errors.
- Total Iterations - The total number of script iterations completed across all virtual user groups.
More information about each of these can be found in the report sections that follow.
The Test Report
Many of the sections in the report correspond to the ones you saw in the dashboard when you were running your test.
The summary section is reserved for you to edit yourself. We suggest writing a paragraph or two about the assumptions that went into the test (user behavior, traffic patterns, hypotheses, etc.) as well as a high-level summary of whether or not the site performed acceptably.
The overview section shows the test configuration, including details about each of the virtual user groups in the test. This is important if you need to compare tests and need to ascertain whether there were differences in the configuration that may have contributed to different outcomes.
This section shows the high-level cumulative totals measured by the test.
This section shows peak throughput, in terms of pages, hits, and bytes.
This tabular view of response times shows the maximum, average, and minimum response time for each URL. It also includes a total response time, which is the total time spent waiting for a response across all virtual users in the entire load test. The URLs with the highest total time may be the best optimization candidates, since they are at the intersection of slowness and frequency.
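To see why total time is a useful ranking, here is a minimal Python sketch. The URLs and numbers are invented for illustration and are not taken from a Loadster report, and the total is approximated as the request count multiplied by the average response time.

```python
# Hypothetical per-URL stats: (url, request count, average response time in seconds).
# These values are made up for illustration; they are not Loadster output.
stats = [
    ("/", 12000, 0.15),
    ("/search", 4000, 0.90),
    ("/checkout", 300, 2.50),
]

def rank_by_total_time(rows):
    """Return (url, total_seconds) pairs, largest total first."""
    totals = [(url, count * avg) for url, count, avg in rows]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)

ranked = rank_by_total_time(stats)
for url, total in ranked:
    print(f"{url}: {total:.0f} s total")
```

In this made-up data, /checkout has the worst average but /search accounts for the most total waiting time, because it is both fairly slow and frequently requested.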
Charts & Tables
Average Response Times by Page
Average Response Times by Page shows which of your pages or URLs are slowest. In all but the simplest sites, a few pages tend to account for the bulk of the slowness, and those slow pages are often your best optimization candidates. The term “page” is used loosely here and can also refer to an endpoint or anything else represented by a URL.
Network Throughput shows the rate of data transfer, in bytes and bits per second, throughout your test. Although Loadster deals with HTTP (an application layer protocol), this should be a close approximation of actual throughput at the transport layer as well.
Cumulative Network Throughput
Cumulative Network Throughput is the total number of bytes uploaded (requests) and downloaded (responses) in your test. Since the number reported is cumulative, it will climb throughout the test, especially during the peak load phase.
Transaction Throughput is the rate of pages and hits per second. For the purposes of this chart, a “page” is any top-level HTTP step in any of your scripts, while a “hit” is any request against a top-level HTTP step or one of its included resources.
The Transactions chart shows a cumulative count of the pages, hits, iterations, and errors in the test.
Errors by Type
The Errors by Type chart shows how many errors have occurred of each type. This may include HTTP errors (any response with an HTTP 4xx or 5xx status), or validation errors (which are thrown when a step fails one of your validation rules).
Errors by Page
The Errors by Page chart shows the URLs on which errors occurred. It is useful for pinpointing which of your pages or endpoints are having trouble, and by inference, which of the steps in your script you may need to revisit. The term “page” is used loosely here and can also refer to an endpoint or anything else represented by a URL.
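Conceptually, the two error charts are two groupings of the same error records: one by type and one by URL. The Python sketch below illustrates the idea. The record layout (a `url` field, and a `status` field that is `None` for validation errors) is an assumption made for this example and not Loadster's actual data format.

```python
from collections import Counter

# Hypothetical error records. "status" holds the HTTP status code, or None
# when the response arrived but failed a validation rule (assumed layout).
errors = [
    {"url": "/login", "status": 500},
    {"url": "/login", "status": 503},
    {"url": "/checkout", "status": 503},
    {"url": "/cart", "status": 404},
    {"url": "/cart", "status": None},
]

def error_type(record):
    """Label a record as an HTTP error (4xx/5xx) or a validation error."""
    status = record["status"]
    if status is None:
        return "Validation error"
    if 400 <= status <= 599:
        return f"HTTP {status}"
    return "Other"

by_type = Counter(error_type(e) for e in errors)  # errors grouped by type
by_url = Counter(e["url"] for e in errors)        # errors grouped by page

print(by_type.most_common())
print(by_url.most_common())
```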
The Error Breakdown table provides more detail on errors. It includes a longer error message as well as the exact script and virtual user that experienced the error.
The Virtual Users chart shows, for each virtual user group, how many virtual users have been running at any point during the test. The ramp-up and ramp-down phases should resemble what you configured in your scenario. Virtual users may take a bit longer than planned to exit during the ramp-down phase, because they must complete the current iteration of their script before exiting.
Load Engine CPU Utilization
Load Engine CPU Utilization shows how busy the CPU(s) are on each load engine or cluster. If the CPU remains 100% utilized for a significant amount of time, it can result in inaccurate response time measurements! If this happens, it may be a good idea to split the virtual user group into multiple smaller groups on different engines or clusters.
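If you export or copy the CPU samples, a quick check for sustained saturation might look like the Python sketch below. The sample values and the 95% threshold are assumptions chosen for illustration; pick a threshold and run length that match your own sampling interval.

```python
def longest_saturated_run(samples, threshold=95.0):
    """Return the length of the longest run of consecutive samples >= threshold."""
    longest = current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # run broken, start counting again
    return longest

# Hypothetical CPU utilization samples (percent), one per sampling interval.
cpu = [40, 72, 96, 100, 100, 99, 81, 100, 60]
run = longest_saturated_run(cpu)
print(f"Longest saturated run: {run} samples")
```

A brief spike is usually harmless; a long run of saturated samples is the case where response time measurements become suspect.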
Load Engine Thread Count
Load Engine Thread Count is another measurement of how busy the load engine or cluster is. The thread count is directly correlated with how many virtual users the engine is running. Engines will always have at least one thread per virtual user, and more if the script calls for additional page resources to be downloaded in parallel with the primary request.
Load Engine Memory Utilization
Load Engine Memory Utilization shows how well the engine is managing its memory. This is rarely a problem, but things to look out for include very high memory usage (close to 100%) and extremely frequent garbage collection (lots of big spikes and drop-offs in the chart).
Load Engine Latency
Load Engine Latency is essentially the ping time between your Loadster and the load engine or cloud cluster. Make sure you have a fast network connection for load testing.
Sharing the Report
To share the report with your team, you can simply invite them to join your Loadster team from your Team Settings page. Anyone on your Loadster team can access test reports.
Did it pass or fail?
The main point of load testing is to determine whether your site meets its performance and scalability requirements (at least according to the assumptions made in testing).
It can often be difficult to reduce the complicated multi-dimensional results of a load test to a single “thumbs up” or “thumbs down”. That said, here are a few questions we can ask ourselves as we analyze the results of a load test.
Were the assumptions realistic?
Going into a load test, we make a lot of assumptions. We make assumptions about how our users interact with the site. We make assumptions about traffic patterns. We make assumptions about the number of concurrent users who will try to use the site at any given moment.
Determining whether these assumptions are realistic is often a task for the product owner. At the very least, we as engineers owe it to the interested parties to explain and document the assumptions we make in our testing.
The quality of a load test result is only as good as the assumptions that went into it.
Did the test generate the target amount of load?
Scalability requirements can be stated in many different ways. We might say “the system must handle 500 concurrent users” or we might say “the site must handle 1000 hits per second” or even something more abstract like “the system must handle 6000 orders per hour”.
For a successful load test, we must often work backwards to translate these requirements into variables that we can control. How many virtual users does it take to generate 1000 hits per second? How many virtual users do we need to place 6000 orders per hour?
This may require trial and error.
Once a test completes, it is a good time to review whether it generated enough load to hit these targets. If the test did not generate the intended load, we may need to change the number of virtual users and re-run it. Sometimes it takes several tests to find the right parameters.
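A first estimate can come from Little's Law: the number of concurrent users equals the throughput multiplied by the time each user spends per iteration. The Python sketch below applies it to the two examples above. The iteration lengths and hits per iteration are invented for illustration, and the estimate assumes response times stay constant under load, which they often do not, so treat the result as a starting point for trial and error.

```python
import math

def users_needed(target_units, window_seconds, units_per_iteration, iteration_seconds):
    """Estimate the virtual users needed to produce target_units per window.

    Each user completes one script iteration every iteration_seconds
    (including wait times) and produces units_per_iteration hits or orders.
    """
    users = target_units * iteration_seconds / (window_seconds * units_per_iteration)
    return math.ceil(users)

# 1000 hits per second, assuming 40 hits per iteration and 20 s per iteration.
hits_users = users_needed(1000, 1, units_per_iteration=40, iteration_seconds=20)

# 6000 orders per hour, assuming 1 order per iteration and 90 s per iteration.
orders_users = users_needed(6000, 3600, units_per_iteration=1, iteration_seconds=90)

print(hits_users, orders_users)
```

Under these assumptions the estimates are 500 and 150 virtual users respectively. If response times grow under load, each iteration takes longer and more users are needed to reach the same rate.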
Did the virtual users report acceptable response times?
Once we’ve established that the test did indeed generate the intended amount of load, we should look at the response times experienced by our virtual users.
The average response time is important, but it doesn’t tell everything. What was the maximum response time? Did the response time remain acceptable even during spikes?
Your definition of “acceptable” may vary. Generally, we recommend aiming for sub-second response times on the large majority of requests. However, the right number is customer-dependent and product-dependent. It is up to you and your customers to determine what “acceptable” really means.
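The Python sketch below shows how an average can hide a slow outlier. The sample response times are invented, and the nearest-rank percentile is one common definition chosen for this example; it is not necessarily how Loadster computes its figures.

```python
import math
import statistics

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical response times in seconds, with one slow outlier.
times = [0.2, 0.3, 0.25, 0.4, 0.35, 0.3, 0.28, 0.33, 0.31, 4.8]

average = statistics.mean(times)
median = percentile(times, 50)
p90 = percentile(times, 90)
slowest = max(times)
sub_second_share = sum(t < 1.0 for t in times) / len(times)

print(f"avg={average:.3f}s median={median}s p90={p90}s max={slowest}s")
print(f"{sub_second_share:.0%} of requests were sub-second")
```

With these samples, the average is about 0.75 seconds, which looks acceptable on its own, while the maximum of 4.8 seconds shows that at least one request was far slower than the median of 0.3 seconds.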
Were there errors?
The presence of errors in a test is almost always a bad sign. Sometimes the cause of the errors is mundane, like an HTTP 404 from an incorrect step in a script. Other times it is trickier.
If socket timeouts or connection timeouts occur, it is very likely a sign that the server is overloaded. This is also the case with certain HTTP status codes like HTTP 503.
If the errors are related to heavy load, we could try reducing the load by half and re-running the test to see if they still happen. When the cause is unclear, it might make sense to play the script with a single user in the script editor, or to check the server logs for more information.