Introduction to Load Testing
What is Load Testing?
Load testing (including stress testing, spike testing, and stability testing) is the technique of simulating real users interacting with your site, to see how it performs under heavy load.
By load testing your site, you can greatly reduce the risk of site failure in the event of a traffic surge in the future.
The Upside & Downside of Heavy Traffic
We’ve all seen websites crash under heavy traffic.
Traffic spikes can occur when you launch a new product or feature, run a marketing campaign, or get featured on a popular blog or news site. Sometimes these surges in traffic result from careful planning and execution, and sometimes they take you completely by surprise.
High-traffic events are high stakes for your business! If your site handles the extra traffic and delivers a great experience to everyone, you win. But if your site crashes, it will be a stressful and costly failure.
Reducing Risk from High Traffic Events
Load testing is a way to prepare for heavy traffic, by simulating a lot of users hitting your site at once.
By testing ahead of time, you can find out how your site will respond when the flood of real users hits. You’ll get advance warning of your site’s limitations.
Even better, you can use the findings from your load tests to tune and optimize your site until it performs flawlessly.
Load and stress testing can help you answer questions like these…
- Will my site perform well on our busiest day of the year?
- Do I have enough cloud infrastructure or hardware to run this application at scale?
- Is autoscaling working properly?
- Does my app deliver fast response times and a good user experience even under peak load?
- When pushed to the breaking point, does my application recover gracefully, or crash hard and lose data?
- Are there concurrency issues or “heisenbugs” in my application that only surface under heavy load?
- Do I have memory leaks or resource exhaustion that appear after extended usage?
- Are our redundancy and failover systems in place and working properly?
Answering these questions with concrete data is critical if you want to deliver a fast and seamless user experience to your customers.
Acceptance Load Testing
A primary goal of most load testing efforts is to answer the question: will my site deliver acceptable performance even under peak load?
To have a chance at answering that question, you’ll first need specific requirements defining:
- The step-by-step user behavior or behaviors to be tested.
- The worst-case acceptable performance threshold for an individual user.
- The peak number of concurrent users or transactions per second.
There are different ways you could state your concrete requirements, but it’s important to address all three dimensions: user behavior, acceptable performance thresholds, and peak load.
If you’re working with colleagues or stakeholders, the process of gathering these requirements might not sound fun, but it’s an opportunity to align everyone’s expectations around the testing effort.
Having clear requirements makes it easy for everyone to agree whether the load test passed or failed. That way, everyone on your team arrives at the same conclusion about whether the site is ready for heavy traffic.
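One way to keep the three dimensions unambiguous is to write them down as data and check each test result against them. The sketch below is only an illustration: the `Requirement` fields, the `evaluate` helper, and the example numbers are all assumptions made up for this guide, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical acceptance criteria for one load test."""
    behavior: str                # the step-by-step user flow being simulated
    max_response_seconds: float  # worst-case acceptable time for a single user
    peak_concurrent_users: int   # load level the site must sustain


def evaluate(req: Requirement, users_reached: int,
             response_seconds: list[float]) -> bool:
    """Return True only if the test reached peak load and every
    measured response stayed within the acceptable threshold."""
    if users_reached < req.peak_concurrent_users:
        return False  # the test never applied the required load
    if not response_seconds:
        return False  # no measurements means nothing was verified
    return max(response_seconds) <= req.max_response_seconds


# Example numbers are invented for illustration.
checkout = Requirement("browse, add to cart, check out", 3.0, 500)
passed = evaluate(checkout, users_reached=500,
                  response_seconds=[0.8, 1.2, 2.9])
```

With the criteria stated this way, "did the test pass?" has one answer that everyone can check for themselves.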
Exploratory Load Testing
Load testing techniques can be useful even if you don’t yet have specific performance or scalability requirements.
In the early stages of a project, you may want to run some exploratory load tests just to see what performance and scalability bottlenecks you encounter.
Creating some realistic test scripts and gradually ramping up the load (with more and more concurrent bots) will reveal bottlenecks that you can resolve through tuning or code changes.
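A gradual ramp can be described as a simple schedule of load steps. The following is a minimal sketch, assuming equal-sized steps; the function name `ramp_schedule` and its parameters are hypothetical, and real tools each have their own way of configuring a ramp.

```python
def ramp_schedule(peak_users: int, steps: int,
                  step_seconds: int) -> list[tuple[int, int]]:
    """Return (start_second, target_users) pairs that raise the load
    in equal steps until it reaches peak_users."""
    if peak_users < 1 or steps < 1 or step_seconds < 1:
        raise ValueError("all arguments must be positive")
    schedule = []
    for i in range(1, steps + 1):
        # Integer ceiling division, so the final step lands exactly on the peak.
        target = -(-peak_users * i // steps)
        schedule.append(((i - 1) * step_seconds, target))
    return schedule


# Four one-minute steps up to 100 bots.
plan = ramp_schedule(peak_users=100, steps=4, step_seconds=60)
```

Holding each step for a while before adding more bots makes it easier to see which load level first causes trouble.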
Repeatable Load Testing
Tuning a site for better performance is a methodical process, and shouldn’t be done haphazardly. This gets much easier when you have a way to measure the impact of each tuning change.
Running your load test after each change to your site or environment tells you whether the change made things better, made them worse, or had no effect.
In fact, tuning a site for better scalability in the absence of a repeatable load test would be clumsy and error-prone.
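As a rough illustration of measuring a tuning change, the sketch below compares the median response time of two runs of the same test. The `compare_runs` name and the 5% noise tolerance are assumptions chosen for the example; pick a tolerance that matches how much your own results vary from run to run.

```python
from statistics import median


def compare_runs(before: list[float], after: list[float],
                 tolerance: float = 0.05) -> str:
    """Classify a tuning change by comparing median response times.

    tolerance is the relative change treated as noise
    (5% is an arbitrary assumption).
    """
    if not before or not after:
        raise ValueError("both runs need at least one measurement")
    base = median(before)
    if base <= 0:
        raise ValueError("response times must be positive")
    change = (median(after) - base) / base
    if change < -tolerance:
        return "better"
    if change > tolerance:
        return "worse"
    return "no effect"


verdict = compare_runs(before=[1.0, 1.2, 1.1], after=[0.8, 0.9, 0.85])
```

The comparison is only meaningful if both runs use the same script, the same load level, and the same environment.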
Other Types of Load Testing
You can use load testing tools and techniques in different ways, to test other dimensions of your site’s performance and scalability.
The term “stress testing” is often used interchangeably with load testing, but there’s a slight distinction.
Stress testing is less concerned with determining the breaking point or validating requirements, and more concerned with what happens when the site is pushed beyond the point of failure.
Some applications recover cleanly on their own once the excess traffic subsides. Others require a restart. In some cases, heavy traffic leaves a web application in a broken state, with stuck database transactions or exhausted resource pools. Worse still, excessive load can sometimes leave half-baked, damaged, or incomplete data in a database, resulting in permanent damage to your customer data.
Stress testing a web application with excessive load can tell you if your site can recover gracefully on its own, without permanent damage, once the traffic subsides.
Web applications sometimes run smoothly for a while and then bog down as some resource is gradually exhausted. Eventually, when the resource is fully exhausted, the application crashes altogether.
A stability test simulates a prolonged period of moderate-to-high load, to ferret out possible memory leaks and resource exhaustion.
If you run an extended load test and see performance get gradually worse over time, it’s an indication that some resource is being leaked or exhausted. It might be a hard limit like physical memory or heap memory, or a soft limit like an internal cache filling up or database connections leaking.
A stability load test makes it a lot easier to catch resource exhaustion and fix it before it causes problems in production.
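One simple way to spot gradual degradation in the results of an extended test is to fit a straight line to response time over elapsed time and look at its slope. This is a minimal sketch under the assumption that your tool can export (elapsed time, response time) pairs; the function name is hypothetical, and a least-squares line is just one of several reasonable ways to detect a trend.

```python
def response_time_slope(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of response time against elapsed time.

    samples holds (elapsed_seconds, response_seconds) pairs. A clearly
    positive slope means responses are slowing down as the test runs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        raise ValueError("samples must cover more than one point in time")
    covariance = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    return covariance / spread


# Invented data: responses slow by half a second every minute.
slope = response_time_slope([(0, 1.0), (60, 1.5), (120, 2.0)])
```

A slope near zero over a long run is a good sign; a steady upward slope is a cue to look at memory, caches, and connection pools.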
If your site receives a quick spike in traffic, will it handle the spike gracefully and recover quickly afterwards, or will it break down and remain sluggish?
It’s fairly common for web applications to suffer bad performance for a while after a large spike in traffic, sometimes requiring a restart. This might happen because an internal resource is exhausted, because old requests are queued up and causing backpressure, or because your autoscaling configuration starts to snowball.
Because the aftermath of a traffic spike varies so much from one application to another, the most reliable way to find out is to run a load test that simulates a short but severe spike.
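To put a number on "recovers quickly," you could measure how long response times take to settle back near their normal level once the spike ends. The sketch below assumes you already know the pre-spike baseline; `recovery_seconds` and its 20% tolerance are hypothetical choices for illustration.

```python
from typing import Optional


def recovery_seconds(samples: list[tuple[float, float]], spike_end: float,
                     baseline: float, tolerance: float = 0.2) -> Optional[float]:
    """Seconds after spike_end until response times drop to within
    `tolerance` of baseline and stay there for the rest of the test.

    samples holds (elapsed_seconds, response_seconds) pairs.
    Returns None if the site never settles back down.
    """
    limit = baseline * (1 + tolerance)
    after_spike = sorted(s for s in samples if s[0] >= spike_end)
    recovered_at = None
    # Walk backwards from the end of the test to find where the
    # final unbroken stretch of acceptable responses begins.
    for elapsed, response in reversed(after_spike):
        if response > limit:
            break
        recovered_at = elapsed
    if recovered_at is None:
        return None
    return recovered_at - spike_end


# Invented data: the spike ends at 10 seconds.
data = [(0, 0.5), (10, 4.0), (20, 3.0), (30, 1.0), (40, 0.55), (50, 0.5)]
settle_time = recovery_seconds(data, spike_end=10, baseline=0.5)
```

A result of `None` after a long cool-down period suggests the application is stuck in a degraded state and may need a restart.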
Continuous Performance Testing
If your team frequently releases changes to your web application, it’s good to measure the performance and scalability impact of each revision.
By running the same load test against each build, you’ll be able to compare high-level performance metrics to see if performance has improved or degraded. If the test reveals a change in average response times, nth-percentile response times, or error rate, you can investigate and take action accordingly.
Many load testing tools can be wired into a CI pipeline for continuous load testing.
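Here is a sketch of what a per-build comparison might look like, assuming each build's test produces a list of (response time, succeeded) pairs. The `Summary`, `summarize`, and `regressions` names and the thresholds are all hypothetical; the percentile uses the nearest-rank method, which is one common definition among several.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Summary:
    average: float     # mean response time in seconds
    p95: float         # 95th-percentile response time in seconds
    error_rate: float  # fraction of requests that failed


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile for a whole-number pct from 0 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def summarize(results: list[tuple[float, bool]]) -> Summary:
    """results holds (response_seconds, succeeded) pairs for one build."""
    if not results:
        raise ValueError("no results to summarize")
    times = [t for t, _ in results]
    failures = sum(1 for _, ok in results if not ok)
    return Summary(mean(times), percentile(times, 95), failures / len(results))


def regressions(baseline: Summary, current: Summary, allowed: float = 0.10,
                allowed_error_increase: float = 0.01) -> list[str]:
    """Names of the metrics that got worse by more than the allowed margin.

    The 10% and one-percentage-point margins are arbitrary assumptions."""
    found = []
    if current.average > baseline.average * (1 + allowed):
        found.append("average")
    if current.p95 > baseline.p95 * (1 + allowed):
        found.append("p95")
    if current.error_rate > baseline.error_rate + allowed_error_increase:
        found.append("error_rate")
    return found


# Invented results for two builds.
old_build = summarize([(1.0, True)] * 20)
new_build = summarize([(1.0, True)] * 18 + [(3.0, True), (3.0, False)])
worse = regressions(old_build, new_build)
```

In a pipeline, a non-empty list from a check like this could fail the build or simply flag it for a closer look.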
The Load Testing Process
At a high level, the process of running your first load test will go something like this: first create a script, then run a load test, and finally interpret the results (and most likely, repeat the test quite a few times).
Creating Test Scripts
Every load testing tool is different, but in general you’ll start by creating a script that tells the tool how to simulate a real user visit. Usually, scripts should imitate a real user as closely as possible, with realistic wait times and a typical flow through the site.
Some tools (including Loadster) support testing with real web browsers, while others require scripting to be done at the protocol layer. Your choice of tool might vary depending on whether you’re testing a static website, dynamic web application, or API.
Running Load Tests
For a test to be a load test, it needs to simulate multiple users hitting your site at the same time. Load testing tools spawn many bots (virtual users) to execute your scripts in parallel.
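To show the basic idea of bots running in parallel, here is a toy sketch that uses a thread per bot and times each visit. It is not how any particular tool works internally: `run_bots` is a hypothetical helper, and `fake_visit` is a stand-in that only sleeps, where a real script would make requests to your site.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_bots(visit: Callable[[], None], bot_count: int,
             visits_per_bot: int) -> list[float]:
    """Run `visit` from bot_count threads at once and return the
    duration of every visit in seconds."""

    def bot(bot_id: int) -> list[float]:
        durations = []
        for _ in range(visits_per_bot):
            started = time.perf_counter()
            visit()
            durations.append(time.perf_counter() - started)
        return durations

    # One worker thread per bot, so all bots are active simultaneously.
    with ThreadPoolExecutor(max_workers=bot_count) as pool:
        per_bot = list(pool.map(bot, range(bot_count)))
    return [d for durations in per_bot for d in durations]


def fake_visit() -> None:
    """Stand-in for a scripted user visit."""
    time.sleep(0.01)


durations = run_bots(fake_visit, bot_count=5, visits_per_bot=3)
```

A single process like this runs out of steam well before large load levels, which is one reason bigger tests are usually distributed across several machines.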
Many free tools generate the load from a single process that you run on your own machine, and this often works fine for quick load tests with smaller amounts of load. For larger distributed load tests, it can be helpful to have a service that spins up cloud instances on demand from your choice of cloud regions, and takes care of the deployment and test infrastructure for you.
Analyzing and Reporting Results
Nearly every load testing tool generates some kind of output, with the tool’s measurements of how well your site performed and what kind of errors occurred. You’ll want to review these metrics during and after a load test to tell if the test was a success and what kinds of changes might be needed.
Data analysis and reporting is an important and sometimes time-consuming part of load testing. Some tools do a lot of the reporting and data analysis for you with automatic graphs and reports, while others take the philosophy of emitting raw data that you parse and analyze yourself.
Getting Started with Load Testing
At this point you might be a bit overwhelmed by all the things there are to load test.
It’s true: if you load tested all the things all the time, it would take substantial effort. We don’t recommend that.
The good news is, reducing the risk of crashing from high traffic events is one of those things where 20% of the effort yields 80% of the results. You can reduce your risk substantially with even a few rounds of load tests.
Load testing is actually quite fun too, so we recommend selecting a tool and jumping right in. Before you know it, you’ll be running load tests and improving your site’s performance and scalability.
Your future self will thank you when your site doesn’t crash.
If you’d like to share feedback about this guide, or if you run into challenges with your load testing, we’d like to hear from you at firstname.lastname@example.org.