Load Testing Best Practices & Pitfalls

No test or simulation can completely eliminate the risk of a site failure. Even the most robust and scalable sites in the world are vulnerable to black swan events (human error, acts of nature, or otherwise) bringing them down once in a while.

But with some testing in the right areas you can greatly improve your chances. And if you load test the right things in the right ways, it’s actually not that difficult to reduce the risk of a crash due to high traffic by 90% or even 99%.

The important things for making your load testing count are…

Accurately Simulating User Behavior

Do you understand how your users behave on the site well enough to automate it?

Certain user actions are a lot more time-consuming and taxing on your web application than others. As far as your site’s infrastructure is concerned, serving up static pages might be a breeze, but a shopping cart checkout flow is more expensive because it requires maintaining user state, accessing a data store, communicating with a third-party payment processor and fraud detection services, and so on.

A good load test should simulate the important user flows on your site as closely as possible, with an emphasis on the riskiest and most expensive actions. Create at least one test script for each common user flow, and make sure to parameterize the script with dynamic data when necessary, so that each user journey is unique. Testing with unique user data is important: unless your site is very simple, you’ll want to avoid the pitfall of just sending a flood of identical requests and calling it a load test.

Do:

  • Take the time to understand the user flows that are most important to your site’s performance and scalability, and simulate them accurately in your script.
  • Include all relevant steps in the flow, so your bots hit your servers just like a real user would.
  • If you’re testing an API or dynamic web application, parameterize any request data that should be unique per user.
  • Play the script with a single bot to make sure it runs properly, before launching a load test with many bots.

Don’t:

  • Remove wait times or leave inaccurate wait times in your script, because unrealistic wait times will cause unrealistic results.
  • Load test a dynamic site or web form by submitting the exact same thing every time from every bot.
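
As a rough sketch of the advice above, here’s how unique per-user data and realistic wait times might be generated for a registration flow. This is plain Python rather than Loadster’s own script format, and the paths, field names, and the 2 to 8 second wait range are all made up for illustration. It only builds the plan for each bot; it doesn’t send any requests.

```python
import random
import uuid


def make_user(bot_id: int) -> dict:
    """Build unique (hypothetical) registration data for one simulated user."""
    token = uuid.uuid4().hex[:8]
    return {
        "username": f"loadtest-{bot_id}-{token}",
        "email": f"loadtest-{bot_id}-{token}@example.com",
    }


def think_time(rng: random.Random, low: float = 2.0, high: float = 8.0) -> float:
    """Pick a wait in seconds between steps. The range is an assumption;
    base yours on how long real users actually pause."""
    return rng.uniform(low, high)


def build_flow(bot_id: int, rng: random.Random) -> list:
    """Return the ordered steps for one user journey as
    (method, path, payload, wait_after_seconds) tuples."""
    user = make_user(bot_id)
    return [
        ("GET", "/register", None, think_time(rng)),
        ("POST", "/register", user, think_time(rng)),
        ("GET", "/welcome", None, 0.0),
    ]


# Every bot gets its own data and its own wait times.
for bot_id in range(3):
    for method, path, payload, wait in build_flow(bot_id, random.Random(bot_id)):
        print(method, path, payload, f"then wait {wait:.1f}s")
```

Printing the plan for a handful of bots is a cheap way to confirm that no two journeys submit the same data before you scale up.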

Quantifying Your Scalability Requirements

The term “requirements” has a somewhat looser meaning when it comes to load testing than it does with functional testing. Because performance and scalability are multidimensional and multivariate, it can actually be quite tricky to state the requirements as binary pass-or-fail assertions.

You’ll probably start with rather simplistic requirements from your partners and stakeholders, if you’re lucky enough to even get that. They’ll probably say something like:

  • “The landing page should load in less than 1 second.”
  • “We need to process 7500 registrations per hour.”
  • “Response times should be less than 2 seconds.”
  • “We need to be able to handle 3X our previous peak load.”

These requirements all sound pretty reasonable on the surface, right?

The problem with all of these requirements is that they leave too many things open to interpretation. Some of them state the acceptable response time with no mention of load; others state the required throughput with no mention of acceptable response times. And not all of them describe what user behavior is being tested in the first place.

In fact, all of these requirements (as stated above) are likely to pass under some circumstances and not under others.
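
To see why a throughput number alone is ambiguous, here’s a small arithmetic sketch. It converts the 7500 registrations per hour into a per-second rate, then uses Little’s Law (concurrent users = throughput × time per iteration) to estimate how many simultaneous users that implies. The response times and think times are assumed values picked for illustration, and the sketch treats each registration as a single transaction.

```python
def per_second(per_hour: float) -> float:
    """Convert an hourly rate to a per-second rate."""
    return per_hour / 3600.0


def concurrent_users(throughput_per_sec: float,
                     response_time_sec: float,
                     think_time_sec: float) -> float:
    """Little's Law for a closed system: users = throughput * (response + think)."""
    return throughput_per_sec * (response_time_sec + think_time_sec)


rate = per_second(7500)  # a little over 2 registrations per second

# Same throughput, different (assumed) user pacing:
patient = concurrent_users(rate, response_time_sec=1.5, think_time_sec=30.0)
hurried = concurrent_users(rate, response_time_sec=1.5, think_time_sec=5.0)

print(f"{rate:.2f} registrations/sec")
print(f"about {patient:.0f} concurrent users with 30s think time")
print(f"about {hurried:.0f} concurrent users with 5s think time")
```

The same hourly target can mean a few dozen concurrent users or a handful, depending on how the users are paced, which is part of why the requirement needs to spell out more than one dimension.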

A useful scalability requirement needs to specifically account for at least three things: the user behavior, the load on the site, and the acceptable response time.

With that in mind, let’s try rewriting the original requirements above to be a bit more specific:

  • “The landing page should load in less than 1 second, when 5000 concurrent users are performing the 4-step Checkout flow.”
  • “The system needs to process 7500 registrations per hour, while maintaining an average response time of 1.5 seconds or less.”
  • “The 90th percentile response time should be less than 2 seconds, when 10,000 concurrent users are browsing the site on a variety of key pages.”
  • “We need to be able to handle 150,000 home page views per hour, with an average response time of 2 seconds or less.”

These new requirements are more concrete because they now account for the dimensions of user behavior, load, and acceptable response time. It’s now easier to determine conclusively whether they passed or failed, and much less is left open to assumption or misunderstanding.
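
One way to picture a requirement that covers all three dimensions is as a small record plus a pass/fail check. The sketch below is only an illustration: the class, field names, and the nearest-rank percentile method are assumptions, not how Loadster or any other tool evaluates results.

```python
import math
from dataclasses import dataclass


@dataclass
class Requirement:
    flow: str               # the user behavior being simulated
    concurrent_users: int   # the load that must be reached
    pct: float              # which percentile to judge, e.g. 90
    max_seconds: float      # the acceptable response time


def nearest_rank_percentile(samples: list, pct: float) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("no response times recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def passed(req: Requirement, users_reached: int, response_times: list) -> bool:
    """Pass only if the required load was reached AND the percentile is fast enough."""
    if users_reached < req.concurrent_users:
        return False  # the test never applied the load the requirement describes
    return nearest_rank_percentile(response_times, req.pct) < req.max_seconds


req = Requirement("browse key pages", 10_000, 90, 2.0)
times = [0.5] * 9 + [3.0]            # made-up sample of response times, in seconds
print(passed(req, 10_000, times))    # required load reached
print(passed(req, 5_000, times))     # only half the required load reached
```

Note that the check fails when the target load was never reached, even if every response was fast. A quick result under too little load says nothing about the requirement.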

Load Testing in Your Real Environment

Your site’s infrastructure and environment will have a big impact on its performance and ability to handle sustained levels of throughput.

Ideally, the environment you test should have the same hardware and software configuration (virtual or physical) as your production site. It can even be your production site – as long as you’re aware of the risk that your load test might crash it.

If you’re unable or unwilling to load test your actual production environment, it’s crucial to test one that’s as similar as possible. Extrapolating load test results to a different environment from the one you tested is risky, and should be avoided! Scalability of complex systems is rarely linear.
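
To get a feel for why linear extrapolation is risky, consider a deliberately oversimplified model: a single server handling requests one at a time, using the textbook M/M/1 queueing formula (mean response time = service time / (1 − utilization)). Real sites are far more complicated, and the 20 ms service time is an assumed number, but even this toy model is far from linear.

```python
def mean_response_time(service_time_sec: float, arrivals_per_sec: float) -> float:
    """Mean response time of a textbook M/M/1 queue (a simplifying assumption)."""
    utilization = service_time_sec * arrivals_per_sec
    if utilization >= 1.0:
        return float("inf")  # the server can't keep up; the queue grows without bound
    return service_time_sec / (1.0 - utilization)


SERVICE_TIME = 0.02  # assumed: 20 ms of work per request, so capacity is 50 requests/sec

for rate in (10, 20, 40, 45, 49):
    response_ms = mean_response_time(SERVICE_TIME, rate) * 1000
    print(f"{rate:>2} requests/sec -> about {response_ms:.0f} ms")
```

In this model, doubling the load from 20 to 40 requests per second roughly triples the response time, and the last few requests per second before saturation cost far more than the first few. Results measured at one load level or on one hardware configuration don’t scale to another by simple multiplication.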

Interpreting Load Test Results

The outcome of a load test is impacted by the assumptions that went into it. When you share test results with your colleagues, try to call out all inputs and assumptions as possible caveats, because they affect the applicability of your test results.

Pro Tip: As a tester, documenting all the assumptions that went into your test is not only responsible, it’s also a good way to cover your ass!

Examples of inputs and assumptions that you should call out in your communications with stakeholders include:

  • The specific user flows that you simulated in your test scripts
  • The servers, configuration, and network infrastructure that you tested
  • The specific version of your site or application that you tested
  • The amount of load, both in terms of simulated users (bots) and throughput (transactions/sec, bytes/sec)

Remember, the outcome of your test is only as meaningful as the assumptions that went into it!

A useful load test is one where you can state, with a high degree of confidence, that given the assumptions listed above, the site passed or failed the requirements.
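
If it helps, you can keep those inputs and assumptions in a small structured record that travels with the results. The sketch below is one hypothetical way to do that; the class, field names, and sample values are invented for illustration and aren’t part of any tool.

```python
from dataclasses import dataclass


@dataclass
class LoadTestAssumptions:
    flows: list                  # user flows simulated in the test scripts
    environment: str             # servers, configuration, network infrastructure
    app_version: str             # version of the site or application tested
    bots: int                    # peak number of simulated users
    transactions_per_sec: float  # peak throughput observed

    def summary(self) -> str:
        """Render the assumptions as a short caveat block for a report."""
        lines = [
            "Assumptions behind this test:",
            f"  Flows simulated: {', '.join(self.flows)}",
            f"  Environment: {self.environment}",
            f"  Version tested: {self.app_version}",
            f"  Load: {self.bots} bots, {self.transactions_per_sec:.1f} transactions/sec",
        ]
        return "\n".join(lines)


# Made-up example values.
assumptions = LoadTestAssumptions(
    flows=["Registration", "Checkout"],
    environment="staging cluster, same instance sizes as production",
    app_version="v2.3.1",
    bots=5000,
    transactions_per_sec=41.7,
)
print(assumptions.summary())
```

Pasting a block like this at the top of every results summary makes it harder for anyone to apply your numbers to a scenario you never tested.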

Selecting the Right Testing Tools

Lots of tools can generate load against a server. Besides Loadster, people have run successful load tests with homegrown solutions, open source libraries and tools, and other commercial tools.

Depending on your constraints (time vs. money, simplicity vs. flexibility, etc.) you might choose a different tool or even a combination of tools. Most of the load testing concepts and best practices apply regardless of what tool you use. However, in this manual we’ll mostly focus on the specifics of load testing with Loadster.

Loadster is a strong choice when you have some budget for tools, but you don’t have unlimited time to deploy and configure your own test infrastructure, parse and analyze raw data to understand your test results, and draft long manual reports. To borrow an old phrase, Loadster’s aim is to “make the easy things easy and the hard things possible”.