Capacity Planning For Web Apps
What’s Capacity Planning?
Capacity planning means understanding how much traffic your web application will receive, and planning to ensure it has the capacity to handle the traffic.
The term “capacity planning” has heavy enterprise connotations. It brings to mind a drawn-out, meeting-intensive process that culminates in some kind of document detailing the hardware and staffing requirements for hosting a web application.
The capacity planning process can be lengthy and onerous, if your situation warrants it… but it doesn’t have to be. Planning for capacity can be just as important for small agile software teams as it is for enterprises.
Is Capacity Planning Really Necessary?
If your web application is trivial or experimental, or if you have no idea what future usage to expect, capacity planning isn’t necessary. Just start small and make adjustments as you go.
However, capacity planning becomes important if any of the following is true:
- You expect increased usage in the future.
- The application is performing poorly under current usage.
- The hosting environment is expensive and might be overpowered.
These situations present an opportunity to plan for present and future capacity, and to adjust your application and environment accordingly.
If you don’t, you’ll either waste money (if the environment is overpowered), or fail to deliver a good experience to your users (if it’s underpowered).
Measuring Current Usage and System Capacity
If you don’t know the current capacity of your deployed web application, you’ll be unable to plan for future capacity.
How many concurrent users can the application handle while delivering acceptable response times without errors? How much throughput can it handle, in terms of key transactions per minute?
It’s wise to define capacity requirements specific to your application and users. These capacity requirements (one kind of what are often called “non-functional requirements”) set the bar for what is deemed acceptable.
Following are some examples of capacity requirements that could be used as a basis for capacity planning.
Maximum Baseline Response Time
This represents the longest your users should have to wait for a page to load under ideal circumstances. “Ideal circumstances” means everything is functioning well and the system is running at baseline, that is, it isn’t under much load.
Maximum Peak Response Time
This is the longest your users should ever have to wait for a page to load, even under peak traffic when the system is under heavy load or some bottleneck is saturated. If response times ever exceed this limit, you are officially over capacity. Capacity planning fail!
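To make the two response time requirements checkable, you can compare measured response times against each limit. The following is a minimal sketch, not a prescribed method: the limits (1.0 and 3.0 seconds), the sample measurements, and the function names are all hypothetical. By default it checks the slowest request, matching “the longest your users should have to wait”; passing a lower percentile such as 95 is a common alternative that ignores rare outliers.

```python
import math

def percentile(samples, pct):
    """Return the nearest-rank percentile (0 < pct <= 100) of a list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def meets_requirement(response_times_s, limit_s, pct=100):
    """True if the chosen percentile of response times is within the limit.

    pct=100 checks the slowest request; a lower value ignores outliers.
    """
    return percentile(response_times_s, pct) <= limit_s

# Hypothetical limits and measurements, for illustration only.
MAX_BASELINE_RESPONSE_S = 1.0
MAX_PEAK_RESPONSE_S = 3.0

baseline_samples = [0.21, 0.35, 0.28, 0.90, 0.31]
peak_samples = [1.2, 2.4, 3.6, 1.9, 2.2]

print(meets_requirement(baseline_samples, MAX_BASELINE_RESPONSE_S))  # True
print(meets_requirement(peak_samples, MAX_PEAK_RESPONSE_S))          # False
```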
Baseline Throughput
Baseline throughput refers to how many requests or key transactions the system handles in a given time window. To measure baseline throughput, you can look at your logs or analytics to get a daily or weekly total, and then divide it by the time interval. You’ll arrive at something like “22 requests per second” or “134 logins per minute” or “1200 shipments per hour”.
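The arithmetic is simple enough to sketch. In the example below, the daily totals are made-up numbers chosen so the results match the figures quoted above, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def baseline_throughput(total_count, interval_seconds, per_seconds=1):
    """Average number of events per `per_seconds` over the measured interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return total_count * per_seconds / interval_seconds

# Hypothetical daily totals, e.g. taken from logs or analytics.
requests_per_day = 1_900_800
logins_per_day = 192_960

print(baseline_throughput(requests_per_day, SECONDS_PER_DAY))     # 22.0 requests per second
print(baseline_throughput(logins_per_day, SECONDS_PER_DAY, 60))   # 134.0 logins per minute
```

Keep in mind that a full-day average includes the quiet hours, so it will sit well below what the system sees at its busiest.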
Peak Throughput
Peak throughput is the number of requests or key transactions the system currently handles in its busiest time window. This might be once a year on Black Friday, or it might be every Tuesday morning, depending on your business. What’s important is to determine the highest throughput the system is currently known to handle, while delivering acceptable response times and error-free responses to your users. You’ll use this as a basis for estimating future peak capacity.
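One way to find the busiest window is to group request timestamps into fixed-size buckets and take the fullest one. This sketch assumes the timestamps are Unix epoch seconds and have already been filtered down to requests that were served successfully and within your response time limit; the function name and sample data are hypothetical.

```python
from collections import Counter

def peak_throughput(epoch_seconds, window_seconds=60):
    """Return (window_start, count) for the busiest fixed-size window.

    If several windows tie, the earliest one is returned.
    """
    if not epoch_seconds:
        raise ValueError("no timestamps")
    # Map each timestamp to the index of the window it falls into.
    buckets = Counter(int(t // window_seconds) for t in epoch_seconds)
    index, count = max(buckets.items(), key=lambda item: (item[1], -item[0]))
    return index * window_seconds, count

# Hypothetical timestamps: three requests in the first minute, four in the second.
timestamps = [0, 10, 59, 60, 61, 62, 119, 130]
print(peak_throughput(timestamps))  # (60, 4)
```

Fixed windows can understate the true peak when a burst straddles a boundary, so treat the result as a lower bound.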
Estimating Future Growth and Capacity Needs
Once you have an accurate picture of the system’s current capacity and usage, your next step is to plan for future capacity.
Nobody can foretell the future, but you can estimate it based on current growth trends, and then pad your estimate by some reasonable amount to ensure that extra capacity is available to handle unforeseen traffic spikes.
You might have to work with other stakeholders at your company to understand trends and apply growth estimates. This might require lots of back-and-forth dialog, which you can guide by asking the right questions.
For instance, let’s say you work at a shipping company where the key transaction is shipping a package. Forecasting future growth would involve questions like… How many packages a day are being shipped now? What time of day is the busiest? How many packages an hour get shipped at the busiest part of the day? Does it get even busier around Christmas? What’s our monthly and yearly growth rate? Does the executive team have growth forecasting models? If the business thrives, how much could we grow in the best-case scenario?
Eventually you’ll arrive at some growth estimates for various dates in the future. For example, three months from now you might expect traffic at 120% of its current level, in six months 150%, and in 12 months 200%. These estimates will vary depending on your company and application.
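Combining a growth estimate with the padding mentioned earlier gives a target throughput for each date. In this sketch the growth factors are the example figures above, while the current peak of 40 requests per second and the 25% headroom are arbitrary assumptions you would replace with your own numbers.

```python
def required_throughput(current_peak, growth_factor, headroom=0.25):
    """Peak throughput to plan for: forecast growth plus a safety margin."""
    if growth_factor <= 0 or headroom < 0:
        raise ValueError("growth_factor must be positive and headroom non-negative")
    return current_peak * growth_factor * (1 + headroom)

current_peak_rps = 40  # hypothetical measured peak, in requests per second

# Months from now -> expected traffic as a multiple of the current level.
forecast = {3: 1.2, 6: 1.5, 12: 2.0}

for months, factor in forecast.items():
    target = required_throughput(current_peak_rps, factor)
    print(f"{months:>2} months: plan for about {target:.0f} requests per second")
```

These targets describe the load you expect to serve, not the number of servers you need.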
Load Testing to Simulate Future Usage
At this point it might be tempting to naively apply the forecasted growth rate, and assume you need 20% more servers in three months, 50% more servers in six months, and double the servers in 12 months. But that’s probably wrong! The scalability of web applications is rarely linear.
The better way is to run load tests to simulate each level of predicted future usage on your web application, to see if it has sufficient capacity to handle it.
If the application performs well in the load test, even at higher than current usage levels, it’s likely you have sufficient capacity to grow traffic to that level.
If it performs poorly, with slow response times or errors, the application and environment are not yet equipped to scale to meet future usage. This is a valuable finding because it allows you to repeat the failed test with different configurations to see what kind of environment is necessary to support this load.
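To show what a load test measures, here is a deliberately small driver written with Python’s standard library. It is a sketch only: `send_request` is a hypothetical callable you would replace with a real request against a test environment, and `fake_request` merely sleeps so the example runs anywhere. Dedicated load testing tools do far more than this.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load_test(send_request, concurrency, total_requests):
    """Call send_request() total_requests times using `concurrency` threads.

    Returns per-request latencies (seconds), the number of failed calls,
    and the overall throughput in requests per second.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            send_request()
            ok = True
        except Exception:
            ok = False  # any exception counts as a failed request
        return time.perf_counter() - start, ok

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))
    elapsed = time.perf_counter() - started

    return {
        "latencies": [latency for latency, _ in results],
        "errors": sum(1 for _, ok in results if not ok),
        "throughput": total_requests / elapsed if elapsed > 0 else 0.0,
    }

def fake_request():
    """Stand-in for a real HTTP call (hypothetical)."""
    time.sleep(0.01)

result = run_load_test(fake_request, concurrency=5, total_requests=20)
print(f"slowest: {max(result['latencies']):.3f}s, errors: {result['errors']}, "
      f"throughput: {result['throughput']:.0f} requests/second")
```

The slowest latency and the error count are what you would compare against your maximum peak response time and your requirement for error-free responses.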
You may need to repeat your load tests multiple times, with different amounts of load (to simulate different levels of future usage), and with different environment configurations (to test which configuration is best to support each amount of load).
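Once you have results for several load levels and configurations, picking a right-sized environment becomes a lookup: for each load level, take the cheapest configuration that met your requirements. The results table below is invented for illustration, and the field names (`load`, `config`, `cost`, `response_s`, `error_rate`) are assumptions, not the output format of any particular tool.

```python
def smallest_passing_config(results, max_response_s, max_error_rate=0.0):
    """For each load level, pick the cheapest configuration that passed.

    Returns {load: config name, or None if no configuration passed}.
    """
    choice = {}
    for row in sorted(results, key=lambda r: r["cost"]):
        choice.setdefault(row["load"], None)
        passed = (row["response_s"] <= max_response_s
                  and row["error_rate"] <= max_error_rate)
        if passed and choice[row["load"]] is None:
            choice[row["load"]] = row["config"]
    return choice

# Hypothetical results: load is a multiple of current peak traffic,
# response_s is the slowest response observed during the test.
results = [
    {"load": 1.2, "config": "2 servers", "cost": 2, "response_s": 1.8, "error_rate": 0.0},
    {"load": 1.2, "config": "3 servers", "cost": 3, "response_s": 1.1, "error_rate": 0.0},
    {"load": 2.0, "config": "2 servers", "cost": 2, "response_s": 9.5, "error_rate": 0.04},
    {"load": 2.0, "config": "3 servers", "cost": 3, "response_s": 4.2, "error_rate": 0.0},
]

print(smallest_passing_config(results, max_response_s=3.0))
# {1.2: '2 servers', 2.0: None}
```

A `None` entry means no tested configuration handled that load, which tells you to test larger environments or look for a bottleneck that more hardware won’t fix.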
Successful Capacity Planning Outcomes
A capacity planning effort is a success if it results in quantifiable, actionable data to guide your decisions and help you design the right-sized environment for your application.
You’ll know if the environment is too small, too large, or just about right for each stage of growth. An environment that is too small will crash or bother your users with sluggish responses. An environment that is too large wastes money with unnecessary redundancy.
You can also better anticipate what new costs and complications you will face as your traffic grows. You may even get advance warning of serious architectural issues in your application or bottlenecks that can’t simply be solved by throwing more hardware at them, giving you the luxury of time to prepare for the next stage of growth.