Capacity Planning (For The Rest Of Us)

Today I heard a term that I used to hear all the time, and haven’t heard at all in the last couple of years. The term is capacity planning. Sounds very important, doesn’t it? It’s one of those “enterprise” words that echo through conference calls in big companies, and that are also really useful to consultants selling their services. Some of us may not be huge fans of the culture and slow, lumbering processes the word evokes… but whatever you want to call it, capacity planning is really crucial.

Let’s say your team is building a new web application, and you also need to make suitable plans for hosting and infrastructure. Obviously, you’ll want to optimize the setup to make efficient use of resources (money, hardware) but without risking bad performance or a system crash under load.

Process Inputs

Here are some things we always want to define going in.

Maximum acceptable response time (baseline) – This represents the longest we can expect our users to wait for a page to load, under ideal circumstances. Ideal circumstances basically means everything is functioning well and the system isn’t under much load.

Maximum acceptable response time (peak load) – This is the longest we can ever expect users to wait for a page to load, even when the system is under heavy load or some bottleneck is saturated. If response times ever take longer than this, it means something is broken or we’re over capacity. Capacity planning fail!
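To make the two response time limits concrete, here is a minimal Python sketch that checks measured page load times against them. The class name, the 1 second and 3 second limits, and the sample timings are all made up for illustration; plug in whatever numbers your own users will tolerate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseTimeTargets:
    """Hypothetical container for the two limits described above."""
    baseline_max_s: float  # longest acceptable wait when the system is idle
    peak_max_s: float      # longest acceptable wait under heavy load

    def check(self, baseline_samples, peak_samples):
        # Compare the slowest observed page load in each condition
        # against the matching limit.
        return {
            "baseline_ok": max(baseline_samples) <= self.baseline_max_s,
            "peak_ok": max(peak_samples) <= self.peak_max_s,
        }


# Illustrative numbers only.
targets = ResponseTimeTargets(baseline_max_s=1.0, peak_max_s=3.0)
result = targets.check(
    baseline_samples=[0.4, 0.6, 0.9],
    peak_samples=[1.2, 2.8, 3.4],
)
print(result)
```

With these sample timings the baseline check passes and the peak check fails, because one page took 3.4 seconds under load.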

Max throughput at peak load – Just as important as the previous two, this refers to how many users/pages/hits/transactions the system can handle in a given time window, at peak load. In my experience, this is one of the hardest things to know ahead of time, particularly with a new application. Your mileage may vary as you attempt to work backwards to get at the answer. In big companies where “the business” is a separate group or entity, I often have to ask pointed questions and use deductive reasoning. How many packages a day are being shipped now? What time of day is the busiest? How many packages an hour do you think get shipped at the busiest part of the day? Does it get even busier around Christmas? You get the idea. If you end up estimating, aim a little high to be on the safe side.
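Working backwards from business numbers is mostly arithmetic. Here is a rough Python sketch of it, using the package shipping questions as the example. The function name, its parameters, and every figure below are hypothetical; the real answers have to come from whoever knows the business.

```python
def peak_throughput_per_second(daily_volume, peak_hour_share,
                               seasonal_multiplier=1.0, safety_margin=0.0):
    """Estimate peak transactions per second from business-level figures.

    daily_volume        -- transactions on a typical day
    peak_hour_share     -- fraction of the day's volume in the busiest hour
    seasonal_multiplier -- how much busier the busiest season is (1.0 = same)
    safety_margin       -- extra padding, since we want to aim a little high
    """
    peak_hour_volume = daily_volume * peak_hour_share * seasonal_multiplier
    padded_volume = peak_hour_volume * (1.0 + safety_margin)
    return padded_volume / 3600.0  # seconds in an hour


# Made-up answers to the questions above: 120,000 packages a day,
# 15% of them in the busiest hour, twice as busy around Christmas,
# plus 20% padding to stay on the safe side.
estimate = peak_throughput_per_second(
    daily_volume=120_000,
    peak_hour_share=0.15,
    seasonal_multiplier=2.0,
    safety_margin=0.2,
)
print(f"{estimate:.1f} transactions/second")
```

With these invented inputs the estimate works out to 12 transactions per second. The number itself matters less than knowing which assumptions it depends on.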

Expected user behavior – This is the vaguest of the four, because it’s not a number. It represents the actual steps our users take through the application. Most applications support more than one “behavior”. For example, if you’re running an e-commerce site, you could determine that 60% of users are browsing the goods, 20% are putting stuff in their cart and then leaving, and another 20% are putting stuff in their cart and buying it. I usually don’t worry about the rare, long-tail user behaviors unless I have some reason to believe they will have a large performance impact.
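Even though behavior isn’t a single number, the mix of behaviors can be written down as weights. The sketch below assigns simulated users to behaviors using the 60/20/20 split from the e-commerce example. The behavior names and the function are hypothetical, and this is plain Python rather than the API of any particular load testing tool.

```python
import random
from collections import Counter

# The e-commerce mix from the example above, as weights that sum to 1.
BEHAVIOR_MIX = {
    "browse": 0.60,
    "abandon_cart": 0.20,
    "purchase": 0.20,
}


def assign_behaviors(virtual_users, mix, seed=None):
    """Pick one behavior per simulated user, weighted by the mix."""
    rng = random.Random(seed)  # seeded so a test run can be repeated
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=virtual_users)


assignments = assign_behaviors(1000, BEHAVIOR_MIX, seed=42)
print(Counter(assignments))
```

The counts won’t land exactly on 600/200/200 because the assignment is random, which is arguably closer to how real traffic behaves anyway.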

Process Outputs

Since we’re making a decision about our application’s hosting, configuration and infrastructure, here are the outputs that are crucial to that decision. If your application is really special there might be others also, but the following are the main ones that any capacity planning effort should address.

Server & network resources – This actually includes a lot of things! But I generalize it to mean what size and type of platform your application will run on. How much RAM, how many CPUs? What kind of disks? How much network bandwidth? You get the idea.

Software tuning & configuration – What tuning parameters do you use for your application (and its friends, the database and web/application server) to guarantee suitable performance, scalability and stability?

Wiggle room – If you’re smart and even the least bit optimistic, you aren’t just planning to meet current capacity. You have to build in some wiggle room to handle traffic spikes, future growth, etc. A reasonable amount of wiggle room might be 30-70% above today’s maximum scenario.
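As a quick illustration of the wiggle room arithmetic, here is a small Python sketch. The function names and the peak of 10 transactions per second are assumptions made up for the example.

```python
def capacity_target(current_peak, wiggle_room):
    """Capacity to provision: today's peak plus a fractional margin."""
    return current_peak * (1.0 + wiggle_room)


def remaining_wiggle_room(provisioned_capacity, observed_peak):
    """Fraction of headroom left above the observed peak."""
    return provisioned_capacity / observed_peak - 1.0


# Hypothetical peak of 10 transactions/second today.
current_peak = 10.0
low_target = capacity_target(current_peak, 0.30)   # 30% wiggle room
high_target = capacity_target(current_peak, 0.70)  # 70% wiggle room
print(f"Provision for {low_target:.0f} to {high_target:.0f} transactions/second")

# Later on, if the observed peak creeps up to 12, the low target has
# much less room left than it started with.
print(f"{remaining_wiggle_room(low_target, 12.0):.0%} wiggle room remaining")
```

So a peak of 10 transactions per second today means planning for somewhere between 13 and 17, depending on how much margin you want.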

Plan to grow – Once our wiggle room starts to run out, what’s the plan to grow bigger? It’s really helpful to know ahead of time what our next steps will be. Will we scale vertically by buying a bigger server and tweaking some configs? Or do we have an architecture that allows us to scale horizontally instead? There are times when either approach makes sense, and the determination of which fits us better is a key output of capacity planning.
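One way to know when the growth plan will be needed is to project how long the wiggle room will last. The sketch below assumes steady compound growth month over month, which is a simplification I’m introducing for illustration; real traffic rarely grows that smoothly. The function name and all numbers are hypothetical.

```python
import math


def months_until_exhausted(current_peak, provisioned_capacity, monthly_growth):
    """Months until peak load reaches provisioned capacity.

    Assumes peak load grows by a constant fraction each month
    (e.g. 0.05 for 5% growth per month).
    """
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    if current_peak >= provisioned_capacity:
        return 0.0  # already out of wiggle room
    # Solve current_peak * (1 + g) ** months == provisioned_capacity.
    return (math.log(provisioned_capacity / current_peak)
            / math.log(1.0 + monthly_growth))


# Hypothetical: peak of 10 transactions/second today, capacity for 15,
# traffic growing 5% a month.
months = months_until_exhausted(10.0, 15.0, 0.05)
print(f"Wiggle room runs out in about {months:.1f} months")
```

With these made-up numbers the wiggle room lasts a little over eight months, which is roughly how long we’d have to get the vertical or horizontal scaling plan ready.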

Practical Application

Well, those are my super-simplified thoughts on capacity planning. Being a simple man, even these points give me plenty to think about when deploying an application and making sure it doesn’t flop or break the budget. Focusing on the inputs and outputs helps me isolate the many variables in between.

A couple of readers have been nudging me to dispense with the theory and dive deeper into hands-on application of this stuff, and I’m eager to do the same. My next few posts will have some step-by-step walkthroughs of those parts of capacity planning that Loadster helps with: scripting, load testing, and optimization. Even if you have another tool instead of Loadster, many of the techniques are the same, so I hope you’ll follow along!