Do you run your own web site? Do you know if it can handle a traffic spike? Do you know if it performs as well as it could?

There is a huge amount of work involved in getting a web site built and deployed. Once you decide to move forward with an idea for a site you have to determine what exactly you want to build, what features you want it to have, design the look of the site, write the code, write unit tests to make sure everything works correctly and continues to work in the future, figure out how to let people know about the site and so on.

A huge amount of time and effort can go into just building a site and getting it deployed for everyone to see. Now imagine one person has to do all that work and it is easy to see how some aspects might be delayed or outright forgotten about.

If, like me, you have worked for larger companies that have their own web operations staff, or you are just new to running your own web site, this information should give you a better idea of how well your site performs and how you can make it perform better.

Please also be aware that I don't claim to be an expert in this area. If you see a glaring problem with my recommendations, feel free to point it out in the comments.

Determine Site Bandwidth

The first step is to find out what your bandwidth limits are. I host this site on a Linode VPS, which provides 200 GB of bandwidth per month. Some shared hosting services provide unlimited bandwidth, but if you regularly get enough traffic to exceed 200-300 GB of bandwidth a month, you will probably be causing problems for the other customers on the shared host and you might get asked to move anyway.

Most of the calculations I will be doing will be in kilobytes to keep things simple. One gigabyte is equal to 1,048,576 kilobytes, so the total number of kilobytes I can send without incurring bandwidth charges is 200 * 1,048,576 = 209,715,200 kilobytes/month.

There are numerous free bandwidth calculators around, but they all assume you know how many page views you expect in a given month. This article will look at bandwidth from a different perspective: given my bandwidth limit, how many page views can I serve per month?

Now you need to find the average size of your site's pages. You can't just look at the size of the HTML file to estimate the size. A full page load is going to include CSS, Javascript, images, and any other resources your page uses. This is where Pingdom Tools comes in handy. You type in your site address and Pingdom will test the load time of that page. It reports a performance score, the total number of requests made by the page, the load time, and most importantly, the page size. Another free site, GTmetrix, provides the same service as Pingdom but shows two different performance scores. Both sites give in-depth information about the performance score and how to improve it. I will mention some of that advice, but most of it is beyond the scope of this article.

To get an average page size for your site, choose one of the tools above and test a good sample of your pages. Test smaller pages as well as larger pages to get a good average. I ended up with a 186.3 KB average after testing my pages.

Now it's just simple math to figure out how many page views I can handle before I reach my bandwidth limit. If I divide the total kilobytes/month by my average page size, I will have an approximate number of page views per month that will stay within my bandwidth limit, so 209,715,200 / 186.3 = 1,125,685 page views per month.

That number is approximate, of course. My home page will fluctuate in size depending on how large or small the current blog posts are. It gives me a ballpark figure to work with, however.

My site isn't getting anywhere near that amount of traffic right now, but what happens if I get a spike from Hacker News? That is the most likely site that I would be getting a traffic spike from, so it seemed like the best place to start. After some searching and reading I found that a Hacker News spike can range anywhere from 7,000 to 60,000 page views in a day, depending on the popularity of the article.

If I had that many page views a day, would I go over my bandwidth limit? 1,125,685 / 60,000 = 18.76. I could sustain heavy traffic for almost 19 days before exceeding my bandwidth. That isn't bad. It also isn't very likely that I'll ever get that many page views for that long. How many page views could I handle a day without going over my limit? 1,125,685 / 31 = 36,312. I could handle roughly thirty-six thousand page views a day for a month.

Now we have some simple equations to get an idea of how much bandwidth our web pages use and how many page views we can generate per month:

  • limit-in-gb x 1,048,576 = limit-in-kb
  • limit-in-kb / average-page-size = approx-page-views-per-month
  • approx-page-views-per-month / high-page-views-per-day = days-until-bandwidth-is-exceeded
  • approx-page-views-per-month / 31 = page-views-per-day-within-bandwidth
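
If you would rather not do the arithmetic by hand, here is a minimal Python sketch of the calculations from this section. The function names are my own invention for illustration, and the numbers plugged in at the bottom are the ones from my site (200 GB limit, 186.3 KB average page, a 60,000 page view day). Note that the last two calculations start from the monthly page view figure, as in the worked numbers above.

```python
KB_PER_GB = 1_048_576  # 1 GB = 1,048,576 KB, the conversion used in this article


def limit_in_kb(limit_in_gb):
    """Convert a monthly bandwidth limit from gigabytes to kilobytes."""
    return limit_in_gb * KB_PER_GB


def page_views_per_month(limit_kb, average_page_size_kb):
    """Approximate page views that fit inside the monthly limit."""
    return limit_kb / average_page_size_kb


def days_until_exceeded(monthly_page_views, high_page_views_per_day):
    """Days of sustained spike traffic before the limit is used up."""
    return monthly_page_views / high_page_views_per_day


def page_views_per_day(monthly_page_views, days_in_month=31):
    """Daily page views that stay within the limit for a whole month."""
    return monthly_page_views / days_in_month


kb = limit_in_kb(200)
monthly = page_views_per_month(kb, 186.3)

print(f"{kb:,} KB per month")                              # 209,715,200 KB per month
print(f"{monthly:,.0f} page views per month")              # 1,125,685 page views per month
print(f"{days_until_exceeded(monthly, 60_000):.2f} days")  # 18.76 days
print(f"{page_views_per_day(monthly):,.0f} per day")       # 36,312 per day
```

Swap in your own limit, average page size, and worst-case daily traffic to get figures for your site.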

Optimize Bandwidth Usage

Now that you have a good idea what your web site's page sizes are and how much bandwidth you use, it is time to look at reducing and optimizing your bandwidth usage as much as possible.

Reduce Page Size

The very first thing to do is reduce the page size as much as you can.

  • Minify CSS and Javascript - The browser doesn't care about nicely formatted CSS and Javascript files and the extra whitespace just slowly chews up your bandwidth. There are tools on the Internet to minify your CSS and Javascript.

  • Combine CSS and Javascript Files - The fewer requests the browser has to make to the server the faster the page will be visible in the browser. This doesn't strictly reduce overall page size, but if you minify your CSS and Javascript it makes sense to combine any files you can at that point as well.

  • Optimize Images - If your CSS uses background images that appear on every page, make sure those image files are optimized and made as small as possible.

  • Remove HTML Comments - Comments are generally for the site developers and not the viewers, so removing or at least keeping the size of HTML comments low will shave off some more from your page size. If you are using an HTML template system like Haml or ERB you can use the template language comment which will be removed when the final HTML is generated. That way your developers can comment as much as they like and none of those comments will be sent to the browser.

  • Remove Duplicate Code - Check through your CSS and Javascript to make sure there are no duplicate CSS rules or Javascript functions.

  • Enable Compression - Most web servers support compression now, but it is generally not enabled by default. Check your web server documentation and configuration settings to make sure your pages and assets are being compressed before being sent to the browser.
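
To get a feel for how much compression can save, here is a small Python sketch that gzips a chunk of text and compares the sizes. This only approximates what a web server does when gzip compression is enabled. The sample CSS is made up for illustration and is far more repetitive than a real stylesheet, so your savings will differ.

```python
import gzip


def gzip_sizes(data):
    """Return (original size, gzip-compressed size) in bytes."""
    return len(data), len(gzip.compress(data))


# Made-up, highly repetitive sample; substitute the bytes of a real file.
sample_css = (
    b"body { margin: 0; padding: 0; }\n"
    b".post { margin: 0 auto; width: 600px; }\n"
) * 200

original, compressed = gzip_sizes(sample_css)
saved = 100 * (1 - compressed / original)
print(f"{original} bytes -> {compressed} bytes ({saved:.0f}% smaller)")
```

To try it on one of your own files, read the file with open(path, "rb").read() and pass the result to gzip_sizes.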

Enable Caching

There are three types of HTTP caches: browser caches, proxy caches and gateway caches. In the context of this article we are interested in browser caching. Browsers set aside space on the user's system to store assets that have been downloaded from a web site, but a browser will only store assets in its cache that have some kind of HTTP cache header (Expires:, Cache-Control:, Last-Modified:, ETag:).

Many web servers will add the Last-Modified: HTTP header automatically for static files such as images, CSS and Javascript. The Last-Modified: header is not enough information on its own however. If a static asset only has the Last-Modified: header the browser will cache it, but it will have to connect to the original web site to verify that the asset has not changed.

To avoid the validity check add the Expires: header. If an asset has an Expires: header the browser will save the expiration time along with the asset in the cache. If that asset is requested again and it has not yet expired the browser skips the validity check saving a round trip to the web server.
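
As a rough illustration, here is a Python sketch that builds an Expires: header value in the date format HTTP uses, plus a simplified version of the freshness check a browser makes. The function names are hypothetical, and a real browser cache takes more into account (Cache-Control: for example), so treat this as a model of the idea and not of any particular browser.

```python
import time
from email.utils import formatdate, parsedate_to_datetime


def expires_header(seconds_from_now, now=None):
    """Build an ("Expires", value) pair with an HTTP-formatted date."""
    if now is None:
        now = time.time()
    return "Expires", formatdate(now + seconds_from_now, usegmt=True)


def is_fresh(expires_value, now=None):
    """Simplified check: is the cached asset still before its expiry?"""
    if now is None:
        now = time.time()
    return now < parsedate_to_datetime(expires_value).timestamp()


# Using a fixed "now" of 0 (the Unix epoch) keeps the output predictable.
name, value = expires_header(3600, now=0)
print(f"{name}: {value}")        # Expires: Thu, 01 Jan 1970 01:00:00 GMT
print(is_fresh(value, now=0))     # True, so no round trip is needed
print(is_fresh(value, now=7200))  # False, so the browser must revalidate
```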

One trick many sites use is to add an Expires: header to static content such as images, CSS and Javascript with an expiration date far in the future, several months or even years away. This will make the browser keep the asset and not revalidate it for a long time. All web sites grow and change over time, and you will probably change some of those assets eventually. How do you get the browser to fetch updated assets if the cached copies haven't expired yet? Use fingerprinting. You can add a version number to the file name and update your site to use the new file name. Because the browser has never seen the new name, it ignores the cached copy of the old file and downloads the updated version.
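
Here is a minimal Python sketch of one way to fingerprint a file name. Instead of a hand-maintained version number, it uses a short hash of the file's contents, which is a common variation on the same idea: the name changes only when the contents change. The function name, the file path and the eight character hash length are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename, content, length=8):
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}-{digest}{path.suffix}"))


old_name = fingerprinted_name("css/site.css", b"body { margin: 0; }")
new_name = fingerprinted_name("css/site.css", b"body { margin: 1em; }")

# Same path and extension, but a different name once the contents change.
print(old_name)
print(new_name)
print(old_name != new_name)  # True
```

Your pages then need to reference the fingerprinted name, which is usually handled by a build step or by your framework's asset helpers.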

How you actually add the cache headers to your web pages and static assets is going to be specific to your site. You can do it directly in the web server configuration or in the language or framework you are using.

HTTP caching is a large subject and I can't cover it all here. If you are interested in the details I have found several good articles:

That's it for this article. I hope you have a better idea of how your bandwidth is being used and some ideas on how you can improve your pages to be more efficient.