Skip to main content

The "Grand National" Effect - dealing with traffic spikes

The Grand National is one of the largest events in the UK horse racing calendar, and that means hundreds of million of pounds at stake for the online bookmakers and gambling sites.

It also means a major headache for the web operations teams as they deal with the inevitable traffic spikes hammering their sites, at a time when availability and performance direct equate to money won or lost.

Site Confidence monitored the home pages of ten of the top ranking online gambling sites on the Grand National Saturday, and some fascinating patterns emerged in the lead up to the race (16:15 BST).

Firstly, lets look at the Contenders…

image 

You can immediately see a range of availability and performance, with some of the traditional “bricks and mortar” bookmakers considerably slower than the pureplay dotcom’s like “DotCom #1” and Bet365.

Option 1 – Do nothing… and #fail

image

The first option is to “do nothing and hope for the best” approach, as seen in the graph above for William Hill, a traditional “High Street” betting shop.

No changes to the page size (light grey) and highly erratic page download speed giving a poor user experience to their customers… and presumably significant revenue leakage.

Option 2 – Peak Capacity

image

Bet365 appear to have weathered the storm without a blip – rock solid response time across the critical period (albeit with a “light” home page anyway).

“DotCom #1” also managed to keep performance fast (average around 2.5s with some ~5 sec spikes) whilst still serving up a “full sized page”.

Presumably both of these “pure play” dotcom sites have a capacity planning model that has sized their infrastructure and application performance at a sufficiently high level to be able to handle the “peaks”, albeit potentially at the price of carrying excess capacity during “average” periods.

image

Option 3 - Lighten up to speed up

image

A number of sites took a middle line – reduce the page weight and serve a “stripped down” version of the site that could be served within a lower overall capacity. “High Street #3” (above) stripped their site from ~300Kb down to a super lean 70Kb page.

Blue Square took the same approach, cutting from ~500Kb to ~300Kb… but still suffering some performance hiccups.

image

Why good deployment processes count…

One other thing to note… whilst Option 3 is a good balance between “losing money when your site fails” and “losing money by carrying capacity you don’t need it does highlight the need for good deployment processes… because otherwise you can end up with an inconsistent site (page size) across your web farm.

In the two graphs below you see a pattern of differing page sizes (presumably as the monitoring agents hit different servers in the server farm)… normally a sure sign of a site that has not been deployed consistently across the web farm!

Interesting the first (“High Street #1”) suffered the problem when they rolled out the “light” version, and the second (“High Street #2”) when they tried to roll back to the “normal” version.

Good reasons to review both the deployment plan AND the rollback plan…

image

image

Comments

Popular posts from this blog

So what else does Operations do? Well, there is a whole organisation run by the UK govermnent to help answer that question! ITIL , or the IT Infrastructure Library, is a library of best practice information that basically tells you everything you need to do to run an IT department. Similarly developers have development methodologies such as RAD, JAD, Agile/XP, and Project Managers have PM methodologies such as Prince 2, PMBok etc to cover off their areas in more specific detail. ITIL breaks it down into 7 key areas: Service Support - deals with the actual provision of IT services such as the service (help) desk, incident management, problem management, release management etc Service Delivery - deals with ensuring that you can continue to DELIVER the service support functions with things like contigency planning, capacity management, service levels etc The Business Perspective - helps to ensure that the IT function is aligned with the organisation's business strategy and that how to...

Top 13 Website Crashes of 2010?

I was doing a bit of research for an article and I started compiling a list of high-profile website crashes in 2010. Pingdom have published a list here - http://www.readwriteweb.com/archives/major_internet_incidents_and_outages_of_2010.php as have Alertsite here - http://www.huffingtonpost.com/2010/12/29/the-biggest-web-outages-o_n_801943.html But I decided to compile my own list from a more UK-centric perspective and came up with my “baker’s dozen” below. # Site Date News Link 1 National Rail Jan-10 http://www.theregister.co.uk/2010/01/05/rail_chaos/ 2 Outnet Apr-10 http://www.guardian.co.uk/lifeandstyle/blog/2010/apr/16/outnet-sale-website-crash 3 Apple (iPhone 4 Launch) Jun-10 http://www.dailymail.co.uk/sciencetech/article-1286756/Apple-iPhone-4-pre-order-Website-crashes-new-iPhone-goes-sale.html 4 ITV.com (World C...

Using Gmail aliases to create multiple test email accounts for QA

Came across this today when someone wanted to know how to create multiple email test accounts without involving their IT department (don’t ask!) or managing multiple free email accounts. Gmail allows you to create aliases for your email address automatically. For example, if your Gmail account is joe.bloggs@gmail.com then joe.bloggs+test.case01@gmail.com will work for your account – anything after the “+” sign can be used to create an alias.  These emails will be delivered to your normal Gmail inbox. So when you are testing you can use +test.case01, +test.case02, +test.case03 and so on as your test email addresses (assuming that your application doesn’t get upset at the use of the “+” in an email address. It shouldn’t, its a valid character in the RFC http://tools.ietf.org/html/rfc3696#page-5 ) So lets say you want to filter these test emails and label so they don’t get lost in your Inbox. Easy, just use a Gmail filter and search for a whatever common “stem” you used ...