
The "Grand National" Effect - dealing with traffic spikes

The Grand National is one of the largest events in the UK horse racing calendar, and that means hundreds of millions of pounds at stake for the online bookmakers and gambling sites.

It also means a major headache for the web operations teams as they deal with the inevitable traffic spikes hammering their sites, at a time when availability and performance directly equate to money won or lost.

Site Confidence monitored the home pages of ten of the top ranking online gambling sites on the Grand National Saturday, and some fascinating patterns emerged in the lead up to the race (16:15 BST).

Firstly, let's look at the Contenders…

image 

You can immediately see a range of availability and performance, with some of the traditional “bricks and mortar” bookmakers considerably slower than the pureplay dotcoms like “DotCom #1” and Bet365.
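For anyone wanting to reproduce this kind of comparison from their own monitoring data, here is a minimal sketch of how the two headline numbers might be derived. It assumes each check is recorded as a (succeeded, download seconds) pair; the `summarise` function and the sample values are hypothetical, not Site Confidence's actual method or data.

```python
from statistics import mean

def summarise(checks):
    """Return (availability, average download seconds) for a list of checks.

    Each check is a (succeeded, download_seconds) tuple. Failed checks count
    against availability but are excluded from the average download time.
    """
    ok_times = [seconds for succeeded, seconds in checks if succeeded]
    availability = len(ok_times) / len(checks)
    average_seconds = mean(ok_times) if ok_times else None
    return availability, average_seconds

# Hypothetical sample: four checks, one of which failed.
checks = [(True, 2.0), (True, 3.0), (False, 0.0), (True, 4.0)]
availability, average_seconds = summarise(checks)
print(f"availability={availability:.0%}, average={average_seconds:.1f}s")
```

Excluding failed checks from the average is a design choice: it stops timeouts from distorting the speed figure, but it also means a site can look "fast" while being frequently down, so the two numbers need reading together.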

Option 1 – Do nothing… and #fail

image

The first option is the “do nothing and hope for the best” approach, as seen in the graph above for William Hill, a traditional “High Street” betting shop.

There were no changes to the page size (light grey), and page download speed was highly erratic, giving their customers a poor user experience… and presumably significant revenue leakage.

Option 2 – Peak Capacity

image

Bet365 appear to have weathered the storm without a blip – rock solid response time across the critical period (albeit with a “light” home page anyway).

“DotCom #1” also managed to keep performance fast (average around 2.5s with some ~5 sec spikes) whilst still serving up a “full sized page”.

Presumably both of these “pure play” dotcom sites have a capacity planning model that has sized their infrastructure and application performance at a sufficiently high level to be able to handle the “peaks”, albeit potentially at the price of carrying excess capacity during “average” periods.
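To make that trade-off concrete, here is a rough sketch of the arithmetic behind sizing for the peak. All the figures (peak and average requests per second, per-server throughput, target utilisation) are hypothetical assumptions for illustration, not measurements from any of the sites above.

```python
import math

def servers_needed(peak_rps, per_server_rps, target_utilisation=0.75):
    """Servers required to carry peak_rps without exceeding target_utilisation."""
    usable_rps = per_server_rps * target_utilisation
    return math.ceil(peak_rps / usable_rps)

def idle_fraction(average_rps, servers, per_server_rps):
    """Share of provisioned capacity left unused at average load."""
    return 1.0 - average_rps / (servers * per_server_rps)

# Hypothetical figures: a race-day peak ten times the average load.
peak_rps = 5000.0
average_rps = 500.0
per_server_rps = 250.0

n = servers_needed(peak_rps, per_server_rps)
idle = idle_fraction(average_rps, n, per_server_rps)
print(f"{n} servers for the peak, {idle:.0%} of capacity idle on an average day")
```

With a peak-to-average ratio like this, most of the farm sits idle most of the year, which is exactly the "price" of Option 2.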

image

Option 3 - Lighten up to speed up

image

A number of sites took a middle line: reducing the page weight and serving a “stripped down” version of the site that could be delivered within a lower overall capacity. “High Street #3” (above) stripped their site from ~300KB down to a super lean 70KB page.

Blue Square took the same approach, cutting from ~500KB to ~300KB… but still suffered some performance hiccups.
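As a quick back-of-the-envelope check on what those diets buy you, the sketch below works out the weight reduction and how many more page views fit into the same bandwidth. It uses the approximate page sizes quoted above and assumes, simplistically, that bandwidth is the only bottleneck (in practice CPU, database, and connection limits matter too).

```python
def weight_reduction(before_kb, after_kb):
    """Fraction of page weight removed."""
    return (before_kb - after_kb) / before_kb

def throughput_multiplier(before_kb, after_kb):
    """How many times more page views fit in the same bandwidth,
    assuming bandwidth is the only constraint (a simplification)."""
    return before_kb / after_kb

# Approximate page sizes in KB, as quoted in the post.
sites = [("High Street #3", 300, 70), ("Blue Square", 500, 300)]

for name, before_kb, after_kb in sites:
    cut = weight_reduction(before_kb, after_kb)
    gain = throughput_multiplier(before_kb, after_kb)
    print(f"{name}: {cut:.0%} lighter, ~{gain:.1f}x page views per unit of bandwidth")
```

On these numbers the smaller cut gives far less extra room, which would be consistent with the hiccups seen, although the graphs alone cannot confirm the cause.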

image

Why good deployment processes count…

One other thing to note… whilst Option 3 is a good balance between “losing money when your site fails” and “losing money by carrying capacity you don’t need”, it does highlight the need for good deployment processes… because otherwise you can end up with an inconsistent site (page size) across your web farm.

In the two graphs below you see a pattern of differing page sizes (presumably as the monitoring agents hit different servers in the server farm)… normally a sure sign of a site that has not been deployed consistently across the web farm!

Interestingly, the first (“High Street #1”) suffered the problem when they rolled out the “light” version, and the second (“High Street #2”) when they tried to roll back to the “normal” version.

Good reasons to review both the deployment plan AND the rollback plan…
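If you have the raw page sizes from your own monitoring, the "flip-flopping" pattern described above is easy to test for. This is a minimal sketch, assuming a time-ordered list of page sizes in KB from consecutive checks; the function names, tolerance, and sample series are hypothetical. A clean rollout should show a single change in size, whereas a half-deployed farm bounces back and forth as checks land on different servers.

```python
def count_flips(sizes_kb, tolerance_kb=10.0):
    """Count consecutive checks whose page size differs by more than tolerance_kb."""
    return sum(
        1
        for previous, current in zip(sizes_kb, sizes_kb[1:])
        if abs(current - previous) > tolerance_kb
    )

def looks_inconsistent(sizes_kb, tolerance_kb=10.0):
    """Flag a series with more than one size change, i.e. more than a single clean switch."""
    return count_flips(sizes_kb, tolerance_kb) > 1

# Hypothetical series: one clean switch from a 300KB page to a 70KB page...
clean_rollout = [300, 300, 300, 70, 70, 70]
# ...versus checks alternating between servers carrying different versions.
mixed_farm = [300, 70, 300, 300, 70, 70, 300, 70]

print(count_flips(clean_rollout), looks_inconsistent(clean_rollout))
print(count_flips(mixed_farm), looks_inconsistent(mixed_farm))
```

The tolerance is there because dynamic content (odds, banners) makes page size wobble slightly between checks; it would need tuning per site.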

image

image
