I was pretty surprised when I saw the lights go off on my Rackspace servers in the DFW data center, in Dallas.
The outage was covered by Laughing Squid, and made it onto a lot of big tech news sites such as TechCrunch, GigaOm, Valleywag, and O’Reilly Radar. 37 Signals and other well-known web companies got wiped off the face of the earth.
It was embarrassing for me, since I had just handed over a new web app to the customer for testing, which relied on a web service running on one of my DFW servers.
I signed up with Rackspace a couple of months ago, and was impressed by the confidence with which they spoke of their 100% guaranteed uptime. “Not 99.99999999999?” I asked. 100%, they assured me.
Down for three hours? That puts my uptime to date at roughly 99.791%. Sorry Rackspace, but my Nintendo Wii has a better uptime than that. If you want to continue touting your “fanatical support”, you will have to do better.
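A back-of-the-envelope sketch of that arithmetic, for anyone who wants to check it. The exact sign-up date isn't given here, so the 60-day service period is an assumption standing in for "a couple of months", and the function names are made up for illustration.

```python
def uptime_percent(total_hours: float, downtime_hours: float) -> float:
    """Share of the service period during which the servers were up, in percent."""
    return 100.0 * (total_hours - downtime_hours) / total_hours


def implied_service_hours(downtime_hours: float, uptime_pct: float) -> float:
    """Work backwards: how long a service period a given uptime figure implies."""
    return downtime_hours / (1.0 - uptime_pct / 100.0)


# Assumption: about 60 days of service, with a single 3-hour outage.
service_hours = 60 * 24
print(f"uptime: {uptime_percent(service_hours, 3):.3f}%")

# Going the other way, 3 hours down at 99.791% implies roughly two months of service.
print(f"implied service period: {implied_service_hours(3, 99.791) / 24:.1f} days")
```

Either way you slice it, a single three-hour outage in two months lands just under 99.8%, and no amount of future uptime can bring the figure back to 100%.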
And as for the apology:
“We cannot promise that hardware won’t break, that software won’t fail or that we will always be perfect. What we can promise is that if something goes wrong we will rise to the occasion, take action, resolve the issue and accept responsibility. If you are a Rackspace customer and don’t think we’ve lived up to this promise at anytime during the outage, please let your Account Manager know.”
You forfeited the right to this excuse when you promised 100% uptime. Why do you think everyone else offers a bunch of 9s? If it hadn’t been a lorry crashing into some transformer, it would have been a giant meteorite. This is God’s way of telling you to listen to your sys admin, and not your marketing guy.
p.s., I will accept a free iPod touch as a gesture of good will.
Update: Got the phone call from Rackspace ~1 hour after writing this. That is fanatical support, since whoever read this post had to find out who I was and get in contact with my account manager in that time. I’ll just clarify that, having been there myself in the past as a sys admin, and having also worked for a broadband provider, I know full well that these things happen. I am aware that the real test is the response when something like this does happen, and it looks like Rackspace did well to get everything sorted quickly. My issue is with the (now mathematically impossible) 100% uptime claim, which no right-minded service provider should make, and which Rackspace no longer have the right to maintain.
p.p.s., the iPod touch was just a joke.