Well, I slacked off a bit on the blogging, but I’m hopefully back, re-energised with technology after a great week at Cisco Live!
One thing that has been bothering me, though, isn’t really about Cisco Live itself but came up during a talk at the event: availability metrics.
Everyone in the IT industry has heard of them, but very few of our clients have actually worked out what they mean, so I’d like to bring some focus to them today.
First, let’s describe how you come up with availability as a figure (generally represented as the 9s):
Availability = Uptime / (Uptime + Downtime)
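If it helps to see that formula as something you can run, here is a minimal Python sketch. The function name `availability` and the sample figures (a 30-day month with 7.2 hours of downtime) are my own illustrative assumptions, not taken from any provider's report.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return availability as a fraction between 0 and 1."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


# Hypothetical example: a 30-day month (720 hours) with 7.2 hours of downtime.
month_availability = availability(720 - 7.2, 7.2)
print(f"{month_availability:.2%}")
```

Multiply the result by 100 and you have the familiar "number of 9s" figure.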
Let’s start with my favourite myth: 99% availability. It sounds good, right? If you told me my internet service was 99% available, I’d likely be rapt and not question it. In real terms, though, 99% availability allows an outage of 1.68 hours in any given week, 7.2 hours in any given (30-day) month, and a total of about 3.65 DAYS in any given year. Now, I don’t know about you, but more than three and a half full days of outage on a corporate system is pretty large, right? Of course it may not happen all at once, and the outages may fall out of hours so they don’t affect your core business, but they COULD, and you’d still be inside what you signed up for.
Even when you scale this up to the more impressive-sounding 99.9% availability, it still allows 8.76 hours of outage per year, which is a full business day. That might be fine for your internet service, particularly if you have a redundant connection, but for a cloud-based ERP service it’s really not satisfactory, given you could lose a full day of productivity while it is offline.
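To check these figures yourself, here is a rough Python sketch that turns an availability percentage into permitted downtime. The helper name `allowed_downtime_hours` and the period lengths (a 30-day month, a 365-day year) are assumptions I have made for illustration; a real SLA will define its own measurement window.

```python
HOURS_PER_WEEK = 7 * 24
HOURS_PER_MONTH = 30 * 24   # assumption: 30-day month
HOURS_PER_YEAR = 365 * 24   # assumption: non-leap year


def allowed_downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Hours of outage permitted in a period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (100 - availability_percent) / 100


for level in (99.0, 99.9, 99.95):
    week = allowed_downtime_hours(level, HOURS_PER_WEEK)
    month = allowed_downtime_hours(level, HOURS_PER_MONTH)
    year = allowed_downtime_hours(level, HOURS_PER_YEAR)
    print(f"{level}%: {week:.2f} h/week, {month:.2f} h/month, "
          f"{year:.2f} h/year ({year / 24:.2f} days)")
```

Running it for whatever figure a provider quotes is a quick way to see what you would actually be agreeing to.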
A number of pretty large services only commit to 99.9% availability. Office365 (less affectionately known as Office364 by some people) has an SLA of 99.9%, although to be fair they’ve actually significantly outperformed this and are closer to 99.95%, which you can check in their trust center: http://trustoffice365.com/. Publishing real availability metrics is also an important part of establishing trust, and Microsoft has done a great job on this aspect of the Office365 product.
So, what am I trying to say? Numbers aren’t everything. Before signing up to a service, it’s important to understand the real impact an outage would have, particularly given the marketing that cloud providers put around their availability metrics. And while it’s fine to have an availability target, ask yourself: are all your providers actually reporting back to you on the availability they achieve?