RIM must learn from service outage

It looks like the BlackBerry service outage is now behind us. The incident couldn’t have come at a worse time for RIM, following harsh criticism in recent months over its financial performance, product delays, and the disappointment of its partners – chief among them the operators.
RIM said the service disruption was due to the failure of a core hardware switch related to its BlackBerry Internet Service (BIS); it would appear that the BlackBerry Enterprise Server (BES) was not affected.
Not only did the switch fail, but the failover system that is supposed to switch over to a backup switch and recover the system did not respond properly. If the problem really was related to a hardware failure, then service disruption should not strike again and the problem should now be contained once and for all. However, if the problem is deeper than this and provoked by unknown software glitches, then a full recovery of the system could take days or even weeks, particularly if the system was exposed to malware or a virus attack.
Another possible cause is a system overload resulting from the increased number of BlackBerry users together with the implementation of new features and services such as media-content sharing, BlackBerry Messenger (BBM) music downloads and online interactive gaming. Not only do these services generate a huge amount of data that crosses the BIS, but they also expose the system to malware attacks. If this is the case, RIM will need to scale up its infrastructure considerably to cope adequately with the traffic crossing BIS, and this could demand significant investment and time to implement.
In the meantime, to alleviate the traffic burden, the company might be forced to switch off some features and services that generate a lot of data. If this is really what is happening, the company could seriously damage its credibility further with its partners – and they would question their reliance on RIM’s roadmap.
The situation could have been much worse if the BES had been affected because this server deals with enterprise services that contain time-sensitive information, including e-mail. Loss of this content would have seriously exposed RIM to legal liabilities and would have pushed enterprise customers to look for alternatives immediately. Having said that, a number of enterprise customers do rely on BIS for e-mail services. In fact, RIM recently launched the BES “Lite” version, which allows enterprises to download the BES for free and enables their employees to access the service via BIS. These employees therefore get the same functionality as BES while connecting through a BIS plan.
Communication with customers
RIM has not communicated very well on this front, which has done it a lot of harm. Google or Apple would have handled a similar problem much more sensibly. For example, when Apple had problems with the antenna system in the iPhone 4, it communicated the problem really well and managed to calm the resulting media furor in just one day. This was mostly thanks to the intervention of Steve Jobs, the CEO of Apple at the time, as he communicated the problem to the industry and promised to compensate those who had been affected or provide a solution, in the shape of the “bumper”, free of charge.
In contrast, RIM has shown hesitant leadership; it communicated the issue very badly, and the joint chief executives, Mike Lazaridis and Jim Balsillie, apologized to customers only under pressure, and only on the third day of the crisis. True, both looked humble and sorry for what had happened. However, they struggled to identify the main causes of the outage and did not say what plans they were putting in place to compensate customers. At this stage, RIM has given the impression it is struggling to contain the problem and, worse still, is probably not even aware of the main reasons behind the service outage.
Impact on consumers
Since BIS is the main service affected, consumers were the most exposed to the BlackBerry service outage. As these customers do not usually deal with sensitive data, it will take more than just a couple of days of bad user experience to persuade them to look for alternatives. Consumers are often exposed to a bad data experience owing to poor cellular coverage or a shortage of mobile network capacity, so they will perceive the current BlackBerry incident as just another failure of the mobile system. Having said that, if the problem repeats itself it could be disastrous for RIM because customers would start abandoning BlackBerry.
Although the BES was not directly affected, some businesses may see this as a good reason to re-evaluate their reliance on centralized servers and instead look to invest in more corporately controlled servers. Not only would this enable IT departments to minimize the risk of unforeseen collapses, but it could also give employees more flexibility to use their own devices.
The cost of compensation
If one considers the fees customers pay for connecting to BlackBerry services, an average of $5 (€3.65) per month, and assumes that all 70 million BlackBerry subscribers should be compensated for the loss of service they have suffered, then the total amount RIM should pay to refund its customers would be about $12 million per day. This amount does not take into account liability fees for loss of data or any related legal issues. This could mean RIM paying out over $100 million.
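The per-day figure can be checked with a rough back-of-the-envelope calculation. The sketch below uses the $5 monthly fee and 70 million subscribers quoted above; the 30-day billing month and the three-day outage length used in the example are assumptions made purely for illustration, and the function names are hypothetical.

```python
# Rough check of the refund estimate; not an official RIM calculation.
MONTHLY_FEE_USD = 5.0      # average monthly service fee quoted in the article
SUBSCRIBERS = 70_000_000   # subscriber base quoted in the article
DAYS_PER_MONTH = 30        # assumption: a 30-day billing month


def daily_refund(monthly_fee: float = MONTHLY_FEE_USD,
                 subscribers: int = SUBSCRIBERS,
                 days_per_month: int = DAYS_PER_MONTH) -> float:
    """Pro-rata refund owed for one day of lost service, in US dollars."""
    return monthly_fee * subscribers / days_per_month


def outage_refund(outage_days: int) -> float:
    """Pro-rata refund for an outage lasting `outage_days` days."""
    return daily_refund() * outage_days


if __name__ == "__main__":
    print(f"Per day: ${daily_refund() / 1e6:.1f} million")
    # Assumption: a three-day outage, chosen only as an example.
    print(f"Three days: ${outage_refund(3) / 1e6:.1f} million")
```

Under these assumptions the pro-rata refund comes to roughly $11.7 million per day, which rounds to the $12 million quoted above. Service-fee refunds alone would need many days of outage to approach $100 million, so a total of that size would have to include the liability and legal costs that the per-day figure leaves out.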
A number of mobile operators in the Middle East and Europe are already taking the hit and are putting plans in place to compensate BlackBerry customers affected by the outage. Surely these operators will ask RIM for a refund for damages incurred during the period of the service outage.
What about the BlackBerry brand?
Although the company makes almost 80% of its revenues from selling devices, its brand relies on the services it offers, including secure and efficient e-mail and messaging services, plus fast and data-optimized Internet browsing services.
These qualities are enabled by its Network Operations Center (NOC), including the BIS and BES. A number of users are hooked on BlackBerry devices because they enjoy using the services rather than being specifically attracted to the BlackBerry phones. If the recent problems are not resolved once and for all, consumers and business customers will consider looking for alternatives.
There are many lessons RIM should take from this experience, but three of the most crucial are:
• Decentralize equipment and services and build more reliable backups.
• Moderate its ambitions and focus on specific market segments and services in line with its core expertise and brand.
• Review the way it communicates with the industry and put contingency plans in place to address potential issues during times of crisis.

Original article: BlackBerry outage: what went wrong and what lessons should RIM learn from this incident?