Microsoft and AWS hit by Christmas cloud outages
Several cloud service providers were forced offline over Christmas, much to the dismay of users
While many of us have been enjoying some much-needed downtime over the Christmas period, several cloud providers were forced to act quickly after their services went offline, leaving users, one assumes, a little less full of festive cheer.
Here, we take a look at what happened and how it was sorted.
Christmas Eve and Christmas Day should have been the busiest period of the year for online film service Netflix. However, when developers at Amazon Web Services (AWS) accidentally deleted data critical to Netflix’s running, customers in the Americas were left searching for other forms of entertainment over the festive period.
The disruption started at lunchtime on Christmas Eve, when a portion of AWS’ Elastic Load Balancing Service (ELB) state data was deleted. While most of the issue was resolved by 8.15am on Christmas Day, it took close to 24 hours for the service to fully return to normal.
Millions of people will have received a new computer game for Christmas. Sadly for Xbox 360 users, though, the thrill of playing the latest release was put on hold, after Microsoft’s Cloud Save feature broke down on 28 December.
The outage continued for the whole weekend, with users unable to access saved games held in the cloud until 31 December. Streaming services such as Netflix and HBO Go were also affected for some users.
By way of apology, Xbox Live has said it will be automatically applying a one-month extension to the Gold membership of all those who were affected.
Alex Garden, Xbox Live general manager, said in a blog post: “It took longer than we expected to get back to full performance as we needed to ensure the integrity of everyone’s game saves.
“Whether you couldn’t access your game saves for a couple of hours or a couple of days, we sincerely apologise for the delay and inconvenience.”
Garden also said his team would be doing a “thorough post mortem” to avoid a recurrence of the issue.
It was not just gamers who were affected by Microsoft’s technical hitch; the company’s Azure service was also disrupted between 28 and 30 December.
Microsoft initially reported that only users of its storage service in the South Central US region were affected. However, it quickly became apparent the outage was also affecting its global Management Portal.
The problem, which was blamed on ‘faulty nodes’, took over 36 hours to resolve in full, with Microsoft issuing an apology “for the interruption and issues it has caused our customers”.