
Azure/Microsoft Office 365 suffered an outage on Tuesday, apparently caused by an overheating issue at a data center. It largely affects Southern states, including Texas, according to Down Detector tracking maps.
The problems began around 7:45 a.m. ET, with complaints to Down Detector peaking around 10 a.m. They affect access to administrative services and to Outlook email for some users.
"Admin still down for us in DFW," writes one user.
"Still down in Jacksonville, FL," writes another.
Reports also came in from Kansas City.
However, some locations reported that service had been restored.
"FINALLY got some Admin portal functionality. Azure portal has been working fine for last couple hours, AFAIK," another user writes.
According to media reports, Microsoft issued this bulletin to users:
"Automated data center procedures to ensure data and hardware integrity went into effect when temperatures hit a specified threshold and critical hardware entered a structured power down process. The impact to the cooling system has been isolated and is in the process of being mitigated. Engineers are continuing to work towards restoration of services. The next update will be provided at 14:00 UTC or as events warrant."
One complainant wrote that Microsoft had issued a bulletin to say "they know there is an issue now and they will keep letting me know once further updates are posted to the service bulletin. The bulletin mentions a data center infrastructure issue, but the heat-map above is interesting because this makes it seem like the issue is more widespread than problems with a single data center would create."
The incident has also drawn criticism from at least one email security vendor.
"Today’s incident at Azure was another clear reminder for the need for organizations to build in their own redundancy rather than rely on a single vendor," says Pete Banham, cyber resilience expert at Mimecast.
He adds that "all organizations, including Microsoft, need to consider what downstream effects there may be from losing a critical service due to technical failure or human error."
Banham concludes: "Should employees around the world using Office 365 be reliant on a single Azure DC in the US? Services will always fail and IT leaders need to ensure they have not outsourced responsibility to a lone cloud service."