Facebook said on March 14 that it had fixed the issues that left users without service and advertisers without a platform to reach them for about 20 hours. The company blamed a server configuration change for the outage, its longest ever.
Lee McKnight, an associate professor at Syracuse University School of Information Studies, says that while the general public will never know the exact cause of the massive Facebook outage, which also affected Instagram, Messenger and WhatsApp, he believes the “server issue” was a code error that ran across large portions of Facebook’s network.
“Whether it is fair to blame on ‘human error’ or a failure of an automated quality assurance test program, maybe we will never know,” McKnight wrote in an email to MediaPost.
The nearly day-long outage that stifled Facebook’s advertising network also raises questions about whether the company has a public obligation to report specific reasons for the outage to regulators, similar to the requirements imposed on telecommunications companies and other service providers.
Facebook, however, is not a utility company. The long-running debate over whether the government should regulate the internet, and whether providers should be treated as utilities, doesn't apply to it. Nor does that debate take into account the companies that use the internet to operate.
Today, there are no regulatory obligations for Facebook to explain itself.
“We may or may not ever know the precise cause of the epic 22-hour outage,” McKnight wrote.
“The fact that a network of billions of users could seize up because of a typo in a line of code, other random human or machine error, or a malicious actor intending to disrupt Facebook's servers should not surprise us,” he wrote.
Facebook could have been trying to upgrade the software on some of its servers, which would involve taking them offline, installing the software code patch or version update, and then restarting the servers.
McKnight says it’s not uncommon for human error or undetected bad code in the program that runs Facebook to creep in during the reboot process, which is why organizations schedule updates for off-hours.
For a global company like Facebook, there is never a “good” time. He adds that operational servers can trigger a cascading set of errors and buffer overloads across the network, making matters worse.