General Discussion
Facebook rendered spineless by buggy audit code that missed catastrophic network config error
In a write-up by infrastructure veep Santosh Janardhan, titled "More details about the October 4 outage," the outrage-monetization giant confirmed early analyses that Facebook yesterday withdrew the border gateway protocol (BGP) routing to its own DNS servers, causing its domain names to fail to resolve.
That led to its websites disappearing, apps stopping, and internal tools and services needed by staff to remedy the situation breaking down as well.
But this DNS and BGP borkage turns out to have been the consequence of other errors. Janardhan explained that it operates two classes of data center.
One type was described as "massive buildings that house millions of machines," performing core computation and storage tasks. The other bit barns are "smaller facilities that connect our backbone network to the broader internet and the people using our platforms."
Users of Facebook's services first touch one of those smaller facilities, which then send traffic over Facebook's backbone to a larger data center. Like any complex system, that backbone is not set-and-forget; it requires maintenance. Facebook stuffed that up.
"During one of these routine maintenance jobs, a command was issued with the intention to assess the availability of global backbone capacity, which unintentionally took down all the connections in our backbone network," Janardhan revealed.
https://www.theregister.com/2021/10/06/facebook_outage_explained_in_detail/
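The failure mode described in the article, a maintenance command that takes down every backbone link because the audit step did not stop it, can be illustrated with a small pre-flight check. This is only a sketch under assumptions: the function name `audit_command`, the link names, and the `min_links_up` threshold are hypothetical and are not taken from Facebook's write-up, which does not describe how its audit tool works.

```python
def audit_command(links_up, links_to_disable, min_links_up=1):
    """Return True if running the command would leave enough links up.

    links_up:         names of backbone links that are currently up
    links_to_disable: names of links the command would take down
    min_links_up:     hypothetical safety threshold (an assumption)
    """
    # Links that would still be up after the command runs.
    remaining = set(links_up) - set(links_to_disable)
    return len(remaining) >= min_links_up


# Hypothetical link names, for illustration only.
backbone = {"link-a", "link-b", "link-c"}

print(audit_command(backbone, {"link-a"}))   # command leaves two links up
print(audit_command(backbone, backbone))     # command would drop every link
```

The point of a check like this is that it has to fail closed: if the audit cannot prove enough capacity survives, the command should not run.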
lagomorph777
(30,613 posts)
lapfog_1
(29,166 posts)
"It was not possible to access our data centers through our normal means because their networks were down, and the total loss of DNS broke many of the internal tools we'd normally use to investigate and resolve outages like this," Janardhan stated.
Idiots.
I design and build large data centers for the likes of NASA and now a major chip maker.
I ALWAYS design in a "backdoor" that depends on NOTHING being available other than standard TCP/IP routers and switches. No DNS, no BGP. Nothing fancy. And I build automation to restore services just in case something like this happens.
Sigh.
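A rough sketch of the "no DNS" part of that idea: keep a static table of management addresses so tooling can still find its targets when name resolution is gone. Everything named here is an assumption for illustration (the hostname, the address from the 192.0.2.0/24 documentation range, and the function `resolve_with_fallback`); it is not the poster's actual design.

```python
import socket

# Hypothetical static table, maintained out of band (an assumption).
STATIC_HOSTS = {
    "mgmt-gw.example.internal": "192.0.2.10",
}


def resolve_with_fallback(hostname, resolver=socket.gethostbyname,
                          static_hosts=STATIC_HOSTS):
    """Resolve hostname via DNS, falling back to a static table on failure."""
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError, so DNS failures land here.
        if hostname in static_hosts:
            return static_hosts[hostname]
        raise
```

A stricter version would skip DNS entirely on the emergency path and use only the static table, which is closer to "depends on NOTHING" as described above.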
Sapient Donkey
(1,568 posts)
lagomorph777
(30,613 posts)
He gave us a few hours less of the hate machine known as Facebook.
lapfog_1
(29,166 posts)
I have a facebook account, and a twitter account, and a snapchat, myspace, and god knows what else.
some old friends do post things to facebook that I get notifications on.
I NEVER post to social media. nope, not ever... if my old friends post something that I feel motivated to comment on, I send them an email.
DU is the only social media I ever post to or visit to read.
I have the accounts because I don't want anyone impersonating me online.
But I am their worst user. They get nothing from me for their advertisements. Unfortunately, the pandemic has forced me to use Instacart and Amazon to order things... so now many websites are littered with ads for things I already purchased.
And I've been using the "Internet" since 1977 (back when it was Arpanet and the complete "map" showed around 50 or 60 systems) plus I wrote some of the translating code for the BBN IMPs.
lagomorph777
(30,613 posts)
It was very high-tech. I downloaded the code onto a paper tape, which I then read into the "ladder" controller.
lapfog_1
(29,166 posts)
or teletype with the paper tape reel
I used both... started out with IBM 026 punch cards writing Fortran in 1970
lagomorph777
(30,613 posts)
But looking back, I'm glad my school afforded me the "old school" experience.
The paper tape I used was on a spool; I am pretty sure it was the same machine as in your photo (Teletype?)
Klaralven
(7,510 posts)
https://datacenterfrontier.com/facebook-showcases-its-40-million-square-feet-of-global-data-centers/