On Monday, T-Mobile’s voice and text messaging services went down for the evening, in an outage that stretched for more than twelve hours. Now its President of Technology, Neville Ray, has explained what happened and what the company says it’s doing to keep it from happening again.
I want to be fully transparent about what happened yesterday with our network. We did not meet our own bar for excellence. We have taken the necessary steps to avoid reoccurrence and truly apologize for any inconvenience we created. https://t.co/sDXZemXRsK
— Neville (@NevilleRay) June 17, 2020
Contrary to reports from some Twitter accounts and trending hashtags, the company didn’t cite a DDoS attack or any other nefarious behavior as the cause of the problem. Instead, Ray said a fiber circuit owned by another provider somewhere in the southeastern US failed, and the redundancy T-Mobile had built to manage that kind of failure instead created a traffic storm of its own, one that overwhelmed the capacity of the part of its network that handles Voice-over-LTE (VoLTE) calls.
As Cloudflare CEO Matthew Prince pointed out that day, internet exchanges didn’t show the increase in traffic that would have suggested an attack was under way, which pointed to a more “boring” explanation. Add in DownDetector maps highlighting the highly populated areas where T-Mobile customers were reporting the outage, along with customers of other carriers who couldn’t get through to people on T-Mobile, and you get the storm of misinformation and confusion that surrounded the outage.
The remaining questions are whether T-Mobile will do anything for the customers left without service for such a long period, and whether this explanation will satisfy the FCC’s investigation.