Telstra’s latest data outage was one of the worst yet, with approximately 350,000 NBN and ADSL customers affected. Thousands were still struggling to get their internet working three days later. Telstra’s chief operations officer Kate McKenzie has since issued a statement explaining what actually went wrong.
According to the official line from Telstra, last week’s mass network outage was caused by customers’ modem firmware and a bug in the telco’s network-level DNS. Apparently, a software update to Telstra’s own internal DNS servers caused unexpected disruptions to its fixed broadband network. This had a flow-on effect on customers’ modems, with the level of disruption depending on the model of modem they used.
In almost all cases, customers were required to reset their modems to factory settings, although some are being sent replacement modems due to continuing connectivity issues. There will be no free data day this time, but affected customers are being offered account credit in compensation.
Here’s the explanation in full as it appears on the Telstra Exchange website. What do you think? Does the explanation stack up? Share your views in the comments.
What happened
Over the past week, we have been working with customers impacted by the NBN and ADSL broadband disruption, which started last Thursday night and has continued to affect some of you during this week.
As I said earlier in the week, it was a complex issue to identify and solve, and there were also some flow-on issues with modems which resulted in different customers being affected in different ways. This has added to the challenge and I know how frustrating this has been for some of you.
I apologise again for the disruption, and in particular for the length of time some of you have not been able to connect. I wanted to provide some more detail about the situation and to explain why some customers were still disconnected after we fixed the network issue on Friday last week.
Every night when most of us are asleep, we make hundreds of changes to our mobile and fixed networks that are designed to both maintain and improve the network’s performance. Last Thursday night was no different. We made 800 changes on our fixed network alone, which is a pretty typical night of updates.
Unfortunately, one of those 800 changes – a software update to our Domain Name Servers – caused a number of servers to fail, which resulted in a short outage of this function. Normally this would have limited impact; however, on this occasion it had a cascading effect on our broadband modems and gateways. This occurred when a regular ‘check in’ or ‘heartbeat’ signal used by our modems was not able to contact those servers. When the modems lost their ‘check in’ response, it caused many of them to reset, reconnect to the network and attempt to update their settings. These events caused the initial disruption to service throughout Thursday night and into Friday, when the network was stabilised.
Unfortunately this process of resetting uncovered an unpredictable response from some of the modems and some of them continued to reset. This ongoing resetting continued for a small percentage of our customers across the weekend and into this week.
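Editor's note: to illustrate the mechanism described above, here is a minimal Python sketch of how a missed heartbeat can turn into a synchronised wave of modem resets. It is not Telstra's firmware logic. The `Modem` class, the `max_missed` threshold of three and the tick-based timeline are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Modem:
    """Hypothetical modem that resets after too many missed check-ins."""
    max_missed: int = 3  # assumed threshold, not a documented Telstra value
    missed: int = 0
    resets: int = 0

    def heartbeat(self, server_up: bool) -> str:
        # A successful check-in clears the missed counter.
        if server_up:
            self.missed = 0
            return "ok"
        # A failed check-in counts towards the reset threshold.
        self.missed += 1
        if self.missed >= self.max_missed:
            self.missed = 0
            self.resets += 1
            return "reset"
        return "missed"


def simulate(modem_count: int, timeline: list[bool]) -> list[int]:
    """Return how many modems reset on each tick.

    timeline[i] is True when the check-in servers are reachable on tick i.
    """
    modems = [Modem() for _ in range(modem_count)]
    resets_per_tick = []
    for server_up in timeline:
        results = [m.heartbeat(server_up) for m in modems]
        resets_per_tick.append(results.count("reset"))
    return resets_per_tick


if __name__ == "__main__":
    # Servers are up, then unreachable for three ticks, then back up.
    print(simulate(1000, [True, False, False, False, True]))
    # prints [0, 0, 0, 1000, 0]
```

Because every modem in this toy model uses the same threshold, a short server outage makes all of them reset on the same tick and then reconnect at once. Real devices would typically add random delays or backoff to spread that load, which this sketch leaves out on purpose.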
Correcting this problem in resetting the modems requires a factory reset for some of them, or a replacement modem for others. Where a replacement modem is required, we are providing it free of charge.
So the issue we’ve faced had nothing to do with a firmware or software upgrade to modems, as has been speculated.
All of these consequences have been unexpected and we are looking into what we can do to prevent them from happening again. Some of the modems did not respond in the way they’re designed to do, and we expect we’ll be able to address that through a software update. We’ll also be addressing the network fault itself.
Importantly, we continue to have confidence in our fixed network. We have invested substantially over many years. We will continue to do so as we bring new features and services into the network and the home including Telstra Air, NBN Voice and HD voice calling, HD Video streaming, Telstra TV and more, making it one of the most advanced networks in the world.
But we understand we have work to do to restore your confidence as well. I can assure you we will not stop until we do.
[Via Telstra]
Comments
7 responses to “Here’s Telstra’s Explanation For That Three-Day Data Outage”
Telstra’s DNS is rubbish. It continually fails to resolve simple things, and YouTube video DNS resolving is slow and causes slowdowns.
I have switched between Telstra, Google and OpenDNS DNS services, and both Google and OpenDNS blow Telstra out of the water. Obviously these two companies put a lot of effort, time and money into these services.
That’s something Telstra needs to do. This has been an ongoing issue for years and years. They won’t invest money or time, and it’s biting them in the digital arse.
They also had the same issue when Windows 8 was coming out, and after you changed the DNS to Google’s it would work fine.
Glad now that we don’t use the Telstra provided “modem” on our (FTTP) NBN connection. It’s one of their white lies that users must use the ISP’s hardware to connect. But it’s actually the NBN NTD that makes the connection and service is dictated by the account registered against that NTD. The “modem” is more a router with VOIP capabilities and any compatible device will work. We don’t need a landline so our existing ethernet router does the job with no issues whatsoever.
Nor do we use Telstra’s DNS servers (assuming they aren’t hijacking DNS requests). Android devices tend to use Google’s by default (esp Chromecasts) and our mediaboxes use a proxy.
If you want to keep your landline under Telstra FTTP NBN, you have to use their modem. It’s the only way they provide the service. They never divulge the SIP settings.
But for those who have no need for voice services (which these days is most people), this is irrelevant.
I’ve had too many problems with ISP DNS over the years and now I use OpenDNS via DNSCrypt to completely remove them from the equation.
Telstra and their fckn heartbeat signal. That was always a PITA in the early days of their ADSL, till Linksys added support for it on their routers.