On Friday a large chunk of the internet went off the air when Cloudflare apparently fat-fingered a routing update and sent all of their global traffic to a single POP, vaporizing it almost instantly.
This affected their DNS service, and of course, as everybody knows, when your DNS is gone, so are you. At least one other commercial DNS provider who uses Cloudflare in front of their own nameservers for DDoS mitigation also went off the air.
We’re familiar with Cloudflare’s DDoS service for DNS providers, because we use it ourselves. Fortunately easyDNS was not impacted by the outage (I didn’t even notice it, tbh), and I only heard about it later in the day when I checked in on social media at some point and saw all this chatter about “half the internet blowing up”.
easyDNS was unaffected because, while we do use Cloudflare to soak up large DDoS attacks against our nameservers, we don’t use them across all of our nameservers. I think somewhere in my book I wrote “DNS providers have a near-pathological aversion to SPOFs” (Single Points of Failure). Maybe only we do.
This is why, whenever one of the largest DNS providers in the world blows itself up or gets DDoSed off the air, we are quick to point out two things:
- This is inevitable and unavoidable and entirely excusable. Everybody blows up, every DNS provider in existence will experience downtime. No exceptions.
- There is a silver bullet for avoiding your own downtime when your DNS provider blows up: use multiple DNS providers (a point we’ve belaboured many times in the past is that every DNS provider is a logical SPOF unto itself).
At easyDNS we experienced so much pain from this reality that we created a system to automate flipping DNS providers at the first sign of trouble.
We call it Proactive Nameservers, and for some reason we’re the only company in the world doing it. Maybe this is because providing a service like nameserver failover means a company has to admit to its customers that its own nameservers may, at some point, fail.
There are two approaches to multi-DNS architectures: active/active, where you use multiple DNS providers all the time, and active/passive, which is what Proactive DNS does.
For active/active there are myriad ways to do it. You can use things like our easyRoute53 integration with Amazon Route 53 DNS, so you only need to manage your DNS settings in one place, or just use plain old-fashioned secondary DNS at some out-of-band provider. Tools like OctoDNS can help you automate across multiple providers (on that note, easyDNS support for OctoDNS is either out now or in the process of being committed).
See our High Availability DNS page for more info on integrations and lo-fi methods of doing this.
Again, from my book, even a single unicast node staying up when all else is down will get you through a major network event like this unscathed.
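To make that concrete, here is a toy model in Python. It is not a real resolver: it simply assumes that a lookup succeeds as long as any one of the delegated nameservers still answers. The provider and nameserver names are made up for illustration.

```python
# Toy model of an active/active delegation. Not a real resolver: it only
# assumes that a lookup succeeds if ANY delegated nameserver still answers.

# Hypothetical delegation split across two independent providers
# (nameserver hostname -> provider that operates it).
DELEGATION = {
    "ns1.provider-a.example": "provider-a",
    "ns2.provider-a.example": "provider-a",
    "ns1.provider-b.example": "provider-b",
}


def resolves(delegation, providers_down):
    """Return True if at least one delegated nameserver is still reachable."""
    return any(provider not in providers_down for provider in delegation.values())


def single_points_of_failure(delegation):
    """Return the providers whose outage alone would take the zone offline."""
    providers = set(delegation.values())
    return sorted(p for p in providers if not resolves(delegation, {p}))
```

With the mixed delegation above, `resolves(DELEGATION, {"provider-a"})` is still true, while a delegation that lists only one provider's nameservers will always report that provider as a single point of failure.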
But suppose you want to use a preferred DNS provider, such as Cloudflare, who use their DNS responses to optimize your website proxy. That works best most of the time, so you want an active/passive model: one that stays out of the way when things are going according to plan and then, when these periodic network cataclysms do occur (and they will), steps into the breach and updates your nameservers so that you at least stay up until the crisis is over.
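The active/passive idea can be sketched as a small state machine. This is only an illustration of the general pattern, not how Proactive Nameservers is actually implemented: the class name, the thresholds, and the nameserver hostnames are all assumptions, and the health check itself is left out.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Hypothetical active/passive switch driven by health-check results."""

    primary: tuple               # nameservers used while things are healthy
    backup: tuple                # nameservers to fall back on
    fail_threshold: int = 3      # assumed: consecutive failures before switching
    recover_threshold: int = 5   # assumed: consecutive successes before switching back
    on_backup: bool = False
    _streak: int = 0             # consecutive results arguing for a state change

    def active_nameservers(self):
        return self.backup if self.on_backup else self.primary

    def record_check(self, primary_healthy):
        """Feed in one health-check result; return the delegation to use."""
        # On primary, failures count toward switching; on backup, successes do.
        wants_change = primary_healthy if self.on_backup else not primary_healthy
        self._streak = self._streak + 1 if wants_change else 0
        threshold = self.recover_threshold if self.on_backup else self.fail_threshold
        if self._streak >= threshold:
            # A real system would update the delegation at the registry here.
            self.on_backup = not self.on_backup
            self._streak = 0
        return self.active_nameservers()
```

Requiring several consecutive results before flipping in either direction is a design choice to avoid flapping on a single dropped probe; the right thresholds depend on how often you check and how long a delegation change takes to propagate.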
The only requirement to use Proactive Nameservers is that we have to be your registrar, because we need to connect to the registry to update your nameserver delegation. If for some unfathomable reason we aren’t your preferred DNS vendor, you can stick with one who is and just transfer your domain here. (We even have a transfer valet to do all the heavy lifting for you if you need it.)
Learn more here (including pricing) or check out the original Proactive DNS explainer video.