DNS Anomaly on Saturday - transient lookup issues [resolved]

On Saturday, July 9, between 5:30pm EDT and 6:30pm EDT (21:30 – 22:30 UTC), we saw a spike in users reporting response issues on some domains. Service was degraded for the better part of an hour for some domains or (more likely) some parts of the world. As much as we hate to use the “regional outages” line, that is what it looks like.

Most of the affected domains were either on the legacy system using the legacy nameserver delegation, or on the new system but had not yet switched their nameserver delegation to add the additional anycast constellation on the new platform.

I say “most” because we have one report from a user who is fully on the new system and still experienced issues.

We have had no reports from our Enterprise DNS users (if you are one and you had issues, please do let us know).

We initially believed this to be a general network issue (because we saw what looked like a corresponding spike in “* is down” reports on Twitter, etc. for unrelated domains around the same time). But as more data comes in, we are going to suck it up and say it: something weird happened, and we think it was here.

Our response:

1) We are still gathering data, and we have identified ways to enhance our own internal monitoring capabilities so that we can cross-reference what we see with what external monitoring applications are seeing.

2) We recently made a change in the way we adjust our BGP announcements for individual members of the anycast constellations to optimize response times. Since this has never happened before, and we enacted this change recently, we are suspicious that this may be at the core of the problem and we have rolled that back.

3) We are late with our announcement here, and for that I personally apologize. When the event occurred, many of the systems group conferred about the incident on Saturday night. We suspected a wider network issue, but we still made the decision to roll back the new BGP programs just to be on the safe side. It was only after we saw more reports roll in today from users that we had to rethink our stance and accept that this was most likely our problem and not a network flap or outage.

We sincerely apologize to everyone who was affected, and for the delay in this posting.

Comments

Olivier Dagenais says

July 11, 2011 at 9:45 pm

Hi,

I recently switched to the new system, but I don’t know if I added the additional anycast constellation. Can you let us know (1) how we tell if we have it and (2) how to add it if we don’t?

Thanks!

– Oli

Mark Jeftovic says

July 11, 2011 at 10:43 pm

Take a look at your nameserver delegation either in the “nameservers” section of your Domain Overview (under the “Domain Info” tab) or look at your whois record.

If you see something like this:

ns1.easydns.com

ns2.easydns.com

ns3.easydns.org

ns6.easydns.net

remote1.easydns.com

remote2.easydns.com

you’re still using the legacy nameservers.

If we are your registrar, you can go into that aforementioned nameservers link and just click on the “Use easyDNS Nameservers” link, and it will load the form with the relevant nameservers.

You can see which nameservers apply to which level of service here:

https://web.easydns.com/our_nameservers.php

Rule of thumb:

All of the nameservers under the new system are named “dns1”, “dns2”, “dns3” and “dns4”.

Nameservers on the legacy system are named “ns1”, “ns2” or “remote1”, etc.
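
For anyone who wants to check a list of nameservers in bulk, the rule of thumb above could be sketched roughly as follows. This is only an illustration: the function names `classify_nameserver` and `delegation_status` are made up for this example, and the patterns assume that the first label of the hostname (“dns1”, “ns2”, “remote1” and so on) is all that distinguishes the two systems. The nameservers page linked above remains the authoritative list.

```python
import re

# Assumption: new-system nameservers start with dns1..dns4,
# legacy ones start with "ns" or "remote" followed by a number.
NEW_PATTERN = re.compile(r"^dns[1-4]$")
LEGACY_PATTERN = re.compile(r"^(ns|remote)\d+$")


def classify_nameserver(hostname):
    """Return 'new', 'legacy' or 'unknown' for a single nameserver hostname."""
    # Only the first label matters for the rule of thumb, e.g. "ns1" in ns1.easydns.com
    first_label = hostname.strip().rstrip(".").lower().split(".")[0]
    if NEW_PATTERN.match(first_label):
        return "new"
    if LEGACY_PATTERN.match(first_label):
        return "legacy"
    return "unknown"


def delegation_status(nameservers):
    """Summarize a whole delegation: 'new', 'legacy' or 'mixed or unknown'."""
    kinds = {classify_nameserver(ns) for ns in nameservers}
    if kinds == {"new"}:
        return "new"
    if kinds == {"legacy"}:
        return "legacy"
    return "mixed or unknown"


if __name__ == "__main__":
    # The legacy delegation quoted earlier in this comment
    legacy_set = [
        "ns1.easydns.com",
        "ns2.easydns.com",
        "ns3.easydns.org",
        "ns6.easydns.net",
        "remote1.easydns.com",
        "remote2.easydns.com",
    ]
    print(delegation_status(legacy_set))  # prints "legacy"
```

You would still need to copy the nameserver list from your Domain Overview or whois record yourself; the sketch only applies the naming rule and does not perform any DNS or whois lookups.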

