Thought you'd be amused at this cautionary tale, Michael P.
----- Forwarded message from Rick Moen rick@linuxmafia.com -----
Date: Mon, 16 Oct 2017 10:05:29 -0700
From: Rick Moen rick@linuxmafia.com
To: Duncan MacKinnon duncan1@gmail.com
Subject: Rest of that story about secondary DNS
Organization: If you lived here, you'd be $HOME already.
I _think_ I started to tell you a story about doing secondary DNS for people, and something I learned. Of course, the standard model is supposed to be: You do auth DNS for my domains; I do it for yours.
Years ago, I started to see the flaw in that optimistic, we-help-each-other mental model. There was a user group in Santa Cruz, SMAUG, which owned domain 'scruz.net'. (Terrible naming and choice of domain; not my doing, not my call.) I ended up being primary/master DNS, and we were doing really well because we signed up five more individuals with auth nameservers to help out with secondary/slave DNS, for six auth nameservers total, widely dispersed geographically. That's fabulous redundancy. What could possibly go wrong? </deadpan>
I relaxed about quality of service, because obviously we were way ahead of the game. (SMAUG had a mailing list on the SVLUG mailing list server. The mailing list still exists, derelict, the group having now fallen apart.) Roll forward to one day when my household uplink through Raw Bandwidth Communications was offline for about an hour because SBC/AT&T shot the company in the foot.
When my aDSL came back, I found postings to SMAUG's mailing list bitching about the scruz.net nameservice having been totally offline. I noticed that some of the complaining came from the five individuals who were allegedly doing secondary/slave nameservice. Hmm?
So, I checked on the five secondaries. Certainly, my aDSL being offline for an hour should not have taken all DNS offline. And what I found was: Over about a two-year period, some of the five had moved their nameservers to new IPs and failed to notify me as master nameserver admin. Some had ceased doing auth DNS entirely, and failed to notify me as master nameserver admin. Some still had the same nameserver running at the same IP as always, but had quietly ceased doing auth nameservice for scruz.net, and failed to notify me as master nameserver admin.
All of the nameserver IPs they'd provided me for their secondary nameservice were still listed in the whois (and as NS lines for the domain in the parent .net zone). But exactly one nameserver still existed and was actually _doing_ auth DNS for scruz.net -- mine. All five of the others had silently flaked out. Which made it extra galling that some of these guys complained about -my- nameservice being unreliable, since theirs was 100% unreliable, their having broken it in various ways, whereas mine worked great except once in a blue moon when my uplink went down.
I thought: OK, obviously it turns out to be a mistake to just trust that secondaries will continue to exist and that their operators will do due-diligence communication with the primary when something important changes. They _should_, but it turns out they don't.
So, I wrote a weekly cron script to check on all the secondaries for my two domains, linuxmafia.com and unixmercenary.net: It queries and reports the parent-zone NS "glue" records, queries and reports the nameservers declared authoritative in whois, and reports each auth nameserver's zonefile S/N so I can make sure they all respond and give the same value. This means I can detect and act on flaky secondaries.
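A check along those lines might look roughly like the sketch below. It is an illustration only, not the actual cron script: the function names (list_delegated_ns, soa_serial, report_serials, check_zone) are made up for this example, it assumes dig is installed, and it covers only the parent-zone delegation and serial comparison, leaving out the whois query.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical.

# Ask one parent-zone server ($2) which nameservers the zone ($1) is
# delegated to. A referral normally carries the NS records in the
# authority section, so both sections are printed and filtered.
list_delegated_ns() {
    dig +noall +answer +authority +norecurse NS "$1" "@$2" |
        awk '$4 == "NS" { sub(/\.$/, "", $5); print tolower($5) }' |
        sort -u
}

# Print the SOA serial that nameserver $2 reports for zone $1, or
# nothing if it does not answer. The serial is the third field of
# dig's short SOA output; the numeric test skips dig's error lines.
soa_serial() {
    dig +short +norecurse +time=5 +tries=1 SOA "$1" "@$2" |
        awk 'NR == 1 && $3 ~ /^[0-9]+$/ { print $3 }'
}

# Read "nameserver serial" lines on stdin (serial missing if the
# server gave no answer). Print a report and exit non-zero if any
# server was silent or the serials disagree.
report_serials() {
    awk '
        NF == 0 { next }
        NF < 2  { printf "NO ANSWER: %s\n", $1; bad = 1; next }
        !seen[$2]++ { distinct++ }
        { printf "%s serial %s\n", $1, $2 }
        END {
            if (distinct > 1) { print "MISMATCH: serials differ"; bad = 1 }
            exit bad + 0
        }
    '
}

# Tie it together: zone in $1, a parent-zone server in $2.
check_zone() {
    list_delegated_ns "$1" "$2" | while read -r ns; do
        printf '%s %s\n' "$ns" "$(soa_serial "$1" "$ns")"
    done | report_serials
}
```

Run from cron as, say, `check_zone linuxmafia.com a.gtld-servers.net` (one of the .com/.net parent servers), a non-zero exit status is the signal that some secondary needs a closer look.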
What I did _not_ do was bother writing a script to check on other people's master nameservers, on domains for which *I* do secondary nameservice. Failures in this case are almost entirely the domain owner's problem, not mine. As long as I keep _my_ word for quality of secondary DNS, I'm OK.
Well, almost. I do secondary for five or six domains Ruben Safir owns, and recently double-checked those. For most of them, I advised Ruben _again_ that having only two auth nameservers isn't enough and is dangerously thin. I urged him to find a couple more, somewhere.
For one of them, nylxs.com, I noticed and advised Ruben that _neither_ his nor my auth nameservers were authoritative any more. Instead of

    WWW2.MRBRKLYN.COM
    NS1.LINUXMAFIA.COM

the records now listed the auth nameservers like so:

    $ whois nylxs.com | grep 'Name Server'
    Name Server: NS69.DOMAINCONTROL.COM
    Name Server: NS70.DOMAINCONTROL.COM
    $
I wrote Ruben: 'I find that you have _ceased_ using my secondary (slave) nameservice, but neglected to inform me. That's rude, Ruben. You need to friggin' tell your secondaries if/when you move auth nameservice somewhere else. Grr. Learn to do it right, already!'
Turns out, that's not what happened, exactly.
Ruben had failed to pay his domain renewal, so his registrar (GoDaddy) had repointed its DNS to 'parked domain' nameservice from its own domaincontrol.com nameservers, making those authoritative in place of Ruben's and mine.
Luckily, because I (in effect) warned Ruben of his expired domain in time, he was able to renew it.
So, lesson: When you find yourself annoyingly still doing futile secondary DNS for a domain whose owner _seems_ to have moved auth nameservice elsewhere without telling you, the explanation isn't always owner lack of diligence concerning communicating with secondaries: Sometimes, it's merely owner lack of diligence in paying the bill.
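For anyone doing secondary DNS who wants that kind of early warning routinely, the manual whois check above could be automated along these lines. This is a rough sketch, not something from the original story: the function names (whois_nameservers, still_listed) are invented here, and it assumes the whois output contains "Name Server: HOST" lines as in the nylxs.com example, which not every registry's whois does.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the
# "Name Server: HOST" line format is an assumption about whois output.

# Read whois output on stdin; print the listed nameserver hostnames,
# lowercased and de-duplicated, one per line.
whois_nameservers() {
    tr -d '\r' |
        awk 'tolower($0) ~ /name server:/ { print tolower($NF) }' |
        sort -u
}

# Succeed if hostname $1 is among the nameservers in the whois output
# on stdin (case-insensitive, whole-name match).
still_listed() {
    whois_nameservers |
        grep -qxF "$(printf '%s' "$1" | tr 'A-Z' 'a-z')"
}
```

Usage would be something like `whois nylxs.com | still_listed ns1.linuxmafia.com || echo "nylxs.com no longer lists ns1.linuxmafia.com"`. A failure does not say _why_ the nameserver vanished from the record, only that it is time to ask the domain owner.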
----- End forwarded message -----