(Greetings from MidAmeriCon II in Kansas City.)
IP address 198.144.194.238 appears to be (or have been) the master nameserver for several domains...
zone balug.org
zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa
zone sf-lug.org
...for which ns1.linuxmafia.com has been doing slave nameservice at your request. Today, I notice I'm getting AXFR problems:
Aug 22 11:15:19 linuxmafia named[11693]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:15:19 linuxmafia named[11693]: zone balug.org/IN: Transfer started.
Aug 22 11:15:40 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 22 11:15:40 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 22 11:21:30 linuxmafia named[11693]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:40:34 linuxmafia named[11693]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:40:34 linuxmafia named[11693]: zone balug.org/IN: Transfer started.
Aug 22 11:40:55 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 22 11:40:55 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 20.999 secs (0 bytes/sec)
Aug 22 11:45:53 linuxmafia named[11693]: zone sf-lug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
I notice that 198.144.194.238 no longer responds to ping.
I notice that auth nameservers for the three domains no longer include one that resolves to IP 198.144.194.238.
I notice ns1.linuxmafia.com is still in the authoritative roster for domain balug.org.
I notice ns1.linuxmafia.com is NOT any longer in the authoritative roster for domain sf-lug.org.
Accordingly, I am switching off slave nameservice for the three cited domains at this time. Please advise if you'd like me to resume slave nameservice for any of them, and from what master nameserver IP for each.
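For anyone triaging a similar episode, the refresh failures in a log excerpt like the one above can be tallied per zone. This is only a minimal sketch: the function name count_refresh_failures is made up for illustration, and it assumes the named log lines have the same layout as the ones quoted in this message.

```sh
# Hypothetical helper (not from the thread): tally "retry limit ... exceeded"
# refresh failures per zone, reading named log lines from standard input.
count_refresh_failures() {
    # Keep only the refresh-failure lines, then reduce each to its zone name.
    grep 'refresh: retry limit for master' |
        sed -n 's|.* zone \([^ ]*\)/IN: refresh: retry limit.*|\1|p' |
        sort | uniq -c |
        # uniq -c prints "count zone"; flip it to "zone count".
        awk '{ print $2, $1 }'
}

# Usage (the log path is an assumption and varies by system):
#   count_refresh_failures < /var/log/syslog
```

Run over the ten lines quoted above, this would report two failures for balug.org and one each for the ip6.arpa zone and sf-lug.org.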
Try again, and please reenable.
Looks like host providing 198.144.194.238 got wedged (shown as "up", but was unresponsive). That's been corrected now (swift kick to the reset), so all should be well again.
Thanks - and thanks for catching and noticing the issue (I hadn't quite noticed it myself yet - but certainly would have by this evening, if not sooner).
From: "Rick Moen" rick@linuxmafia.com
Subject: AXFR failures from 198.144.194.238
Date: Mon, 22 Aug 2016 12:27:01 -0700
(Greetings from MidAmeriCon II in Kansas City.)
IP address 198.144.194.238 appears to be (or have been) the master nameserver for several domains...
zone balug.org
zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa
zone sf-lug.org
...for which ns1.linuxmafia.com has been doing slave nameservice at your request. Today, I notice I'm getting AXFR problems:
Aug 22 11:15:19 linuxmafia named[11693]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:15:19 linuxmafia named[11693]: zone balug.org/IN: Transfer started.
Aug 22 11:15:40 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 22 11:15:40 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 22 11:21:30 linuxmafia named[11693]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:40:34 linuxmafia named[11693]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 22 11:40:34 linuxmafia named[11693]: zone balug.org/IN: Transfer started.
Aug 22 11:40:55 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 22 11:40:55 linuxmafia named[11693]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 20.999 secs (0 bytes/sec)
Aug 22 11:45:53 linuxmafia named[11693]: zone sf-lug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
I notice that 198.144.194.238 no longer responds to ping.
I notice that auth nameservers for the three domains no longer include one that resolves to IP 198.144.194.238.
I notice ns1.linuxmafia.com is still in the authoritative roster for domain balug.org.
I notice ns1.linuxmafia.com is NOT any longer in the authoritative roster for domain sf-lug.org.
Accordingly, I am switching off slave nameservice for the three cited domains at this time. Please advise if you'd like me to resume slave nameservice for any of them, and from what master nameserver IP for each.
Quoting Michael Paoli (Michael.Paoli@cal.berkeley.edu):
Try again, and please reenable.
Done, and successful.
$ whois balug.org | grep '^Name Server'
Name Server: NS1.DREAMHOST.COM
Name Server: NS2.DREAMHOST.COM
Name Server: NS3.DREAMHOST.COM
$ whois sf-lug.org | grep '^Name Server'
Name Server: NS2.HE.NET
Name Server: NS3.HE.NET
Name Server: NS1.LINUXMAFIA.COM
Name Server: NS4.HE.NET
Name Server: NS.PRIMATE.NET
Name Server: NS5.HE.NET
$ whois sf-lug.com | grep '^Name Server'
Name Server: ns2.he.net
Name Server: ns3.he.net
Name Server: ns.primate.net
Name Server: ns4.he.net
Name Server: ns5.he.net
Name Server: ns1.linuxmafia.com
$
So, is 198.144.194.238 a 'hidden master' for domains balug.org, sf-lug.org, and sf-lug.com (providing AXFR to slave nameservers but not declared publicly authoritative)?
Let me know if so, and I'll annotate that in my /etc/named.conf.local file. The downtime _looked_ like someone had moved nameservice to a new master DNS IP and not advised me as admin of a slave nameserver, because not only was there no AXFR or ping response, but the absence from public WHOIS data seemed suspicious.
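The 'hidden master' question above comes down to whether the zone's SOA MNAME appears among its published nameservers. Below is a minimal sketch of that comparison. The function names normalize_name and is_hidden_master are invented for illustration, and the check only compares names; it does not resolve them to IP addresses, so treat it as a hint rather than a verdict.

```sh
# Hypothetical helpers (not from the thread).

# Lower-case a DNS name and drop a trailing dot, so that
# "NS1.LINUXMAFIA.COM" and "ns1.linuxmafia.com." compare equal.
normalize_name() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# is_hidden_master MNAME NS...
# Exit status 0 if MNAME is absent from the NS names (looks hidden),
# 1 if MNAME is one of the published nameservers.
is_hidden_master() {
    mname=$(normalize_name "$1")
    shift
    for ns in "$@"; do
        [ "$(normalize_name "$ns")" = "$mname" ] && return 1
    done
    return 0
}

# Usage, assuming dig is installed and the zone resolves:
#   mname=$(dig +short SOA sf-lug.org. | awk '{ print $1 }')
#   is_hidden_master "$mname" $(dig +short NS sf-lug.org.) && echo "hidden master?"
```

A master that this check flags as hidden may still be absent from the roster for less benign reasons, so the result is only a prompt to ask the zone's admin.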
Excellent, thanks for the quick turn-around.
And also, more info for your notes (e.g. /etc/named.conf.local comments) further below.
references/excerpts:
From: "Rick Moen" rick@linuxmafia.com
Subject: Re: AXFR failures from 198.144.194.238
Date: Mon, 22 Aug 2016 16:05:26 -0700
Quoting Michael Paoli (Michael.Paoli@cal.berkeley.edu):
Try again, and please reenable.
Done, and successful.
So, is 198.144.194.238 a 'hidden master' for domains balug.org, sf-lug.org, and sf-lug.com (providing AXFR to slave nameservers but not declared publicly authoritative)?
Let me know if so, and I'll annotate that in my /etc/named.conf.local file.
Well, yes, 198.144.194.238 is (partially?) "hidden master". I believe it is, however, well listed in the SOA origin, though ... to at least provide some clue(s):
$ dig -t SOA sf-lug.org. +short
ns1.sf-lug.org. jim.well.com. 1463887991 10800 3600 1209600 3600
$ dig -t SOA sf-lug.com. +short
ns1.sf-lug.com. jim.well.com. 1463887991 10800 3600 1209600 10800
$ dig +short ns1.sf-lug.org. A ns1.sf-lug.org. AAAA
198.144.194.238
2001:470:1f04:19e::2
$ dig +short ns1.sf-lug.com. A ns1.sf-lug.com. AAAA
198.144.194.238
2001:470:1f04:19e::2
$
... of course those IPv6 addresses would likely also have been unresponsive (on the same host that was wedged, which also serves 198.144.194.238)
A reminder on the balug.org situation, as it's a wee bit more complex. A bit of an excerpt here, with fuller context further below:
Do note, however, that at present time, balug.org is NOT (yet) Internet delegated to those IPs - I don't expect that to happen until we extricate ourselves from DreamHost.com
Also, I'd prefer that, if and whenever there is an issue contacting the master, slaves not drop the zone merely or quickly on account of just that, with nothing else having changed ... as that semi-defeats one of the purposes of having slaves, and of having a fairly long expire time (e.g. if disaster strikes and it takes a fair while to get things in operation again, then at least if the DNS slaves are still operating, the situation is a bit clearer for those trying to figure out what's going on).
Thanks.
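To put a number on the "fairly long expire time" mentioned above: the SOA records quoted earlier in this message carry an EXPIRE value of 1209600 seconds, which is 14 days. EXPIRE is the upper bound on how long a slave may keep answering authoritatively without a successful refresh. A small sketch of the arithmetic follows; the function name soa_expire_days is made up for illustration, and it assumes input in the seven-field form that dig +short prints for an SOA.

```sh
# Hypothetical helper (not from the thread): print the EXPIRE field of a
# "dig +short SOA" answer, converted to whole days.
soa_expire_days() {
    # Split the answer into its seven fields:
    #   MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM
    set -- $1
    # EXPIRE is the sixth field, in seconds; 86400 seconds per day.
    echo $(( $6 / 86400 ))
}

# Usage with the sf-lug.org SOA quoted above:
#   soa_expire_days 'ns1.sf-lug.org. jim.well.com. 1463887991 10800 3600 1209600 3600'
#   prints 14
```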
More full background on balug.org, from earlier:
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
To: "Rick Moen" rick@linuxmafia.com
Cc: BALUG-Admin balug-admin@lists.balug.org
Subject: [BALUG-Admin] DNS slaves for BALUG? :-)
Date: Fri, 19 Feb 2016 03:16:35 -0800
Rick,
If you could please, and would be willing, could you cover DNS slave services for BALUG, notably these zones:
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa
balug.org
master(s) (all for each of the above):
198.144.194.238
2001:470:1f04:19e::2
Do note, however, that at present time, balug.org is NOT (yet) Internet delegated to those IPs - I don't expect that to happen until we extricate ourselves from DreamHost.com - however in the meantime it is maintained quite highly alike to the balug.org. DNS data on DreamHost.com - I check periodically, and only differences I'm aware of are SOA MNAME, RNAME, and often the REFRESH (I don't know where they get their REFRESH number from - it seems to vary some fair bit, with no particular discernible pattern) and I tend to keep the serial # one ahead of DreamHost.com (at least most of the time when I check/notice it).
If you'd prefer, for balug.org, could also just set up as "warm standby" - verify (to-be) slaves can do AXFR pull, and put most of the configuration in place, but just don't actually activate it until DNS is fully and properly Internet delegated (or we're free from DreamHost.com and about to so delegate).
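The periodic check described above (two copies of the zone that should differ only in their SOA record) could be scripted roughly as follows. This is a sketch under assumptions, not the method actually used: the function name zone_diff_ignoring_soa is invented, and it assumes each copy of the zone has already been saved to a text file with one record per line, however that data was obtained.

```sh
# Hypothetical helper (not from the thread): compare two zone listings,
# ignoring SOA records. Exit status 0 means no differences outside the SOA.
zone_diff_ignoring_soa() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    # Drop SOA lines and sort, so record order does not matter.
    grep -v '[[:space:]]SOA[[:space:]]' "$1" | sort > "$a"
    grep -v '[[:space:]]SOA[[:space:]]' "$2" | sort > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Usage (both file names are placeholders):
#   zone_diff_ignoring_soa balug.org.local.txt balug.org.dreamhost.txt
```

Any differences in TTL or whitespace would also show up as differences, so the listings may need normalizing first.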
I'm presuming you could do/offer this on both ns1.linuxmafia.com. and also ns1.svlug.org.? That would be great, if you're able to.
I'm also presuming the various IP information and out-of-band communication information is still the same as when we set up slaves for sf-lug.org (plus any relevant updates received since then).
Just let me know, thanks (can also email just me directly for any bits that ought not get publicly archived, etc.).
Quoting Michael Paoli (Michael.Paoli@cal.berkeley.edu):
Well, yes, 198.144.194.238 is (partially?) "hidden master". I believe it is, however, well listed in the SOA origin, though ... to at least provide some clue(s):
Yes, I meant to check the SOA record in the auth nameservice, but ran out of time because I was waiting for a 'plane back from Kansas City. (I've arrived home, now.)
Anyway, I've annotated my named.conf.local so I'm not taken by surprise again by a master nameserver that's not in the authoritative roster and doesn't respond to ping.
Also, I'd prefer that, if and whenever there is an issue contacting the master, slaves not drop the zone merely or quickly on account of just that, with nothing else having changed ... as that semi-defeats one of the purposes of having slaves, and of having a fairly long expire time (e.g. if disaster strikes and it takes a fair while to get things in operation again, then at least if the DNS slaves are still operating, the situation is a bit clearer for those trying to figure out what's going on).
Yeah, in this case, it was an artifact of the situation looking (from available data) _very_ much like past situations where someone moved the master DNS and failed to notify the slave nameserver admins. Which has happened to me repeatedly. And, just a point: Every time _that_ situation has gotten visited upon me as a volunteer provider of secondary nameservice, I've not only been put to some avoidable and unnecessary work, but also (for lack of a heads-up) my own nameservice has continued to publish wrong, obsolete zone information for long periods until I noticed the communication failure. E.g., I and my users (in those cases) got left using obsolete DNS data. _So_, I've become quick to pull the trigger when something just doesn't look right at all.
For my part, I would strongly suggest that, when you get people to do slave nameservice for you, you really ought to warn those systems' admins if the master nameserver will not respond to ping and (especially) if it will be deliberately omitted from the authoritative roster. Why? Because otherwise there's an excellent chance they'll interpret those data during the first downtime episode as meaning that you moved the master and failed to tell them. I did.