[BALUG-Admin] DNS slaves for BALUG? :-) ... IPv6 issue somewhere between master and slaves?

Sat Feb 20 22:12:02 PST 2016

So ... wee bit of reference introduction,
then my tests/findings/recommendations/etc.,
then more of the earlier referenced bits.

> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
> Subject: Re: [BALUG-Admin] DNS slaves for BALUG?  :-) ... IPv6 issue  
> somewhere between master and slaves?
> Date: Sat, 20 Feb 2016 18:52:14 -0800

> No problem, I'll take a look, and let you know what I find.
>
> Quite likely it's lack of IPv6 or some other IPv6 issue
> on slave end(s), but I'll do some further checking to see if I can
> isolate it to that (at least for ns1.linuxmafia.com. - I'm presuming
> same/similar for ns1.svlug.org.), and see whether or not
> zone can be pulled via IPv6 - and beyond the IPv6 addresses
> where I've done so on my home networks (all of which happen to
> be via same ISP for IPv6 - so would be good to check from some
> other(s) - I think I may have a location or two (or three or more?) that
> I may be able to check that from).
>
> Thanks, I'll let you know.

So ... most of the testing details first.
Then past that section,
recommendations for the very specific slaves in
question - based upon those test results.
Then there's bit more information/comments after that.
And then a wee bit more additional testing.

$ dig +short ns1.linuxmafia.com. A ns1.linuxmafia.com. AAAA
198.144.195.186
$ ssh -ax linuxmafia.com. 'umask 077 && hostname &&
> PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:"$PATH" \
> && { { >>/dev/null 2>&1 type ip && ip -4 a s; ip -6 a s; ip -6 r s; \
> } || { ifconfig -a; netstat -nr; } }'
linuxmafia.com
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
     inet 127.0.0.1/8 scope host lo
3: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     inet 198.144.195.186/29 brd 198.144.195.191 scope global eth2
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436
     inet6 ::1/128 scope host
        valid_lft forever preferred_lft forever
3: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
     inet6 fe80::220:edff:fe13:ba89/64 scope link
        valid_lft forever preferred_lft forever
fe80::/64 dev eth2  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 0
$

So ... [ns1.]linuxmafia.com. has an IPv6 stack and IPv6 link-local
apparently configured, but no IPv6 route for anything beyond link-local
and localhost.
Let's see if I can confirm that ...

$ ssh -ax linuxmafia.com. 'umask 077 && hostname && \
> PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:"$PATH" \
> && ping6 -n -c 5 2001:470:1f04:19e::2'
linuxmafia.com
connect: Network is unreachable
$

Yes, can't get to 2001:470:1f04:19e::2 from
[ns1.]linuxmafia.com., which has only:
198.144.195.186
::1/128
fe80::/64
and no route of any type towards 2001:470:1f04:19e::2 from that host.
The:
connect: Network is unreachable
is most likely not an ICMP network unreachable coming back from
anywhere on the network, but the host's own kernel failing the
connect() call (ENETUNREACH), as it has no available route to get to
2001:470:1f04:19e::2
The diagnostic bit(s):
failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
would be, I'd guess, the nameserver software's way of indicating it
can't get there or is otherwise failing.  The "(source ::#0)" is the
source address and port it's trying from: :: is the unspecified IPv6
address and port 0 means any port, i.e. it's leaving the choice of
source to the operating system, which here has no usable IPv6 source
address or route to offer.
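
The address and route listings above boil down to two questions: does
the host have any global-scope IPv6 address, and does it have a default
IPv6 route?  Here's a minimal sketch of that check, assuming Linux
iproute2 style output as from "ip -6 a s" and "ip -6 r s".  The function
names are made up for illustration, and a host could also reach IPv6
destinations via routes more specific than a default, which this
doesn't look for.

```shell
# Hypothetical helper names; both read iproute2-style output on stdin.

# Succeeds if `ip -6 a s` output lists at least one global-scope address.
has_global_v6_addr() {
    grep -q 'inet6 .* scope global'
}

# Succeeds if `ip -6 r s` output lists a default route.
has_default_v6_route() {
    grep -q '^default '
}

# Prints a one-line verdict; arguments are two files holding the saved
# address listing and route listing, in that order.
v6_verdict() {
    if has_global_v6_addr <"$1" && has_default_v6_route <"$2"; then
        echo 'global IPv6 looks usable'
    else
        echo 'no global IPv6 address and/or no default IPv6 route'
    fi
}
```

On a live host one might save the two listings with
"ip -6 a s >addrs; ip -6 r s >routes" and then run
"v6_verdict addrs routes".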
Let me see if I can pull the zone via IPv6 to elsewhere ...

$ ip -6 a s | fgrep inet6; ip -6 r s | fgrep via
     inet6 ::1/128 scope host
     inet6 fe80::221a:6ff:fe03:c207/64 scope link
     inet6 2001:470:66:76f::2/64 scope global
     inet6 fe80::c690:c2eb/64 scope link
     inet6 fe80::fc54:ff:fe13:5199/64 scope link
default via 2001:470:66:76f::1 dev he-ipv6  metric 1024
$ dig @2001:470:1f04:19e::2 -t AXFR \
> e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. +norecurse +noall +nocmd \
> +nocomments +answer | grep '^[        ]*[^    ;]'
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN SOA ns1.balug.org. postmaster.balug.org. 1456011000 10800 3600 1209600 3600
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns0.mpaoli.net.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.balug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.svlug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.linuxmafia.com.
2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN PTR balug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN SOA ns1.balug.org. postmaster.balug.org. 1456011000 10800 3600 1209600 3600
$

Yes, I'm able to pull the zone (master is specifically configured:
allow-transfer { any; };
for at least the [L]UG zone(s) - and that includes
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa.),
but the client IPv6 I pulled from is from same ISP/provider as the
server pulled from.  Do I have another IPv6 independent of that ISP
that I can check from?
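
One sanity check on a pull like that: a complete AXFR begins and ends
with the zone's SOA record, so the first and last records should both be
SOA with the same serial.  Below is a rough sketch of checking that from
dig's answer output, assuming one record per line (as dig emits it, not
as mail wrapping may display it).  The name axfr_complete is made up
here; it isn't anything from dig or BIND.

```shell
# Reads `dig -t AXFR ... +noall +answer` style output on stdin
# (fields: name ttl class type rdata...; the SOA serial is field 7).
# Succeeds only if the first and last records are SOA with matching serials.
axfr_complete() {
    awk '
        NF == 0 || $1 ~ /^;/ { next }
        { count++; last_type = $4; if (count == 1) first_type = $4 }
        $4 == "SOA" { if (soa_seen++ == 0) first_serial = $7; last_serial = $7 }
        END {
            ok = (count >= 2 && first_type == "SOA" && last_type == "SOA" &&
                  first_serial == last_serial)
            exit !ok
        }
    '
}
```

E.g. piping the dig AXFR command's output into axfr_complete and
testing its exit status would flag a transfer that got cut off partway.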

$ ssh -ax m-net.arbornet.org. hostname
ssh: connect to host m-net.arbornet.org. port 22: No route to host
$
Hmmmm...
$ (cd ~/tmp && stat -c '%y %n' .m-net.arbornet.org* && cat
> .m-net.arbornet.org.lastlogin)
2016-02-10 00:05:03.690977780 -0800 .m-net.arbornet.org.lastlogin
michaelp         pts/7    Feb 10 03:05 (198.144.194.235)
Connection to m-net.arbornet.org. closed.
$
I've got a crontab job:

$ crontab -l | fgrep m-net.arbornet.org | grep -v '^#'
5 8 * * * TZ=PST8PDT export TZ; set -e; trap 'trap - 0; exit 0' 0; exec </dev/null >>/dev/null 2>&1; cd tmp; lastlogin=.m-net.arbornet.org.lastlogin lastoutput=.m-net.arbornet.org.lastoutput; set +e; tmp=`find "$lastlogin" -prune ! -mtime +15 -type f -print`; [ x"$tmp" = x"$lastlogin" ] || { </dev/null >"$lastoutput" 2>&1 ssh -a -n -t -t -x -o BatchMode=yes m-net.arbornet.org. 'umask 077 && exec who am I' && mv -f "$lastoutput" "$lastlogin"; }
$

that keeps my account there from going away (I think it has to be logged
into once every 30 days to not get removed).  My crontab job, daily,
makes a login attempt if its last successful login was more than 15
days ago.
Looks like it worked rather recently ... but not working at the moment.
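
Since that crontab entry is a bit dense as a one-liner, here's the same
idea spread out as a sketch: a stamp file records the last successful
login, and a login is only attempted once that stamp is missing or older
than some number of days.  The function names and the ".new" suffix are
invented for this illustration; the find test and ssh options are the
ones from the crontab entry.

```shell
# Succeeds (exit 0) when the stamp file is missing, not a regular file,
# or was last modified more than max_days days ago, per find's -mtime.
login_is_stale() {
    stamp=$1
    max_days=$2
    fresh=$(find "$stamp" -prune -type f ! -mtime +"$max_days" -print 2>/dev/null)
    [ "$fresh" != "$stamp" ]
}

# Hypothetical wrapper: only attempt the login when the stamp is stale,
# and only replace the stamp when the login succeeded.
keepalive() {
    host=$1
    stamp=$2
    max_days=$3
    login_is_stale "$stamp" "$max_days" || return 0
    ssh -a -n -t -t -x -o BatchMode=yes "$host" 'umask 077 && exec who am I' \
        </dev/null >"$stamp.new" 2>&1 &&
        mv -f "$stamp.new" "$stamp"
}
```

A daily cron entry could then call something like
"keepalive m-net.arbornet.org. .m-net.arbornet.org.lastlogin 15" from
the directory holding the stamp file.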

So, another ISP's IPv6 ...
ignore some of the despicable operating system bits below ...
but at least with Cygwin it's almost bearable ... testing from a
work connection (hey, fair's fair - I often use home connections to do
work stuff too, e.g. to test accessibility of various things from The
Internet, etc.)
...

$ IPCONFIG /ALL | fgrep IPv6\ Address | fgrep -v Link-local
    IPv6 Address. . . . . . . . . . . : 2600:1010:b161:5e74:7c7a:2cff:fe94:2fa1(Preferred)
$ dig @2001:470:1f04:19e::2 -t AXFR \
> e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. +norecurse +noall +nocmd \
> +nocomments +answer | grep '^[        ]*[^    ;]'
/usr/src/ports/bind/bind-9.10.2-2.P2.x86_64/src/bind-9.10.2-P2/lib/isc/unix/socket.c:2867: setsockopt(20, IPV6_RECVTCLASS) failed: Invalid argument
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN SOA ns1.balug.org. postmaster.balug.org. 1456011000 10800 3600 1209600 3600
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns0.mpaoli.net.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.balug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.svlug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN NS ns1.linuxmafia.com.
2.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN PTR balug.org.
e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 7200 IN SOA ns1.balug.org. postmaster.balug.org. 1456011000 10800 3600 1209600 3600
$

Okay, enough of that, ... set *that* operating system aside ...
now back to a *real* operating system :-> ...
And to look at ISPs to see/show they're different, at least to the
extent can check with whois(1), we have ...
First the master:

$ 2>&1 whois -H 2001:470:1f04:19e::2 |
> egrep -i '^(netname|nethandle|netrange|cidr|organization)'
NetRange:       2001:470:: - 2001:470:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
CIDR:           2001:470::/32
NetName:        HURRICANE-IPV6
NetHandle:      NET6-2001-470-1
Organization:   Hurricane Electric, Inc. (HURC)
$

Then the first globally routable IPv6 I tried from (expecting same ISP in
whois):

$ 2>&1 whois -H 2001:470:66:76f::2 |
> egrep -i '^(netname|nethandle|netrange|cidr|organization)'
NetRange:       2001:470:: - 2001:470:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
CIDR:           2001:470::/32
NetName:        HURRICANE-IPV6
NetHandle:      NET6-2001-470-1
Organization:   Hurricane Electric, Inc. (HURC)
$

No surprises there - it's also part of same NetRange / CIDR, as we can
see in even the first set of whois data.
And last client I tried, I'm hoping/expecting different ISP (is a
big telecom company after all) ...

$ 2>&1 whois -H 2600:1010:b161:5e74:7c7a:2cff:fe94:2fa1 |
> egrep -i '^(netname|nethandle|netrange|cidr|organization)'
NetRange:       2600:1000:: - 2600:1017:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
CIDR:           2600:1010::/29, 2600:1000::/28
NetName:        WIRELESSDATANETWORK
NetHandle:      NET6-2600-1000-1
Organization:   Cellco Partnership DBA Verizon Wireless (CLLC)
$

And yes, an ISP, that as far as I'm aware, is quite independent of the
other ISP shown for the two other IPv6 addresses.
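
For comparing allocations without eyeballing, a small sketch along these
lines could pull the NetHandle out of saved whois output and compare.
This assumes ARIN-style "NetHandle:" lines as shown above (other
registries label things differently), and both function names are just
invented for this example.

```shell
# Reads ARIN-style whois output on stdin, prints the first NetHandle value.
net_handle() {
    awk -F': *' 'tolower($1) == "nethandle" { print $2; exit }'
}

# Succeeds if two saved whois outputs (file arguments) show the same,
# non-empty NetHandle.
same_allocation() {
    handle_a=$(net_handle <"$1")
    handle_b=$(net_handle <"$2")
    [ -n "$handle_a" ] && [ "$handle_a" = "$handle_b" ]
}
```

With the three whois outputs above saved to files, the master and my
first client would compare as same allocation, and the master and the
cellular client as different.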

So ... recommendations ...
I'm not sure what DNS slave server software is in use,
but for the masters, if the IPv6 addresses are listed after IPv4 (which
may already be the case) and it still complains, my recommendation would
be to comment out or otherwise disable the IPv6 master address(es) until
such time as the slave hosts have some type of routable Internet IPv6
connectivity.
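
To make that concrete, assuming the slave runs BIND 9 (as the quoted
named log lines further down suggest for ns1.linuxmafia.com), the zone
stanza might end up looking like what this little sketch prints.  The
function name and the "slaves/" file path are placeholders I made up;
only the two master addresses come from this thread.

```shell
# Prints a BIND 9 style slave zone stanza with only the IPv4 master active.
# The file path under "slaves/" is a made-up placeholder.
print_slave_zone() {
    zone=$1
    cat <<EOF
zone "$zone" {
    type slave;
    file "slaves/$zone";
    masters {
        198.144.194.238;
        // 2001:470:1f04:19e::2;  // re-enable once this host has routable IPv6
    };
};
EOF
}

print_slave_zone balug.org
print_slave_zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa
```

Commenting the address out rather than deleting it leaves an easy
reminder to put it back once IPv6 connectivity exists.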

$ dig +short ns1.svlug.org. A ns1.svlug.org. AAAA
64.62.190.98
$

I'd guess issue is same for ns1.svlug.org. - I don't see any IPv6
addresses there, but I don't have access to the host, as far as I'm
aware, so am not in position to test more fully from that slave host.
So, if same or quite similar applies to that slave host, then same
recommendation would apply.

I'm also guesstimating that the DNS slave server software likely checks
*all* masters - I'd expect that - probably starts with SOA, and then
grabs "highest" (per the DNS RFCs) SOA serial number zone successfully
found from master it successfully did SOA query on.  It may or may not
have some fallback to try AXFR if SOA fails, or if SOA succeeds but AXFR
fails, there's probably some retry/failover algorithm it uses among the
master(s) configured for that slave for that zone.  So, it probably at
least *tries* all configured masters, and depending upon the software,
and possible configuration and logging levels, it may complain for any
it fails to get the data from, or I'd think at least most minimally, if
it failed with all masters - but I'm guessing it would more likely
complain if it failed with *any* of the masters, as one may generally
want to at least minimally have that logged, for possible
investigation/action (expected event/maintenance, or known outage, or
... some other issue/problem)?
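
On the "highest" serial bit: per RFC 1982, SOA serials are compared with
32-bit serial number arithmetic, so a serial that has wrapped around
still counts as newer.  A minimal sketch of that comparison follows; the
function name is my own invention, and it assumes the shell's
arithmetic is wider than 32 bits (true on typical 64-bit systems).

```shell
# RFC 1982 comparison for 32-bit SOA serials.
# Succeeds if serial $1 is "greater than" serial $2, allowing wraparound:
#   ($1 < $2 and $2 - $1 > 2^31) or ($1 > $2 and $1 - $2 < 2^31)
# Assumes shell arithmetic wider than 32 bits.
serial_gt() {
    [ $(( ($1 < $2 && $2 - $1 > 2147483648) || ($1 > $2 && $1 - $2 < 2147483648) )) -eq 1 ]
}
```

E.g. "serial_gt 1456011000 1456010999" succeeds, and so does
"serial_gt 1 4294967295", since 1 is what comes a couple of steps after
4294967295 once the counter wraps.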

Miscellaneous comments :-) :
IPv6 - random John and Jane Doe consumer might not yet know or care (and
may never know, or particularly care), but The Internet is (slowly)
going the way of IPv6.  IPv4 won't disappear anytime soon.  However,
with the exhaustion of IPv4 addresses, increasingly clients are not only
dual-stack (or have some funky way to still be able to access IPv4),
but increasingly there will be clients that much prefer IPv6 and/or
do or will only have IPv6 access.  IPv6 adoption has also been growing
at a relatively fast rate.  If I recall correctly, last year IPv6 growth
was up something like 10%.  I remember even about 4 years ago, looking
at DNS traffic of a major provider, AAAA queries were up to nearly 2% of
volume of queries, relative to A+AAAA - that was up about 300% compared
to only about a year or two earlier.  Not sure about similar current stats,
but in general, any such clients that are IPv6 only, those would
typically be lost traffic/customers, or they may be stuck with some less
preferred means of IPv4 access.  We're well past the time where at least
most all major providers should also be offering IPv6 connectivity (e.g.
web sites, etc.).  The big players in ISPs space all (or nearly all?) do
IPv6, some of the smaller ISPs are still getting there (and even some of
the colos aren't fully there yet).  But in any case, IPv6 is coming,
and one should prepare and get ready for it (but not an extreme rush for
everyone - if you're a domestic-only web site with only domestic users,
may not be much of a rush to use IPv6 ... but dang, it certainly is also
nice to not be squeezed with very limited number of IPv4 addresses to
work with).

Oh, if one doesn't have IPv6 available from one's ISP, there are IPv6
tunnel brokers, so one can use IPv6 over IPv4 - and thus reach the IPv6
Internet.  E.g. Hurricane Electric provides such - for free (at least as
in beer).  They also have a lot of excellent IPv6 training materials
(though a bit dated now, it's still mostly quite applicable).
Anyway, I do have such IPv6 tunnels from Hurricane Electric (alas, my
home ISP isn't yet offering native IPv6 ... I did also notice my work
cellular didn't have IPv6 as of a few months or so ago, but a month or
two after that, it had added IPv6).  Anyway, have an IPv6 tunnel for myself.
Also have one for BALUG (I did a separate account for BALUG, so it's
quite independent of my personal account).
From Hurricane Electric:
https://ipv6.he.net/certification/
https://tunnelbroker.net/
They also offer some free DNS services ... but they may be slightly
funky in how they behave, and also somewhat limited, in at least some
regards:
https://dns.he.net/
But those do also include IPv6 nameservers.

Also, in the ssh examples above, I didn't bother to show use of
ssh-add - generally used in advance to decrypt private key(s) and
hand them to ssh-agent, which then temporarily (for specified length of
time) holds the decrypted private key(s) in RAM - best to never have the
keys in the clear written to non-volatile storage (e.g. disk or SSD or
similar).  The ssh-add
program also takes care to not let private keys be written to swap - but
I also have my swap encrypted, and likewise most of my drive storage.

And in the above regular expressions where there's character class
([...]) showing whitespace within, the whitespace is generally a space
and a tab (my email client doesn't conveniently allow me to literally
pass that along - and even if I did, clients displaying such may not
preserve such in display, and in any case it wouldn't generally be
visually obvious).  Some RE contents allow \t for tab, but those
I showed above generally don't support that.
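
One way around the invisible-tab problem, as a small sketch: have printf
produce the tab and interpolate it into the pattern, so nothing depends
on whitespace surviving mail or copy/paste.  The variable and function
names here are just for illustration.

```shell
# Build the character class with a real tab via printf, so the pattern
# doesn't rely on a literal tab surviving mail clients or copy/paste.
tab=$(printf '\t')

# Keeps only lines that have a first non-blank character and where that
# character is not ';' (i.e. drops blank lines and ';' comment lines).
drop_blank_and_comment_lines() {
    grep "^[ $tab]*[^ $tab;]"
}
```

That function could stand in for the grep at the end of the dig
pipelines above.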

And retrying earlier test:
$ ssh -ax m-net.arbornet.org. hostname
m-net.arbornet.org
$

That works now, so trying further with that ...

$ ssh -ax m-net.arbornet.org. 'hostname &&
> PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:$PATH" \
> && { if >>/dev/null 2>&1 type ip; then ip -6 a s; ip -6 r s
> else ifconfig -a; netstat -nr; fi; }'
m-net.arbornet.org
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
         ether 00:50:56:8b:48:cf
         inet 162.202.67.157 netmask 0xffffffe0 broadcast 162.202.67.159
         inet6 fe80::250:56ff:fe8b:48cf%em0 prefixlen 64 scopeid 0x1
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active
em1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
         ether 00:0c:29:34:2e:d3
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect
         status: no carrier
plip0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> metric 0 mtu 1500
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
         options=3<RXCSUM,TXCSUM>
         inet6 ::1 prefixlen 128
         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
         inet 127.0.0.1 netmask 0xff000000
         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536
         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            162.202.67.129     UGS         0     4573    em0
127.0.0.1          link#4             UH          0    47170    lo0
162.202.67.128/27  link#1             U           0      127    em0
162.202.67.157     link#1             UHS         0        0    lo0

Internet6:
Destination                       Gateway                       Flags      Netif Expire
::/96                             ::1                           UGRS        lo0
::1                               ::1                           UH          lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%em0/64                     link#1                        U           em0
fe80::250:56ff:fe8b:48cf%em0      link#1                        UHS         lo0
fe80::%lo0/64                     link#4                        U           lo0
fe80::1%lo0                       link#4                        UHS         lo0
ff01::%em0/32                     fe80::250:56ff:fe8b:48cf%em0  U           em0
ff01::%lo0/32                     ::1                           U           lo0
ff02::/16                         ::1                           UGRS        lo0
ff02::%em0/32                     fe80::250:56ff:fe8b:48cf%em0  U           em0
ff02::%lo0/32                     ::1                           U           lo0
$

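I don't think I see anything there that's globally routable IPv6.
The ff01::... and ff02::... entries are multicast (ff00::/8): ff01:: is
interface-local scope and ff02:: is link-local scope, so neither of
those reaches beyond the local link either (closest I'd found when
first trying to look those up was: ff0x::114 Used for experiments).
Well, let's try wee bit: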

$ ssh -ax m-net.arbornet.org. 'hostname &&
> PATH="/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:$PATH" \
> && >>/dev/null 2>&1 type ping6 && ping6 -n -c 5 2001:470:1f04:19e::2'
m-net.arbornet.org
ping6: UDP connect: No route to host
$

Yep, it doesn't have a route to get there, so can't test further from
there.

>> From: "Rick Moen" <rick@linuxmafia.com>
>> Subject: Re: [BALUG-Admin] DNS slaves for BALUG?  :-)
>> Date: Sat, 20 Feb 2016 18:33:09 -0800
>
>> I wrote:
>>
>>> Quoting Michael Paoli (Michael.Paoli@cal.berkeley.edu):
>>>
>>>> If you could please, and would be willing,
>>>> could you cover DNS slave services for BALUG,
>>>> notably these zones:
>>>> e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa
>>>> balug.org
>>>> master(s) (all for each of the above):
>>>> 198.144.194.238
>>>> 2001:470:1f04:19e::2
>>>
>>> FYI, ns1.linuxmafia.com is repeatedly logging this:
>>>
>>> Feb 20 17:50:45 linuxmafia named[17291]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>>>
>>> I'm unsure why the AXFR/IXFR to the IPv6 address fails, and currently
>>> lack the time and patience to pursue that matter.  For now, I'm going to
>>> disable that master IP on my BIND9 conffile.
>>
>> Before I do that, Michael, would you like to do some checking on the
>> _master_ nameserver end?  The most probable explanation is some deficiency
>> in IPv6 support on ns1.linuxmafia.com, and that would be my default
>> assumption.
>>
>> FYI, both zones are affected:
>>
>> linuxmafia:/etc/bind# zgrep 'failure trying master' /var/log/*
>> /var/log/daemon.log:Feb 20 12:55:59 linuxmafia named[17291]: zone balug.org/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> /var/log/daemon.log:Feb 20 13:39:16 linuxmafia named[17291]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> /var/log/daemon.log:Feb 20 17:50:45 linuxmafia named[17291]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> /var/log/syslog:Feb 20 12:55:59 linuxmafia named[17291]: zone balug.org/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> /var/log/syslog:Feb 20 13:39:16 linuxmafia named[17291]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> /var/log/syslog:Feb 20 17:50:45 linuxmafia named[17291]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: failure trying master 2001:470:1f04:19e::2#53 (source ::#0): operation canceled
>> linuxmafia:/etc/bind#