----- Forwarded message from logcheck system account logcheck@linuxmafia.com -----
Date: Thu, 25 Aug 2016 00:02:02 -0700
From: logcheck system account logcheck@linuxmafia.com
To: root@linuxmafia.com
Subject: linuxmafia.com 2016-08-25 00:02 System Events

System Events
=-=-=-=-=-=-=
Aug 24 23:12:07 linuxmafia named[1064]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:12:07 linuxmafia named[1064]: zone balug.org/IN: Transfer started.
Aug 24 23:12:28 linuxmafia named[1064]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 24 23:12:28 linuxmafia named[1064]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 24 23:14:03 linuxmafia named[1064]: zone sf-lug.com/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:16:12 linuxmafia named[1064]: zone sf-lug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:39:12 linuxmafia named[1064]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:39:12 linuxmafia named[1064]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: Transfer started.
Aug 24 23:39:33 linuxmafia named[1064]: transfer of 'e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 24 23:39:33 linuxmafia named[1064]: transfer of 'e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 24 23:39:41 linuxmafia named[1064]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
----- End forwarded message -----
Aw crud ... thanks for catching that, well, at least caught the diagnostic this time:

[217241.840094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[217241.840094] Workqueue: scsi_tmf_0 scmd_eh_abort_handler [scsi_mod]
[217241.840094] task: ffff88001f016210 ti: ffff88001f074000 task.ti: ffff88001f074000
[217241.840094] RIP: 0010:[<ffffffffa009d282>] [<ffffffffa009d282>] sym_interrupt+0xb92/0x1d20 [sym53c8xx]
[217241.840094] RSP: 0018:ffff88001fc03e70 EFLAGS: 00010046
[217241.840094] RAX: 0000000000000001 RBX: ffff88001f3c7000 RCX: ffff88001f3c7130
[217241.840094] RDX: 0000000000000000 RSI: 00febf4b70000000 RDI: ffffc90000002006
[217241.840094] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88001d800008
[217241.840094] R10: 0000000000000000 R11: ffff88001f0779ce R12: 0000000000000007
[217241.840094] R13: ffff88001f3c7090 R14: ffff88001f370000 R15: 0000000000000010
[217241.840094] FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[217241.840094] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[217241.840094] CR2: 00007f2c1c14e280 CR3: 000000001c502000 CR4: 00000000000006f0
[217241.840094] Stack:
[217241.840094] 0000000000000000 ffff88001fc03e78 ffff880000000000 ffff88001df31000
[217241.840094] 0000000000000000 ffff88001fc0d1a0 ffff88001f077b08 0000c5948e786d92
[217241.840094] ffff88001f370000 000000000000000a 000000000000003a 0000000000000000
[217241.840094] Call Trace:
[217241.840094] <IRQ>
[217241.840094] [<ffffffffa0097a6a>] ? sym53c8xx_intr+0x3a/0x80 [sym53c8xx]
[217241.840094] [<ffffffff810bb2b5>] ? handle_irq_event_percpu+0x35/0x190
[217241.840094] [<ffffffff810bb448>] ? handle_irq_event+0x38/0x60
[217241.840094] [<ffffffff810be683>] ? handle_fasteoi_irq+0x83/0x150
[217241.840094] [<ffffffff810150fd>] ? handle_irq+0x1d/0x30
[217241.840094] [<ffffffff81517019>] ? do_IRQ+0x49/0xe0
[217241.840094] [<ffffffff81514e6d>] ? common_interrupt+0x6d/0x6d
[217241.840094] <EOI>
[217241.840094] [<ffffffffa0098781>] ? sym_eh_handler+0x1f1/0x350 [sym53c8xx]
[217241.840094] [<ffffffffa006d91f>] ? scmd_eh_abort_handler+0xbf/0x490 [scsi_mod]
[217241.840094] [<ffffffff81081742>] ? process_one_work+0x172/0x420
[217241.840094] [<ffffffff81081dd3>] ? worker_thread+0x113/0x4f0
[217241.840094] [<ffffffff81081cc0>] ? rescuer_thread+0x2d0/0x2d0
[217241.840094] [<ffffffff8108800d>] ? kthread+0xbd/0xe0
[217241.840094] [<ffffffff81087f50>] ? kthread_create_on_node+0x180/0x180
[217241.840094] [<ffffffff81514158>] ? ret_from_fork+0x58/0x90
[217241.840094] [<ffffffff81087f50>] ? kthread_create_on_node+0x180/0x180
[217241.840094] Code: 31 c0 48 8b 72 20 48 85 f6 74 06 80 7e 34 00 75 2a 48 8b 72 28 48 85 f6 74 3f 31 d2 eb 0d 48 83 c2 08 48 81 fa f8 01 00 00 74 2e <48> 8b 7c 16 08 48 85 ff 74 e9 80 7f 34 00 74 e3 48 63 d0 48 8d
[217241.840094] RIP [<ffffffffa009d282>] sym_interrupt+0xb92/0x1d20 [sym53c8xx]
[217241.840094] RSP <ffff88001fc03e70>
[217241.840094] ---[ end trace e975719a8c8b2a5e ]---
[217241.840094] Kernel panic - not syncing: Fatal exception in interrupt
[217241.840094] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[217241.840094] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Anyway, another kick in the reset, and should be "fine" again now.
From: "Rick Moen" rick@linuxmafia.com
Subject: Master nameserver offline again?
Date: Thu, 25 Aug 2016 00:05:40 -0700
----- Forwarded message from logcheck system account logcheck@linuxmafia.com -----
Date: Thu, 25 Aug 2016 00:02:02 -0700
From: logcheck system account logcheck@linuxmafia.com
To: root@linuxmafia.com
Subject: linuxmafia.com 2016-08-25 00:02 System Events

System Events
Aug 24 23:12:07 linuxmafia named[1064]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:12:07 linuxmafia named[1064]: zone balug.org/IN: Transfer started.
Aug 24 23:12:28 linuxmafia named[1064]: transfer of 'balug.org/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 24 23:12:28 linuxmafia named[1064]: transfer of 'balug.org/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 24 23:14:03 linuxmafia named[1064]: zone sf-lug.com/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:16:12 linuxmafia named[1064]: zone sf-lug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:39:12 linuxmafia named[1064]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
Aug 24 23:39:12 linuxmafia named[1064]: zone e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN: Transfer started.
Aug 24 23:39:33 linuxmafia named[1064]: transfer of 'e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN' from 198.144.194.238#53: failed to connect: timed out
Aug 24 23:39:33 linuxmafia named[1064]: transfer of 'e.9.1.0.5.0.f.1.0.7.4.0.1.0.0.2.ip6.arpa/IN' from 198.144.194.238#53: Transfer completed: 0 messages, 0 records, 0 bytes, 21.000 secs (0 bytes/sec)
Aug 24 23:39:41 linuxmafia named[1064]: zone balug.org/IN: refresh: retry limit for master 198.144.194.238#53 exceeded (source 0.0.0.0#0)
----- End forwarded message -----
Oops ... that one was actually my booboo. And one earlier crash/hang. <sigh> Forgot an important relevant detail earlier.
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: and up again: Re: Master nameserver offline again?
Date: Thu, 25 Aug 2016 05:11:33 -0700

Aw crud ... thanks for catching that, well, at least caught the diagnostic this time:

[217241.840094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[217241.840094] Workqueue: scsi_tmf_0 scmd_eh_abort_handler [scsi_mod]
[217241.840094] task: ffff88001f016210 ti: ffff88001f074000 task.ti: ffff88001f074000
[217241.840094] RIP: 0010:[<ffffffffa009d282>] [<ffffffffa009d282>] sym_interrupt+0xb92/0x1d20 [sym53c8xx]
[217241.840094] RSP: 0018:ffff88001fc03e70 EFLAGS: 00010046
[217241.840094] RAX: 0000000000000001 RBX: ffff88001f3c7000 RCX: ffff88001f3c7130
[217241.840094] RDX: 0000000000000000 RSI: 00febf4b70000000 RDI: ffffc90000002006
[217241.840094] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88001d800008
[217241.840094] R10: 0000000000000000 R11: ffff88001f0779ce R12: 0000000000000007
[217241.840094] R13: ffff88001f3c7090 R14: ffff88001f370000 R15: 0000000000000010
[217241.840094] FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[217241.840094] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[217241.840094] CR2: 00007f2c1c14e280 CR3: 000000001c502000 CR4: 00000000000006f0
[217241.840094] Stack:
[217241.840094] 0000000000000000 ffff88001fc03e78 ffff880000000000 ffff88001df31000
[217241.840094] 0000000000000000 ffff88001fc0d1a0 ffff88001f077b08 0000c5948e786d92
[217241.840094] ffff88001f370000 000000000000000a 000000000000003a 0000000000000000
[217241.840094] Call Trace:
[217241.840094] <IRQ>
[217241.840094] [<ffffffffa0097a6a>] ? sym53c8xx_intr+0x3a/0x80 [sym53c8xx]
[217241.840094] [<ffffffff810bb2b5>] ? handle_irq_event_percpu+0x35/0x190
[217241.840094] [<ffffffff810bb448>] ? handle_irq_event+0x38/0x60
[217241.840094] [<ffffffff810be683>] ? handle_fasteoi_irq+0x83/0x150
[217241.840094] [<ffffffff810150fd>] ? handle_irq+0x1d/0x30
[217241.840094] [<ffffffff81517019>] ? do_IRQ+0x49/0xe0
[217241.840094] [<ffffffff81514e6d>] ? common_interrupt+0x6d/0x6d
[217241.840094] <EOI>
[217241.840094] [<ffffffffa0098781>] ? sym_eh_handler+0x1f1/0x350 [sym53c8xx]
[217241.840094] [<ffffffffa006d91f>] ? scmd_eh_abort_handler+0xbf/0x490 [scsi_mod]
[217241.840094] [<ffffffff81081742>] ? process_one_work+0x172/0x420
[217241.840094] [<ffffffff81081dd3>] ? worker_thread+0x113/0x4f0
[217241.840094] [<ffffffff81081cc0>] ? rescuer_thread+0x2d0/0x2d0
[217241.840094] [<ffffffff8108800d>] ? kthread+0xbd/0xe0
[217241.840094] [<ffffffff81087f50>] ? kthread_create_on_node+0x180/0x180
[217241.840094] [<ffffffff81514158>] ? ret_from_fork+0x58/0x90
[217241.840094] [<ffffffff81087f50>] ? kthread_create_on_node+0x180/0x180
[217241.840094] Code: 31 c0 48 8b 72 20 48 85 f6 74 06 80 7e 34 00 75 2a 48 8b 72 28 48 85 f6 74 3f 31 d2 eb 0d 48 83 c2 08 48 81 fa f8 01 00 00 74 2e <48> 8b 7c 16 08 48 85 ff 74 e9 80 7f 34 00 74 e3 48 63 d0 48 8d
[217241.840094] RIP [<ffffffffa009d282>] sym_interrupt+0xb92/0x1d20 [sym53c8xx]
[217241.840094] RSP <ffff88001fc03e70>
[217241.840094] ---[ end trace e975719a8c8b2a5e ]---
[217241.840094] Kernel panic - not syncing: Fatal exception in interrupt
[217241.840094] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[217241.840094] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Anyway, another kick in the reset, and should be "fine" again now.
Quoting Michael Paoli (Michael.Paoli@cal.berkeley.edu):
Oops ... that one was actually my booboo. And one earlier crash/hang. <sigh> Forgot an important relevant detail earlier.
I cannot tell from that dmesg log what the likely root cause of the kernel panic is, except it looked like a depressingly non-specific SCSI-layer error. Which unfortunately leaves open a lot of possibilities. A classic media-fault-caused error stream, in my experience, reports 'Sense Key Error' language, at least on most such occasions(?). This one is more vague than that.
So, sux0rs: Could be any of a variety of root causes, but my hunch is that it's hardware, anyway. Good luck!
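As a rough first pass at the distinction drawn above, one could scan the kernel log for 'Sense Key' lines, which tend to accompany media faults. This is only a minimal sketch: the function name count_sense_key_lines is made up for illustration, and a count of zero does not rule out a hardware cause.

```shell
# Count lines on stdin that mention a SCSI sense key (case-insensitive).
# Hypothetical helper, not part of any tool mentioned in this thread.
count_sense_key_lines() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps callers using "set -e" alive.
    grep -ci 'sense key' || true
}

# Possible usage (reading the kernel log may need root on some systems):
#   dmesg | count_sense_key_lines
```

A non-zero count would point toward the media-fault pattern; zero, as apparently in this panic, leaves the question open.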
Reminder: I'm still stuck in 'WTF?' mode about this, and looking for a comment or corrective action:
$ whois balug.org | grep '^Name Server'
Name Server: NS1.DREAMHOST.COM
Name Server: NS2.DREAMHOST.COM
Name Server: NS3.DREAMHOST.COM
$
Just re-confirmed that as being still true: Domain balug.org declares _only_ the three Dreamhost corporate nameservers authoritative for itself. So, what is the point of ns1.linuxmafia.com also doing slave nameservice, given that ns1.linuxmafia.com will never receive any delegation, hence no query traffic?
Suggest either amending the authoritative roster to add all secondaries not yet listed, or thanking those non-listed secondaries for their offer to help and saying it's not needed at this time.
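The roster check above can be sketched as a small shell helper that reads whois output and reports whether a given host is among the listed name servers. The function name is_listed_nameserver is hypothetical, and the sketch assumes the registry prints 'Name Server: HOST' lines as in the output shown above; other registries may format this differently.

```shell
# Succeed (exit 0) if host $1 appears in the "Name Server:" lines on stdin.
# Hypothetical helper; assumes "Name Server: HOST" formatting.
is_listed_nameserver() {
    # Lower-case the wanted host so the comparison ignores case.
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # Drop any carriage returns, keep only the roster lines, then
    # compare the third field (the host name) as a whole line.
    tr -d '\r' \
        | grep -i '^Name Server:' \
        | awk '{ print tolower($3) }' \
        | grep -qxF -- "$want"
}

# Possible usage (needs network access and a whois client):
#   whois balug.org | is_listed_nameserver ns1.linuxmafia.com \
#       && echo "listed" || echo "not listed"
```

Note that this only inspects the registrar-side roster; it says nothing about which servers actually hold a copy of the zone.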