[BALUG-Admin] High(er) availability for BALUG ancillary sites

Michael Paoli Michael.Paoli@cal.berkeley.edu
Thu Nov 26 11:57:11 PST 2015

So, ... now also high(er) availability for the BALUG ancillary sites.
Also upgraded "vicki" to Debian GNU/Linux 8.2 (jessie) some days or so
earlier ... but note however that "vicki" is the physical host that
now only *sometimes* has the balug VM running atop it (much/most
of the time that balug VM is running atop my personal laptop).

Nevertheless (and because of that - and live migration!) ... high(er)  

----- Forwarded message from Michael.Paoli@cal.berkeley.edu -----
     Date: Thu, 26 Nov 2015 11:30:47 -0800
     From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
  Subject: High(er) availability for SF-LUG site(s) (& some BALUG stuff too).
       To: SF-LUG <sf-lug@linuxmafia.com>

High(er) availability for SF-LUG site(s) (& some BALUG stuff too).

This covers the web sites & DNS master (list is hosted separately by
Rick Moen), and this also covers fair bit of BALUG stuff too (everything
*except* BALUG's [www.]balug.org and list stuff - e.g. wiki, archives,
test/beta/staging sites, etc.)

Anyway, more-or-less per earlier plan, I did get as far yesterday as
doing the first *live* migration of that host.  And also *without*
shared storage (which does also work perfectly fine - it just takes a
wee bit longer for the actual overall migration, but when the actual
final switch itself happens for the guest VM, it's still exceedingly
fast - on the order of 10s of milliseconds or so? - I haven't actually
timed that final bit quite yet).  So, ... that VM host is no longer
"stuck" just on my personal laptop :-) ... which means it can very much
remain up and Internet accessible - even when my laptop isn't (or, e.g.
travels away from home).

$ wakeonlan 00:30:48:91:97:90
Not 100% anticipated, but not a huge surprise, and easy enough to
address - some of the last kinks to be worked out were in allowing the
live migration to be successful.  CPU type and flags/capability:
error: unsupported configuration: guest and host CPU are not compatible:
Host CPU does not provide required features: popcnt, sse4.2, sse4.1
Ah, so laptop CPU wee bit more modern than that on "vicki" - and by
more-or-less default, guest CPU was configured to take advantage of
many/most of those host CPU capabilities.  Easy enough to deal with that
- bring the VM guest down, reconfigure the virtual CPU to disable those
3 capabilities, bring VM guest back up again, and repeat the attempt -
made it fine past that error.  Next glitch was a bit more puzzling:
error: internal error: unable to execute QEMU command 'migrate': State
blocked by non-migratable device '0000:00:05.0/ich9_ahci'
Wee bit o' search and ... QEMU can't live migrate SATA (at least not
yet safely in the version I'm using, and at least by default migration
of such is disabled for safety reasons).  Bring VM guest down, take
virtual hard drive off of SATA, turn it into SCSI and attach it to SCSI
... and
... same error?  Checked configuration again - nothing attached to
(virtual) SATA bus/controller, but the SATA bus/controller still there,
... next step, remove those, and repeat ... and ... success, all went
fine, no errors:
# virsh migrate --live --persistent --copy-storage-all --verbose balug \
> qemu+ssh:// && virsh autostart --disable balug
... and all went fine and dandy.  And then, live migrating back:
# virsh migrate --live --undefinesource --copy-storage-all --verbose \
> balug qemu+ssh://tigger.mpaoli.net./system
And that went perfectly fine too, not so much as a glitch to notice on
the guest VM itself (though the storage replication took a while, so
it's not a speedy move from perspective of the physical hosts) ... TCP
connections between guest and Internet, etc., all maintained perfectly
fine across the live migrations.
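
A minimal sketch of a pre-flight check for the CPU mismatch described
above, assuming x86 Linux hosts whose /proc/cpuinfo has a "flags" line.
The function names (cpu_flags, missing_flags) are made up for
illustration and are not from the commands actually run here.  Note that
/proc/cpuinfo spells the features sse4_1 and sse4_2, where the libvirt
error message says sse4.1 and sse4.2.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the post).

# cpu_flags [FILE]: print the first "flags" line of a cpuinfo-style file
# (default /proc/cpuinfo), without the "flags :" prefix.
cpu_flags() {
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' "${1:-/proc/cpuinfo}" |
        head -n 1
}

# missing_flags SRC_FLAGS DST_FLAGS: print, sorted one per line, every
# flag in SRC_FLAGS that does not appear in DST_FLAGS.
missing_flags() {
    dst=" $(printf '%s ' $2)"      # normalize whitespace: " a b c "
    for flag in $1; do
        case "$dst" in
            *" $flag "*) ;;                 # destination has it too
            *) printf '%s\n' "$flag" ;;     # source-only feature
        esac
    done | sort -u
}

# Example with made-up flag lists (not real host data):
missing_flags "fpu sse2 sse4_1 sse4_2 popcnt" "fpu sse2"
```

To compare real hosts, one would feed it the output of cpu_flags from
each side (e.g. fetching the other host's /proc/cpuinfo over ssh); any
flag it prints is a candidate for disabling in the guest's virtual CPU
before attempting migration to the host that lacks it.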

Wee bit more stuff to do / work on ... e.g. (at least theoretically),
o Turn it into a (nearly) push-button operation (run one relatively
   simple script - or pair of scripts - partially drafted, but yet to
   polish those off.).
o Investigate/test --copy-storage-inc - if suitable and safe, that may
   significantly speed the disk data copy portion of the migration (some
   of the storage I have set up is highly optimized for physical storage
   space reduction, but consequently has very low write performance
   characteristics - which is mostly quite fine, but slows migration
   especially back to laptop; read performance, however, is more than
   sufficient).  E.g. on physical host (laptop) we have:
   # ls -hnos balug-sda
   4.8G -rw------- 1 114 16G Nov 26 18:41 balug-sda
   Quite efficient (deduplication + compression) space utilization - but
   at cost of write performance (and some CPU burn, particularly on
   heavy writing, and some more suck of RAM too) - but that happens to
   be the trade-off I want the majority of the time for that storage -
   so that's highly acceptable (laptop SSD is "only" about 150 GiB ...
   and I've a whole lot of other stuff on it too - I'm fine with LUG VM
   taking ~5 GiB of physical storage ... but not gonna give it 16 GiB!).
   Where it resides, it also does deduplication across some ISOs that
   closely correspond to the installed operating system (and also other
   data), so that also aids in reducing total physical storage space.
o I'll also carefully review, and likely adjust/tweak other bits of the
   migration options and handling of the VMs after migration - most
   notably bits regarding undefine or not, and autostart or not - and
   where.  And of course test it all out more fully.  :-)
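
As a rough illustration of the "push-button" item, here is a sketch of
a wrapper around the wakeonlan and virsh migrate commands shown earlier.
It is not the partially drafted script mentioned above.  The function
names (run, wake_host, migrate_vm), the DRY_RUN switch, and the
assumption that the destination is reachable as qemu+ssh://HOST/system
are all hypothetical; the virsh flags are only the ones already used in
the commands above.  It does not wait for a woken host to finish
booting, and it leaves out --copy-storage-inc since that is still to be
tested.

```shell
#!/bin/sh
# Hypothetical wrapper; by default it only prints what it would run.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARG...]: print the command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    [ "$DRY_RUN" = 0 ] || return 0
    "$@"
}

# wake_host MAC: send a Wake-on-LAN packet to the given MAC address.
wake_host() {
    run wakeonlan "$1"
}

# migrate_vm DOMAIN DEST_HOST [EXTRA_FLAG...]: live migrate DOMAIN,
# copying its storage, to DEST_HOST.  Extra flags such as --persistent
# or --undefinesource are passed straight through to virsh migrate.
migrate_vm() {
    domain=$1 dest=$2
    shift 2
    run virsh migrate --live --copy-storage-all --verbose "$@" \
        "$domain" "qemu+ssh://$dest/system"
}
```

With the default DRY_RUN=1, a call such as
"migrate_vm balug DEST_HOST --persistent" (DEST_HOST being a
placeholder) just prints the virsh command line, which makes it easy to
review the undefine/autostart choices before running with DRY_RUN=0.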

> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
> Subject: Re: sf-lug site & hardware
> Date: Tue, 24 Nov 2015 06:32:02 -0800

> Just an FYI update.
> So, my (overly optimistic) theoretical timeline - was hoping to have
> the sf-lug site relocated onto the higher availability hardware
> (notably not on VM on my laptop) by around 2015-11-15 or so.  Have
> adjusted the target timeline a bit, after some considerations (and also
> being relatively busy with other stuff too).  Anyway, one thing I
> didn't fully take into account earlier - fan noise.  That system that
> was in the colo - 1U unit, is comparatively noisy (I've gotten a bit
> spoiled mostly not listening to fan noise of such volume - even though
> it uses a fan and airflow design that mostly avoids tiny 1U high-RPM
> fan(s) - it's still noisier than most typical desktop systems - but
> less noisy than many typical 1U servers).  So, ... I adjusted my
> (theoretical) plans a bit.  With wakeonlan, qemu-kvm live migration,
> and wee bit of infrastructure (which I was mostly planning to do
> anyway), and small bit of scripting, I could arrange to have the VM
> running on the noisier (but higher availability) hardware, mostly only
> when it wouldn't be running on my laptop at home.  And with live
> migration, the migration would be effectively "invisible" to the guest
> VM host itself, its state, connections to it and sessions on it, etc.
> Anyway, fair bit closer to having that plan fully implemented.  Current
> target timeline for completion, by 2015-11-29, or at least not later
> than 2015-12-13.  May be fair bit sooner.  I'll update once it's in
> place and fully operational (did get a fair chunk of related
> infrastructure completed yesterday and today).
> references/excerpts:
> https://en.wikipedia.org/wiki/Wake-on-LAN
> https://en.wikipedia.org/wiki/Live_migration
>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>> Subject: Re: sf-lug site & hardware
>> Date: Thu, 12 Nov 2015 14:01:41 -0800
>> FYI, this morning Jim Stockford and I did retrieve the physical server
>> host from the colo, upon which, up until some months back, the sf-lug
>> web site was running.  So, that improves the hardware resource
>> situation.  I'm guesstimating I'll have the sf-lug website again running
>> on VM atop this hardware by sometime this weekend or so - that should
>> improve the availability a fair bit (notably the sf-lug website then
>> won't go down when my personal laptop goes down, offline, or out the
>> door from home).
>> Thanks Jim!
>>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>>> Subject: Re: Have you guys thought about http://www.freelists.org/  
>>> (hosted ...)
>>> Date: Wed, 11 Nov 2015 18:26:25 -0800
>>> would be down or that it wasn't (relatively) high availability (at least
>>> compared to virtual machine running on my personal laptop - which does
>>> have the sf-lug site go out when my laptop goes out ... hopefully that
>>> situation will be improved in near future ... waiting on some resources
>>> to be able to do that.)
>>> references/excerpts:
>>> http://linuxmafia.com/pipermail/sf-lug/2015q4/011454.html
>>> http://linuxmafia.com/pipermail/sf-lug/2015q4/011441.html
>>>> From: Shane Tzen <shane@faultymonk.org>
>>>> Date: Wed, 11 Nov 2015 15:56:14 -0800
>>>> Subject: Re: [sf-lug] updated/upgraded: SF-LUG - operating system  
>>>> presently hosting
>>>> To: Michael Paoli <Michael.Paoli@cal.berkeley.edu>
>>>> Cc: SF-LUG <sf-lug@linuxmafia.com>
>>>> Have you guys thought about http://www.freelists.org/about.html ?
>>>> Looks like various LUGs are hosted -
>>>> http://www.freelists.org/cat/Linux_and_UNIX
>>>> On Fri, Oct 30, 2015 at 3:41 AM, Michael Paoli <
>>>> Michael.Paoli@cal.berkeley.edu> wrote:
>>>>> It's been updated/upgraded:
>>>>> from: Debian GNU/Linux 7.9 (wheezy)
>>>>> to: Debian GNU/Linux 8.2 (jessie)
>>>>> http://lists.balug.org/pipermail/balug-admin-balug.org/2015-October/002989.html
>>>>> Still definitely *not* high availability though (alas, still sits atop
>>>>> a virtual machine on my *laptop*!).
>>>>> Hopefully in not too horribly distant future (like *real soon*), the
>>>>> physical box the site was earlier running upon will be successfully
>>>>> retrieved - once that happens, some high(er) availability options
>>>>> become possible.
>>>>> Let me know if you notice anything awry (notwithstanding the less than
>>>>> high availability).
>>>>>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>>>>>> Subject: It's alive*!: Re: SF-LUG - DNS, web site, ..., etc.
>>>>>> Date: Mon, 24 Aug 2015 03:10:26 -0700
>>>>> Anyway, have taken the liberty ...
>>>>>> it's alive* ...
>>>>>> the [www.]sf-lug.{org,com}
>>>>>> websites are available again.

----- End forwarded message -----

> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
> Subject: BALUG ancillary sites UPgraded
> Date: Fri, 30 Oct 2015 03:33:52 -0700

> Upgraded (updated) from: Debian GNU/Linux 7.9 (wheezy)
> to: Debian GNU/Linux 8.2 (jessie)
> If you notice anything awry, let me know.
> I did notice the wiki pages do have a rather different look & feel.
>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>> Subject: BALUG ancillary sites UP: Re: Of relevance for BALUG  
>> ancillary sites
>> Date: Sat, 22 Aug 2015 03:00:16 -0700
>> These BALUG sites are UP again.
>> Do note, however, that the new location is *not* high availability,
>> nor high bandwidth.
>> Specifically, at present, the VM hosting those sites has been moved onto
>> a laptop (8-O !) ... which does occasionally get out and about (and when
>> it does, those BALUG sites will be temporarily down again), and it's
>> sitting at the end of a DSL line - which also shares bandwidth with
>> other hosts.
>> I'll likely be doing some things in the coming weeks or more to improve
>> the availability (notably move off the present laptop where the VM is
>> currently), and may at some future point also have improved bandwidth -
>> but for now, is what it is.
>> Let me know if anyone notices anything particularly amiss (notwithstanding
>> the inherent limitations and availability limitations noted above).
>>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>>> Subject: BALUG ancillary sites DOWN: Re: Of relevance for BALUG  
>>> ancillary sites
>>> Date: Thu, 20 Aug 2015 09:05:43 -0700
>>> The BALUG ancillary sites were still up yesterday morning, but they're
>>> DOWN now.  I'm presuming power/network has been removed from the "vicki"
>>> physical host at the colo site.  I expect there will be more about this
>>> on the SF-LUG list *real soon now*.
>>> http://linuxmafia.com/mailman/listinfo/sf-lug
>>> I do have recent backups - having reasonably anticipated this scenario,
>>> I have backups from this as recent as yesterday morning.
>>>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>>>> Subject: Of relevance for BALUG ancillary sites
>>>> Date: Sat, 25 Jul 2015 01:30:48 -0700
>>>> Just a bit of background.  Some BALUG sites may be impacted soon,
>>>> but *not* www.balug.org or balug.org itself.
>>>> See:
>>>> http://lists.balug.org/pipermail/balug-admin-balug.org/2014-July/002233.html
>>>> for *which* sites will (and won't) be impacted.
>>>> As to some background and why, have a look at:
>>>> http://linuxmafia.com/pipermail/sf-lug/2015q3/011330.html
>>>> And, for the BALUG specific bits I left out of the above:
>>>>> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
>>>>> Subject: Re: vicki (exodus) & GoGrid/Datapipe
>>>>> Date: Thu, 23 Jul 2015 03:14:08 -0700
>>>>> BALUG: what balug has on the virtual machine atop vicki is certainly
>>>>> highly useful and convenient, and would be relative pain to be without,
>>>>> but we could survive without, at least for a bit, and have in fact
>>>>> relocated it (temporarily) elsewhere in past.
>>>>> What BALUG does *not* have atop the (VM on the) vicki host:
>>>>> www.balug.org, and balug.org itself aren't on that host, but other
>>>>> sites/subdomains are, e.g. [www.]{archive,wiki}.balug.org, among others.
>>>>> DNS - only for some subdomains, again not balug.org. itself.

