[BALUG-Admin] High(er) availability for BALUG ancillary sites

26 Nov 2015


      So, ... now also high(er) availability for the BALUG ancillary sites.
Also upgraded "vicki" to Debian GNU/Linux 8.2 (jessie) some days or so
earlier ... but note however that "vicki" is the physical host that
now only *sometimes* has the balug VM running atop it (much/most
of the time that balug VM is running atop my personal laptop).
Nevertheless (and because of that - and live migration!) ... high(er)  
availability!:
----- Forwarded message from Michael.Paoli@cal.berkeley.edu -----
     Date: Thu, 26 Nov 2015 11:30:47 -0800
     From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
  Subject: High(er) availability for SF-LUG site(s) (& some BALUG stuff too).
       To: SF-LUG sf-lug@linuxmafia.com
High(er) availability for SF-LUG site(s) (& some BALUG stuff too).
This covers the web sites & DNS master (list is hosted separately by
Rick Moen), and this also covers fair bit of BALUG stuff too (everything
*except* BALUG's [www.]balug.org and list stuff - e.g. wiki, archives,
test/beta/staging sites, etc.)
Anyway, more-or-less per earlier plan, I did get so far yesterday, as
doing the first *live* migration of that host.  And also *without*
shared storage (which does also work perfectly fine - just takes a wee
bit longer for the actual overall migration, but still when the actual
final switch itself happens for guest VM itself, it's exceedingly fast
(on the order of 10s of milliseconds or so? - I haven't actually timed
that final bit quite yet)).  So, ... that VM host is no longer "stuck"
just on my personal laptop :-) ... which means it can very much remain
up and Internet accessible - even when my laptop isn't (or, e.g. travels
away from home).
$ wakeonlan 00:30:48:91:97:90
...
Not 100% anticipated, but not a huge surprise, and easy enough to
address - some of the last kinks to be worked out were in allowing the
live migration to be successful.  CPU type and flags/capability:
error: unsupported configuration: guest and host CPU are not  
compatible: Host CPU does not provide required features: popcnt,  
sse4.2, sse4.1
Ah, so laptop CPU wee bit more modern than that on "vicki" - and by
more-or-less default, guest CPU was configured to take advantage of
many/most of those host CPU capabilities.  Easy enough to deal with that
- bring the VM guest down, reconfigure the virtual CPU to disable those
3 capabilities, bring VM guest back up again, and repeat the attempt -
made it fine past that error.  Next glitch was a bit more puzzling:
error: internal error: unable to execute QEMU command 'migrate': State
blocked by non-migratable device '0000:00:05.0/ich9_ahci'
Wee bit 'o search and ... QEMU can't live migrate SATA (at least not
yet safely in version I'm using, and at least by default migration of
such is disabled for safety reasons).  Bring host down, take virtual
hard drive off of SATA, turn it into SCSI and attach it to SCSI ... and
... same error?  Checked configuration again - nothing attached to
(virtual) SATA bus/controller, but the SATA bus/controller still there,
... next step, remove those, and repeat ... and ... success, all went
fine, no errors:
# virsh migrate --live --persistent --copy-storage-all --verbose balug \
...
qemu+ssh://192.168.55.2/system && virsh autostart --disable balug
... and all went fine and dandy.  And then, live migrating back:
# virsh migrate --live --undefinesource --copy-storage-all --verbose \
...
balug qemu+ssh://tigger.mpaoli.net./system
And that went perfectly fine too, not so much as a glitch to notice on
the guest VM itself (though the storage replication took a while, so
it's not a speedy move from perspective of the physical hosts) ... TCP
connections between guest and Internet, etc., all maintained perfectly
fine across the live migrations.
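Both glitches above could in principle be caught before attempting a
migration.  What follows is only a rough sketch of such pre-flight
checks, not what was actually run here: the function names
(missing_cpu_flags, host_cpu_flags, has_sata) are made up for
illustration, and it assumes a Linux x86 host where /proc/cpuinfo lists
CPU features on a "flags" line.  Note that /proc/cpuinfo spells the
features sse4_1 and sse4_2, whereas the libvirt error spells them sse4.1
and sse4.2.

```sh
#!/bin/sh
# Pre-flight checks before attempting a live migration (illustrative sketch).

# missing_cpu_flags "FLAGS" FEATURE...
# Prints, one per line, each FEATURE that does not appear in the
# space-separated FLAGS string.
missing_cpu_flags() {
    flags=" $1 "
    shift
    for feature in "$@"; do
        case "$flags" in
            *" $feature "*) ;;                  # present, nothing to report
            *) printf '%s\n' "$feature" ;;      # absent from FLAGS
        esac
    done
}

# host_cpu_flags: flags line of the first CPU in /proc/cpuinfo (Linux only).
host_cpu_flags() {
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo | head -n 1
}

# has_sata: reads libvirt domain XML on stdin; succeeds if it still contains
# a SATA controller or a disk attached to a SATA bus.
has_sata() {
    grep -Eq "<controller[^>]*type=['\"]sata['\"]|bus=['\"]sata['\"]"
}
```

Possible use, on the destination host and the source host respectively:

   $ missing_cpu_flags "$(host_cpu_flags)" popcnt sse4_1 sse4_2
   # virsh dumpxml balug | has_sata && echo "SATA still configured"

Anything the first command prints is a feature the guest CPU definition
would need to have disabled; the second only greps the XML, so it is a
hint rather than a guarantee that migration will succeed.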
Wee bit more stuff to do / work on ... e.g. (at least theoretically),
o Turn it into a (nearly) push-button operation (run one relatively
   simple script - or pair of scripts - partially drafted, but yet to
   polish those off.).
o Investigate/test --copy-storage-inc - if suitable and safe, that may
   significantly speed the disk data copy portion of the migration (some
   of the storage I have set up is highly optimized for physical storage
   space reduction, but consequently has very low write performance
   characteristics - which is mostly quite fine, but slows migration
   especially back to laptop; read performance, however, is more than
   sufficient.  E.g. on physical host (laptop) we have:
   # ls -hnos balug-sda
   4.8G -rw------- 1 114 16G Nov 26 18:41 balug-sda
   #
   Quite efficient (deduplication + compression) space utilization - but
   at cost of write performance (and some CPU burn, particularly on
   heavy writing, and some more suck of RAM too) - but that happens to
   be the trade-off I want the majority of the time for that storage -
   so that's highly acceptable (laptop SSD is "only" about 150 GiB ...
   and I've a whole lot of other stuff on it too - I'm fine with LUG VM
   taking ~5 GiB of physical storage ... but not gonna give it 16 GiB!).
   Where it resides, it also does deduplication across some ISOs that
   quite correspond to the installed operating system (and also other
   data), so that also aids in reducing total physical storage space
   consumed.
   ).
o I'll also carefully review, and likely adjust/tweak other bits of the
   migration options and handling of the VMs after migration - most
   notably bits regarding undefine or not, and autostart or not - and
   where.  And of course test it all out more fully.  :-)
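As for the (nearly) push-button idea in the first item, one possible
shape for it is sketched below.  This is not the partially drafted
script mentioned above, just an illustration built from the two virsh
commands already shown.  The names run, migrate_guest, to_server,
to_laptop and the DRY_RUN variable are hypothetical, and DRY_RUN
defaults to 1 so that nothing is executed unless explicitly asked for.

```sh
#!/bin/sh
# Wrapper around the two migrations shown earlier (illustrative sketch).
# With DRY_RUN=1 (the default here) commands are only printed, not run.
: "${DRY_RUN:=1}"

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# migrate_guest GUEST DEST_URI DEFINE_FLAG
# DEFINE_FLAG is --persistent or --undefinesource, as in the commands above.
migrate_guest() {
    run virsh migrate --live "$3" --copy-storage-all --verbose "$1" "$2"
}

# Laptop -> other host: keep the guest defined there, then disable
# autostart of the guest on the laptop side.
to_server() {
    migrate_guest balug qemu+ssh://192.168.55.2/system --persistent &&
        run virsh autostart --disable balug
}

# Other host -> laptop: drop the definition from the host being left.
to_laptop() {
    migrate_guest balug qemu+ssh://tigger.mpaoli.net./system --undefinesource
}
```

Calling to_server or to_laptop with DRY_RUN=1 just echoes the commands
for review; setting DRY_RUN=0 would run them for real.  Which of the
undefine/autostart options are actually wanted is exactly what the last
item above says still needs review, so the flags here simply mirror the
commands as they were run.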
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: Re: sf-lug site & hardware
Date: Tue, 24 Nov 2015 06:32:02 -0800
...
Just an FYI update.
So, my (overly optimistic) theoretical timeline - was hoping to have
the sf-lug site relocated onto the higher availability hardware
(notably not on VM on my laptop) by around 2015-11-15 or so.  Have
adjusted the target timeline a bit, after some considerations (and also
being relatively busy with other stuff too).  Anyway, one thing I
didn't fully take into account earlier - fan noise.  That system that
was in the colo - 1U unit, is comparatively noisy (I've gotten a bit
spoiled mostly not listening to fan noise of such volume - even though
it uses a fan and airflow design that mostly avoids tiny 1U high-RPM
fan(s) - it's still noisier than most typical desktop systems - but
less noisy than many typical 1U servers).  So, ... I adjusted my
(theoretical) plans a bit.  With wakeonlan, qemu-kvm live migration,
and wee bit of infrastructure (which I was mostly planning to do
anyway), and small bit of scripting, I could arrange to have the VM
running on the noisier (but higher availability) hardware, mostly only
when it wouldn't be running on my laptop at home.  And with live
migration, the migration would be effectively "invisible" to the guest
VM host itself, its state, connections to it and sessions on it, etc.
Anyway, fair bit closer to having that plan fully implemented.  Current
target timeline for completion, by 2015-11-29, or at least not later
than 2015-12-13.  May be fair bit sooner.  I'll update once it's in
place and fully operational (did get a fair chunk of related
infrastructure completed yesterday and today).
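The "wake the noisier box only when needed" part of that plan might look
roughly like the sketch below.  It is an assumption-laden illustration
and not the actual scripting: retry and wake_and_wait are hypothetical
names, the try count and delay are arbitrary, and answering ping only
shows that the host is up on the network, not that libvirtd is ready to
accept a migration.

```sh
#!/bin/sh
# Wake a sleeping host and wait for it to answer (illustrative sketch).

# retry TRIES DELAY CMD...
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts.  Succeeds as soon as CMD does, fails otherwise.
retry() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# wake_and_wait MAC HOST
# Sends a Wake-on-LAN packet to MAC, then waits up to about 5 minutes
# (30 tries, 10 seconds apart) for HOST to answer a single ping.
wake_and_wait() {
    wakeonlan "$1" || return 1
    retry 30 10 ping -c 1 "$2" >/dev/null
}
```

For example, something like

   $ wake_and_wait MAC HOST && echo "target is up"

with MAC and HOST filled in for the physical host to be woken, would be
a natural first step before kicking off a live migration to it.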
references/excerpts:
https://en.wikipedia.org/wiki/Wake-on-LAN
https://en.wikipedia.org/wiki/Live_migration
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: Re: sf-lug site & hardware
Date: Thu, 12 Nov 2015 14:01:41 -0800
...
FYI, this morning Jim Stockford and I did retrieve the physical server
host from the colo, upon which, up until some months back, the sf-lug
web site was running.  So, that improves the hardware resource
situation.  I'm guesstimating I'll have the sf-lug website again running
on VM atop this hardware by sometime this weekend or so - that should
improve the availability a fair bit (notably the sf-lug website then
won't go down when my personal laptop goes down, offline, or out the
door from home).
Thanks Jim!
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: Re: Have you guys thought about http://www.freelists.org/  
(hosted ...)
Date: Wed, 11 Nov 2015 18:26:25 -0800
...
would be down or that it wasn't (relatively) high availability (at least
compared to virtual machine running on my personal laptop - which does
have the sf-lug site go out when my laptop goes out ... hopefully that
situation will be improved in near future ... waiting on some resources
to be able to do that.)
references/excerpts:
http://linuxmafia.com/pipermail/sf-lug/2015q4/011454.html
http://linuxmafia.com/pipermail/sf-lug/2015q4/011441.html
...
From: Shane Tzen shane@faultymonk.org
Date: Wed, 11 Nov 2015 15:56:14 -0800
Subject: Re: [sf-lug] updated/upgraded: SF-LUG - operating system presently hosting
To: Michael Paoli Michael.Paoli@cal.berkeley.edu
Cc: SF-LUG sf-lug@linuxmafia.com
Have you guys thought about http://www.freelists.org/about.html ?
Looks like various LUGs are hosted -
http://www.freelists.org/cat/Linux_and_UNIX
On Fri, Oct 30, 2015 at 3:41 AM, Michael Paoli <
Michael.Paoli@cal.berkeley.edu> wrote:
...
It's been updated/upgraded:
from: Debian GNU/Linux 7.9 (wheezy)
to: Debian GNU/Linux 8.2 (jessie)
http://lists.balug.org/pipermail/balug-admin-balug.org/2015-October/002989.h...
Still definitely *not* high availability though (alas, still sits atop
a virtual machine on my *laptop*!).
Hopefully in not too horribly distant future (like *real soon*), the
physical box the site was earlier running upon will be successfully
retrieved - once that happens, some high(er) availability options
become possible.
Let me know if you notice anything awry (notwithstanding the less than
high availability).
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
...
Subject: It's alive*!: Re: SF-LUG - DNS, web site, ..., etc.
Date: Mon, 24 Aug 2015 03:10:26 -0700
Anyway, have taken the liberty ...
...
it's alive* ...
the [www.]sf-lug.{org,com}
websites are available again.
----- End forwarded message -----
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: BALUG ancillary sites UPgraded
Date: Fri, 30 Oct 2015 03:33:52 -0700
...
Upgraded (updated) from: Debian GNU/Linux 7.9 (wheezy)
to: Debian GNU/Linux 8.2 (jessie)
If you notice anything awry, let me know.
I did notice the wiki pages do have a rather different look & feel.
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: BALUG ancillary sites UP: Re: Of relevance for BALUG ancillary sites
Date: Sat, 22 Aug 2015 03:00:16 -0700
...
These BALUG sites are UP again.
Do note, however, that the new location is *not* high availability,
nor high bandwidth.
Specifically, at present, the VM hosting those sites has been moved onto
a laptop (8-O !) ... which does occasionally get out and about (and when
it does, those BALUG sites will be temporarily down again), and it's
sitting at the end of a DSL line - which also shares bandwidth with
other hosts.
I'll likely be doing some things in the coming weeks or more to improve
the availability (notably move off the present laptop where the VM is
currently), and may at some future point also have improved bandwidth -
but for now, is what it is.
Let me know if anyone notices anything particularly amiss (notwithstanding
the inherent limitations and availability limitations noted above).
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: BALUG ancillary sites DOWN: Re: Of relevance for BALUG ancillary sites
Date: Thu, 20 Aug 2015 09:05:43 -0700
...
The BALUG ancillary sites were still up yesterday morning, but they're
DOWN now.  I'm presuming power/network has been removed from the "vicki"
physical host at the colo site.  I expect there will be more about this
on the SF-LUG list *real soon now*.
http://linuxmafia.com/mailman/listinfo/sf-lug
I do have recent backups - having reasonably anticipated this scenario,
I have backups from this as recent as yesterday morning.
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: Of relevance for BALUG ancillary sites
Date: Sat, 25 Jul 2015 01:30:48 -0700
...
Just a bit of background.  Some BALUG sites may be impacted soon, but
*not* www.balug.org or balug.org itself.
See:
http://lists.balug.org/pipermail/balug-admin-balug.org/2014-July/002233.html
for *which* sites will (and won't) be impacted
As to some background and why, have a look at:
http://linuxmafia.com/pipermail/sf-lug/2015q3/011330.html
And, for the BALUG specific bits I left out of the above:
...
From: "Michael Paoli" Michael.Paoli@cal.berkeley.edu
Subject: Re: vicki (exodus) & GoGrid/Datapipe
Date: Thu, 23 Jul 2015 03:14:08 -0700
...
BALUG: what balug has on the virtual machine atop vicki is certainly
highly useful and convenient, and would be relative pain to be without,
but we could survive without, at least for a bit, and have in fact
relocated it (temporarily) elsewhere in past.
What BALUG does *not* have atop the (VM on the) vicki host:
www.balug.org, and balug.org itself aren't on that host, but other
sites/subdomains are, e.g. [www.]{archive,wiki}.balug.org, among others.
DNS - only for some subdomains, again not balug.org. itself.
