[BALUG-Admin] Mailman, web scraping, case-preserved email addresses

Michael Paoli Michael.Paoli@cal.berkeley.edu
Sat Jul 29 04:44:42 PDT 2017


And it turns out I can also access and save the case-preserved
email bits.  Mailman only explicitly shows that on the pages
of members who entered their email address with one or more uppercase
characters in it.

Yeah, I know, RFCs & Internet email addresses: per RFC 5321 the
local part is formally case-sensitive and its case must be preserved
in transit, though hosts are (*strongly*) discouraged from actually
distinguishing mailboxes by case.  So most treat the local part as
case-insensitive, but that is not *required*.  And of course the
domain part (DNS and all that) is case-insensitive (DNS generally
*preserves* case, but ignores it for matching).  And "of course",
users/members/humans may wish to (generally) see/show their
email address in mixed case, though they may also
often wish to type it in lowercase only.  Interestingly, overall
common style and general practices tend to change(/evolve?)
over time - I think all lowercase is becoming increasingly
common for email addresses in entry/presentation.  (And
other random bits, like email instead of e-mail,
using . to separate portions of phone number rather than -).
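
A rough sketch of the distinction above, in Python.  The function
names are made up for illustration and are not Mailman's API, and
lowercasing the whole address is only what Mailman appears to do per
the observations in this thread.  It shows a lowercase key for
matching, a more conservative form that only folds the domain, and a
check for whether an address has a distinct case-preserved form at all.

```python
def split_address(addr):
    """Split on the last '@' so an '@' inside a quoted local part survives."""
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a usable email address: %r" % (addr,))
    return local, domain


def canonical_key(addr):
    """Lowercase everything (assumed to mirror Mailman's matching key)."""
    split_address(addr)  # validate only
    return addr.lower()


def conservative_form(addr):
    """Fold only the domain; leave the local part exactly as entered."""
    local, domain = split_address(addr)
    return local + "@" + domain.lower()


def has_case_preserved_form(addr):
    """True if the address as entered differs from its lowercase key."""
    return addr != canonical_key(addr)
```

With that, a member who subscribed as Michael.Paoli@CAL.Berkeley.EDU
would be matched under the all-lowercase key, while the form as
entered is kept around for display.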

> From: "Michael Paoli" <Michael.Paoli@cal.berkeley.edu>
> Subject: Re: lot of really nice command-line tools to use in dumping  
> and editing various pck databases
> Date: Fri, 28 Jul 2017 20:41:44 -0700

> I've got nice web scraping bit (had a wee bit more time this A.M.),
> so now I can grab *all* the member data *except* for their password
> (and the relatively trivial bit that Mailman also saves their
> email as subscriber enters/submits it - which might contain
> uppercase letters - but Mailman canonicalizes it to lowercase
> and mostly or entirely uses that - though it does also save the original
> form).  At least I think that's *all* the data - I'll double-check that



