Some web stats by domain, etc. I was interested, and thought others might be curious too, so ...

I've got the log rotation set up so it retains a bit over a year's worth of web server logs:

$ awk '{if($1=="rotate" || $1 !~ /^#/ && $1 ~ /ly$/)print}' /etc/logrotate.d/apache2
weekly
rotate 60

so, a fair bit of data is available. This web server runs on the balug Virtual Machine (VM) host, which serves not only the BALUG [Linux] User Group ([L]UG), but others too. Anyway, I grabbed the data from the logs and did a wee bit of analysis. This is from the {ssl_,}access.log* files.
So, I basically analyzed by domain and port. Here's the analysis, with counts, from the highest level on down to that:
13945791 Total

That's the total traffic seen, all sites/domains that got logged. That's approximately:
232430 per week (13945791/60)
33204 per day (13945791/60/7)
1384 per hour (13945791/60/7/24)
23 per minute (13945791/60/7/24/60)
0.38 per second (13945791/60/7/24/60/60)
So about one per 2.6 seconds (1/(13945791/60/7/24/60/60))
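The averages above are just repeated division of the total. Here's a minimal awk sketch of that arithmetic, assuming exactly 60 weeks of logs (from "weekly" plus "rotate 60"); the function and variable names are mine, purely for illustration:

```shell
#!/bin/sh
# Average request rates from a total count and a number of weeks.
# Assumes the logs cover exactly the given number of weeks.
rates() {
    awk -v total="$1" -v weeks="$2" 'BEGIN {
        per_week = total / weeks
        per_day  = per_week / 7
        per_hour = per_day / 24
        per_min  = per_hour / 60
        per_sec  = per_min / 60
        printf "%.0f per week\n",   per_week
        printf "%.0f per day\n",    per_day
        printf "%.0f per hour\n",   per_hour
        printf "%.0f per minute\n", per_min
        printf "%.2f per second\n", per_sec
        printf "%.1f seconds per request\n", 1 / per_sec
    }'
}

rates 13945791 60
```

Since the current (not yet rotated) log is also included, the true span is likely slightly more than 60 weeks, so these are rough upper estimates of the rates.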
Next, by [L]UG or project or the like. Note also that test traffic isn't excluded, so, e.g., some domains that aren't in DNS or aren't so delegated (or not yet, or no longer, in DNS) also show a wee bit o' traffic.

9504843 BALUG
3463857 BerkeleyLUG
 968724 SF-LUG
   8283 digitalwitness.org
     51 BAD
     33 BUUG
In a bit more detail, by relevant registered domain:

9504843 balug.org
3457611 berkeleylug.com
 896560 sf-lug.org
  40847 sf-lug.com
  13698 sflug.org
   8283 digitalwitness.org
   7458 sflug.net
   6246 berkeleylug.org
   6084 sflug.com
   4077 sf-lug.net
     51 bad.debian.net
     33 buug.org

Note that for all of SF-LUG's various domains, traffic drops by >20x once we leave the canonical domain ([www.]sf-lug.org) and drop to the 1st runner-up in SF-LUG traffic, >65x to the 2nd runner-up, and >120x for the 3rd runner-up.
For BerkeleyLUG, the canonical BerkeleyLUG.com has >550x the traffic of the obscure and almost completely unknown BerkeleyLUG.org (which was never canonical nor particularly promoted). However, BerkeleyLUG.org no longer exists, so it was something >1/550 of the traffic while it still existed (but even back then I recall it being a relatively negligible portion of traffic).
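Those ratios can be checked straight from the per-domain counts. A small sketch, where the ratio helper is a name I made up for illustration:

```shell
#!/bin/sh
# Print how many times larger the first count is than the second.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 896560 40847    # sf-lug.org vs. sf-lug.com (1st runner-up)
ratio 896560 13698    # sf-lug.org vs. sflug.org  (2nd runner-up)
ratio 896560 7458     # sf-lug.org vs. sflug.net  (3rd runner-up)
ratio 3457611 6246    # berkeleylug.com vs. berkeleylug.org
```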
So, by domain ... secure.balug.org is somewhat surprising. It's probably mostly from "bad" or not so well behaved bots, poking at the BALUG wiki, trying to log in there, and getting redirected ... or maybe such bots think there's something more "interesting" to go after because it's got "secure" in the name? Maybe I ought to phase that one out, as it's not nearly so relevant anymore. I think the original idea was that one would redirect there to force https, but I believe all the domains now offer https, and as/where relevant (e.g. wiki login, and thereafter with a cookie that carries authenticated state) redirection can be done from http to https within the same domain - no need for a separate domain for that.

Also a bit surprising is BerkeleyLUG.com being as high as it is. It's WordPress, and for bots (legitimate or not so) and search engines, WordPress effectively "expands" to a quite huge set of unique URLs, even if the content of each isn't all that unique. E.g. I remember earlier trying to crawl the site when it was still hosted by WordPress.com - it expanded into a huge amount of content that wasn't particularly feasible to replicate in-place as-is without WordPress - especially relative to the actual full data in/behind the site, which is a much smaller and more manageable set of data. So that may, at least partially, explain the somewhat surprisingly large number there.

Likewise, BALUG's wiki - lots of content - especially if one crawls the archive of all older versions of all pages. And BALUG's list - every posting can be individually crawled, so, lots of URLs, and that probably keeps search engines / bots relatively busy.

And one other that's slightly surprising ... the mx entries. There's really nothing promoting those - or even configured for them as web, other than their existing in DNS. So that's probably mostly "bad" bots and/or a bit of test traffic.
Also, pi.berkeleylug.com is slightly, but not entirely, redundant with
https://berkeleylug.com/Pi.BerkeleyLUG/
Notably, pi.berkeleylug.com 302 redirects to the above ... however, there's also account and sudo and dynamic DNS highly available to pi.berkeleylug - so if/whenever they / that SIG want it to go somewhere else in DNS, it's readily available for that. But if the DNS & web traffic hits the BALUG host's web server, it 302 ("temporary") redirects as noted above.

4467098 secure.balug.org
3408433 berkeleylug.com
2083824 www.balug.org
1692238 www.wiki.balug.org
 998636 lists.balug.org
 875806 www.sf-lug.org
 185888 balug.org
  47829 www.berkeleylug.com
  42223 www.archive.balug.org
  23561 www.sf-lug.com
  17286 sf-lug.com
  15402 sf-lug.org
  14875 old-debian.balug.org
   8993 sflug.org
   5372 www.digitalwitness.org
   5175 wiki.balug.org
   5047 berkeleylug.org
   4705 www.sflug.org
   4295 sflug.com
   4052 sflug.net
   3841 www.ipv4.sf-lug.org
   3406 www.sflug.net
   3291 www.new.balug.org
   3291 sf-lug.net
   2911 digitalwitness.org
   2819 www.test.balug.org
   2519 www.ipv4.balug.org
   2016 www.beta.balug.org
   1789 www.sflug.com
   1225 www.php.test.balug.org
   1199 www.berkeleylug.org
   1182 ipv4.sf-lug.org
   1096 mx.balug.org
    966 ipv4.balug.org
    786 www.sf-lug.net
    753 www.pi.berkeleylug.com
    611 mx.lists.balug.org
    596 pi.berkeleylug.com
    266 www.ipv6.balug.org
    262 www.ipv6.sf-lug.org
     77 ipv6.balug.org
     67 ipv6.sf-lug.org
     51 bad.debian.net
     25 www.buug.org
      8 buug.org
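For the curious, the per-host counts above can be rolled up into the earlier per-domain totals by keeping just the last two DNS labels of each name. This is only a sketch, not necessarily how the numbers in this post were produced; the rollup function name is mine, and the two-label rule is a simplification (e.g. it would lump bad.debian.net under debian.net):

```shell
#!/bin/sh
# Read "count hostname" lines on stdin and sum the counts by the last
# two DNS labels of each hostname, printing the largest totals first.
rollup() {
    awk '{
        n = split($2, label, ".")
        domain = (n >= 2) ? (label[n-1] "." label[n]) : $2
        sum[domain] += $1
    }
    END { for (d in sum) print sum[d], d }' | sort -k1,1nr -k2,2
}

# The sf-lug.org and sf-lug.com entries from the table above.
rollup <<'EOF'
875806 www.sf-lug.org
23561 www.sf-lug.com
17286 sf-lug.com
15402 sf-lug.org
3841 www.ipv4.sf-lug.org
1182 ipv4.sf-lug.org
262 www.ipv6.sf-lug.org
67 ipv6.sf-lug.org
EOF
```

That should give 896560 for sf-lug.org and 40847 for sf-lug.com, matching the per-domain figures.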
And, nothing horribly surprising here, given the above. This is essentially the same again, but broken down by port. Also, if we total up all of http (:80) and https (:443) we get:

7263155 :80 (http)
6682636 :443 (https)

And by domain and port:

3570578 secure.balug.org:80
2942325 berkeleylug.com:443
1762326 www.balug.org:443
1355057 www.wiki.balug.org:80
 896520 secure.balug.org:443
 834027 www.sf-lug.org:80
 616148 lists.balug.org:443
 466108 berkeleylug.com:80
 382488 lists.balug.org:80
 337181 www.wiki.balug.org:443
 321498 www.balug.org:80
 140573 balug.org:80
  45315 balug.org:443
  43816 www.berkeleylug.com:80
  41779 www.sf-lug.org:443
  29510 www.archive.balug.org:80
  22723 www.sf-lug.com:80
  15730 sf-lug.com:80
  14827 old-debian.balug.org:80
  12713 www.archive.balug.org:443
  11427 sf-lug.org:80
   6618 sflug.org:80
   4604 berkeleylug.org:80
   4420 www.digitalwitness.org:80
   4134 wiki.balug.org:80
   4013 www.berkeleylug.com:443
   3975 sf-lug.org:443
   3548 sflug.com:80
   3263 sflug.net:80
   3251 www.sflug.org:80
   2738 sf-lug.net:80
   2485 digitalwitness.org:80
   2462 www.test.balug.org:80
   2394 www.ipv4.sf-lug.org:80
   2375 sflug.org:443
   2244 www.new.balug.org:80
   1964 www.sflug.net:80
   1767 www.beta.balug.org:80
   1556 sf-lug.com:443
   1454 www.sflug.org:443
   1447 www.ipv4.sf-lug.org:443
   1442 www.sflug.net:443
   1409 www.sflug.com:80
   1392 www.ipv4.balug.org:443
   1127 www.ipv4.balug.org:80
   1047 www.new.balug.org:443
   1041 wiki.balug.org:443
    975 www.php.test.balug.org:80
    952 www.digitalwitness.org:443
    879 ipv4.sf-lug.org:80
    838 www.sf-lug.com:443
    789 sflug.net:443
    787 www.berkeleylug.org:80
    747 sflug.com:443
    736 ipv4.balug.org:80
    718 www.pi.berkeleylug.com:80
    599 mx.balug.org:80
    553 sf-lug.net:443
    497 mx.balug.org:443
    463 www.sf-lug.net:80
    443 berkeleylug.org:443
    426 digitalwitness.org:443
    412 www.berkeleylug.org:443
    381 mx.lists.balug.org:80
    380 www.sflug.com:443
    376 pi.berkeleylug.com:80
    357 www.test.balug.org:443
    323 www.sf-lug.net:443
    303 ipv4.sf-lug.org:443
    250 www.php.test.balug.org:443
    249 www.beta.balug.org:443
    230 mx.lists.balug.org:443
    230 ipv4.balug.org:443
    220 pi.berkeleylug.com:443
    194 www.ipv6.sf-lug.org:80
    173 www.ipv6.balug.org:80
     93 www.ipv6.balug.org:443
     68 www.ipv6.sf-lug.org:443
     67 ipv6.balug.org:443
     60 ipv6.sf-lug.org:443
     48 old-debian.balug.org:443
     45 bad.debian.net:80
     35 www.pi.berkeleylug.com:443
     18 www.buug.org:80
     10 ipv6.balug.org:80
      7 www.buug.org:443
      7 ipv6.sf-lug.org:80
      6 bad.debian.net:443
      4 buug.org:80
      4 buug.org:443
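As a rough idea of how counts like these could be produced, here's a sketch in sh and awk. It assumes each log line starts with a "vhost:port" field (as in Apache's vhost_combined LogFormat, which begins with %v:%p); whether the logs here actually use that layout is an assumption on my part. The function names are mine, and the sample log lines are made up for illustration, not real log data.

```shell
#!/bin/sh
# Tally lines by their first field (assumed to be "vhost:port"),
# printing the busiest first.
by_vhost_port() {
    awk '{ count[$1]++ } END { for (k in count) print count[k], k }' |
        sort -k1,1nr -k2,2
}

# Sum "count vhost:port" lines by port, printing the busiest first.
by_port() {
    awk '{
        n = split($2, part, ":")
        sum[part[n]] += $1
    }
    END { for (p in sum) print sum[p], ":" p }' | sort -k1,1nr
}

# Hypothetical sample lines; only the first field matters here.
by_vhost_port <<'EOF'
secure.balug.org:80 192.0.2.10 - - [01/Jan/2000:00:00:01 +0000] "GET / HTTP/1.1" 302 0
www.balug.org:443 192.0.2.11 - - [01/Jan/2000:00:00:02 +0000] "GET / HTTP/1.1" 200 1024
secure.balug.org:80 192.0.2.12 - - [01/Jan/2000:00:00:03 +0000] "GET / HTTP/1.1" 302 0
lists.balug.org:443 192.0.2.13 - - [01/Jan/2000:00:00:04 +0000] "GET / HTTP/1.1" 200 2048
EOF

# Four real entries from the table above, summed by port.
by_port <<'EOF'
3570578 secure.balug.org:80
896520 secure.balug.org:443
1762326 www.balug.org:443
321498 www.balug.org:80
EOF
```

On real logs, something like zcat -f access.log* ssl_access.log* | by_vhost_port might do (with GNU gzip, zcat -f passes uncompressed files through unchanged), again assuming that first-field layout.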
Anyway, maybe some day I'll get some "real" web reporting in place. It is on the todo list, but ...

$ wc todo
6156 22307 188878 todo
$

It's also not a FIFO list, nor FILO. It's mostly a priority-interrupt-driven stack that gets a lot of reordering applied to it, and tends to grow much more than it shrinks. Well, at least I don't have to worry about running out of stuff to do - I have well over a lifetime's worth o' stuff on the list. More BALUG and [L]UG specific lists may be found on BALUG's wiki, though they're not necessarily complete nor current.