Hi,
I am experiencing a very strange thing for which there are no ready answers by googling. I am a volunteer for a non-profit which puts GNU-Linux computers in low income shelters. They are stand-alone machines connected directly to the Internet via a hub on a dedicated ethernet cable.
The shelters don't want the users to be able to store anything directly to the machine's hard drive. To give them that functionality, we ask them to use the guest session, which wipes out all data by default when the session ends.
Right now, however, we are experiencing a failure of logging into the guest session. Normally, you just choose the guest session in the Lubuntu login screen, and hit enter, and it boots up a full guest session. No password is required.
Now, when I choose the guest session and hit enter, the system appears to head toward a normal login, but then quickly fails and returns to the login screen.
The system's SU admin account is performing normally. To get into the admin account, I just choose it in the login screen, enter the password, and the admin session boots up normally.
This whole thing is very strange, and I have never seen anything like it before. We are using 14.04 on 13 machines with identical or similar hardware and are not having any such problems. This email is being written on one of those machines, and the guest session works just fine.
I ran updates on the malfunctioning machines, rebooted, no joy.
Thanks very much in advance.
Christian Einfeldt writes:
Right now, however, we are experiencing a failure of logging into the guest session. Normally, you just choose the guest session in the Lubuntu login screen, and hit enter, and it boots up a full guest session. No password is required.
If the root account can log in (so you know the desktop is installed properly), and it starts to log in but bails out, I'd wonder if there's something in the guest account's .bashrc or .profile, or other login-time configuration files, that are either preventing login or causing an immediate logout.
Have you tried logging in on a console? On the machine, try typing ctrl-alt-F2 to get a text console, and try logging in as guest there. (Ctrl-alt-F1 or Ctrl-alt-F7 will probably get you back to X, but if not, try ctrl-alt- with all the function keys and one of them will probably work.) Or if sshd is set up, try sshing to guest@themachine from another machine and see if you can log in that way; if it isn't set up, log in as root and install openssh-server.
If guest can log in via ssh or text console, but not via X, then you know the problem has something to do with guest's desktop configuration. Try renaming files or directories or copying them from root to see what makes a difference.
If guest still can't log in even on a text console, that makes it a lot easier to debug. With any luck, it'll give you an error message before it logs you out. If the error message disappears too fast, then run script on another machine to record the session, then ssh to the machine. If there's no error message, then from your root login, try putting debug echo lines in guest's .bashrc and .profile, like
    echo starting .bashrc
    echo ending .bashrc
and so forth; you can use those to see what files are being executed and how far it gets before things go bad.
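A minimal sketch of that echo tracing, assuming the startup file is an ordinary shell file such as .bashrc or .profile and that you run this from a root login. The helper name add_trace, the .orig backup suffix, and writing the messages to stderr are illustrative choices, not something from the thread.

```shell
# Hypothetical helper: wrap a startup file with "starting"/"ending" echo lines
# so you can see which files run and how far each one gets.
# Usage: add_trace /path/to/.bashrc
add_trace() {
    file=$1
    name=$(basename "$file")
    tmp=$(mktemp) || return 1
    {
        echo "echo \"starting $name\" >&2"
        cat "$file"
        echo "echo \"ending $name\" >&2"
    } > "$tmp"
    # Keep a backup so the change is easy to undo, then overwrite in place
    # (writing through the existing file keeps its owner and permissions).
    cp -p "$file" "$file.orig" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

If "starting" prints but "ending" never does, the file exited, returned, or failed somewhere in between, which narrows down where to look.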
Good luck!
...Akkana
Also be sure to inspect the X log:
cat /var/log/Xorg.0.log
(Compare between the machine that works vs the one that doesn't).
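One way to make that comparison less noisy, sketched under the assumption that the log lines carry the usual leading "[ seconds]" timestamps and the (EE)/(WW) markers for errors and warnings. The function names and the good.log file name are made up for illustration.

```shell
# Strip the leading "[    12.345] " timestamps so two logs can be compared.
strip_stamps() {
    sed 's/^\[ *[0-9.]*\] //' "$1"
}

# Show only the error (EE) and warning (WW) lines from a log.
xorg_problems() {
    strip_stamps "$1" | grep -E '^\((EE|WW)\)'
}

# Example, assuming good.log was copied over from a machine that works:
#   xorg_problems /var/log/Xorg.0.log
#   strip_stamps good.log > /tmp/good.txt
#   strip_stamps /var/log/Xorg.0.log > /tmp/bad.txt
#   diff /tmp/good.txt /tmp/bad.txt
```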
Cheers, Piotr
On 04/04/2017 12:00 PM, Akkana Peck wrote:
BALUG-Talk mailing list BALUG-Talk@lists.balug.org http://lists.balug.org/listinfo.cgi/balug-talk-balug.org
Thanks, Akkana, I will give this a try!
I was not able to figure out how to log into a guest session from the CLI. Googling, I found this page:
https://gist.github.com/blaisck/fe748c8a4184e752556c
and tried it, but it actually switched me into a GUI environment, and then, coincidentally, I had to power down the machine to get out of that session. When I tried to use the logout button, it just didn't respond, and stayed in the GUI session. How do I log into the guest session from the CLI, please?
Did you try the suggestions I already gave? Ctrl-alt-f2 or ssh? What happened?
...Akkana
Hi Akkana, bottom posting...
On Wed, Apr 5, 2017 at 5:46 PM, Akkana Peck akkana@shallowsky.com wrote:
Did you try the suggestions I already gave? Ctrl-alt-f2 or ssh? What happened?
Sorry, I forgot to tell you that I tried to use ctrl-alt-f4 but I didn't know what the name of the guest session was. I am not trying to log into the guest account, but rather the temp guest session. I would also have to do some basic learning as to how to use the ssh commands.
I actually ran out of time, and so cheated and just re-installed the whole OS. At any rate, I figured that would probably take care of any monkey business that someone might have installed on the machine. But I clearly still have a lot of learning to do. Thanks for your efforts in trying to help me!
On Mon, Apr 3, 2017 at 11:20 PM, Christian Einfeldt einfeldt@gmail.com wrote:
Hi,
I am experiencing a very strange thing for which there are no ready answers by googling. I am a volunteer for a non-profit which puts GNU-Linux computers in low income shelters. They are stand-alone machines connected directly to the Internet via a hub on a dedicated ethernet cable.
The shelters don't want the users to be able to store anything directly to the machine's hard drive. To give them that functionality, we ask them to use the guest session, which wipes out all data by default when the session ends.
Right now, however, we are experiencing a failure of logging into the guest session. Normally, you just choose the guest session in the Lubuntu login screen, and hit enter, and it boots up a full guest session. No password is required.
Now, when I choose the guest session and hit enter, the system appears to head toward a normal login, but then quickly fails and returns to the login screen.
The system's SU admin account is performing normally. To get into the admin account, I just choose it in the login screen, enter the password, and the admin session boots up normally.
This whole thing is very strange, and I have never seen anything like it before. We are using 14.04 on 13 machines with identical or similar hardware and are not having any such problems. This email is being written on one of those machines, and the guest session works just fine.
I ran updates on the malfunctioning machines, rebooted, no joy.
Thanks very much in advance.
In case anyone was curious as to what happened with this, I finally had some time to sit down on site this evening and do some debugging.
Some background as to how the guest logins work in Lubuntu: A guest-XXXXX (random characters) user is created upon login, which is used throughout the session. It is then deleted when the user logs out.
After some red herrings in the auth logs (mostly PAM errors around KDE and Gnome keyrings), I did some digging in the lightdm logs. Eventually I noticed the UID of the guest account trying to be created was the same every time a login attempt was made: 999. Odd. So I looked in /etc/passwd and noticed that there were hundreds of guest-XXXXX accounts. That's no good!
Turns out, at some point the /etc/subgid.lock file got stuck in an existing state (wasn't deleted when the lock concluded), which meant the command to delete the user was not completing successfully upon logout. Users were piling up and never being deleted. Once the UIDs hit 999 it was failing to create new guest users, so the login would fail. A quick mv (rm didn't work) of the subgid.lock file and a script to delete all the guest accounts got us going again.
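The cleanup script itself isn't shown in the thread, so the following is only a sketch of what such a script might look like. It assumes every leftover account name starts with "guest-" and that nobody is logged in as a guest while it runs. The function names and the DRY_RUN variable are hypothetical; by default it only prints the userdel commands it would run.

```shell
# List account names starting with "guest-" from a passwd-format file.
# Defaults to /etc/passwd; the optional argument lets you try it on a copy.
list_guest_users() {
    awk -F: '$1 ~ /^guest-/ { print $1 }' "${1:-/etc/passwd}"
}

# Print a userdel command for each leftover guest account.
# With DRY_RUN=0 (and run as root) the commands are executed instead.
purge_guest_users() {
    list_guest_users "$1" | while read -r user; do
        if [ "${DRY_RUN:-1}" = 0 ]; then
            userdel -r "$user"
        else
            echo "userdel -r $user"
        fi
    done
}
```

Running `list_guest_users | wc -l` first gives a quick count of how many accounts have piled up.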
I'm considering my options to get us out of this reoccurring issue in the future. I'm thinking of just a cron job on each machine that checks for a subgid.lock file sticking around for more than a couple days and moving it out of the way, but I'll sleep on it. More clever suggestions welcome ;)
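A rough sketch of that cron idea, assuming GNU find and that moving the lock aside (as was done by hand) is enough to unblock things. The function names, the two-day default, the ".stale" suffix, and the script path in the crontab comment are all illustrative.

```shell
# Succeeds if the file exists and find considers it older than DAYS days.
# Note: find rounds age down to whole days, so "+2" means at least 3 full days.
# Usage: lock_is_stale /etc/subgid.lock 2
lock_is_stale() {
    [ -n "$(find "$1" -maxdepth 0 -mtime +"$2" 2>/dev/null)" ]
}

# Move a stale lock aside (rather than deleting it) so it can be examined later.
move_stale_lock() {
    lock=$1
    days=${2:-2}
    if lock_is_stale "$lock" "$days"; then
        mv "$lock" "$lock.stale.$(date +%Y%m%d%H%M%S)"
    fi
}

# Hypothetical root crontab entry, assuming this file is saved as
# /usr/local/sbin/clear-stale-lock.sh and ends with the line
#   move_stale_lock /etc/subgid.lock 2
#
#   17 4 * * * /bin/sh /usr/local/sbin/clear-stale-lock.sh
```

Keeping the moved file around with a timestamp also leaves some evidence for a bug report.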
[meta] help?
Did a teensy bit 'o searching regarding /etc/subgid.lock on *Debian* - and found almost, but not quite nothing. Looks to be rather Lubuntu specific. The bit I did find on Debian related to Debian detecting difference in Lubuntu specific patch not in Debian.
http://lists.alioth.debian.org/pipermail/pkg-shadow-devel/2016-February/0108...
http://lists.alioth.debian.org/pipermail/pkg-shadow-devel/2016-February/0108...
diff -pruN 1:4.2-3.1ubuntu1/debian/passwd.service 1:4.2-3.1ubuntu2/debian/passwd.service
--- 1:4.2-3.1ubuntu1/debian/passwd.service 2016-02-01 20:35:12.000000000 +0000
+++ 1:4.2-3.1ubuntu2/debian/passwd.service 1970-01-01 00:00:00.000000000 +0000
@@ -1,10 +0,0 @@
-[Unit]
-Description=Clear passwd locks
-
-[Service]
-Type=oneshot
-RemainAfterExit=yes
-ExecStart=/bin/rm -f /etc/gshadow.lock /etc/shadow.lock /etc/passwd.lock /etc/group.lock /etc/subuid.lock /etc/subgid.lock
-
-[Install]
-WantedBy=multi-user.target
diff -pruN 1:4.2-3.1ubuntu1/debian/passwd.tmpfile 1:4.2-3.1ubuntu2/debian/passwd.tmpfile
--- 1:4.2-3.1ubuntu1/debian/passwd.tmpfile 1970-01-01 00:00:00.000000000 +0000
+++ 1:4.2-3.1ubuntu2/debian/passwd.tmpfile 2016-02-04 22:00:50.000000000 +0000
@@ -0,0 +1,8 @@
+# If a password operation is in progress and we lose power, stale lockfiles
+# can be left behind. Clear them on boot.
+r! /etc/gshadow.lock
+r! /etc/shadow.lock
+r! /etc/passwd.lock
+r! /etc/group.lock
+r! /etc/subuid.lock
+r! /etc/subgid.lock
Curious also, when the file was present, did you use fuser(1) or similar to see if anything still had the file open? Was the file also possibly non-empty in size, and possibly contained a PID or the like? And if so, did that PID still exist and what was it? Did fuser(1) or lsof(8) show any process having open for write/append, or having any lock on /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow, or any of the lock files thereof?
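Those checks could be sketched roughly as follows. The assumption that a non-empty lock file holds a PID as text is only the possibility raised above, not something confirmed in the thread, and the function name is made up.

```shell
# Report whether a lock file names a process that is still alive.
# Assumption: a non-empty lock file holds a PID as text.
lock_owner_state() {
    lock=$1
    [ -e "$lock" ] || { echo "absent"; return 0; }
    # Keep only the digits (drops newlines, NULs, and any other bytes).
    pid=$(tr -cd '0-9' < "$lock")
    if [ -z "$pid" ]; then
        echo "no-pid"
    elif kill -0 "$pid" 2>/dev/null; then
        echo "alive $pid"
    else
        # kill -0 also fails for a live process you lack permission to signal,
        # so run this as root for a trustworthy answer.
        echo "dead $pid"
    fi
}

# Then see whether anything still has the files open (run as root to see
# every process):
#   fuser -v /etc/subgid.lock
#   lsof /etc/subgid.lock /etc/passwd /etc/shadow /etc/group /etc/gshadow
```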
More [meta] help ... been thinking of this for the "mom" computer, or similar for [great]grand{ma,dad}, etc. Notably where remote access - e.g. to troubleshoot/support, could be quite useful, but computer may typically be behind NAT/SNAT on The Internet and/or have dynamic IP(s), etc. - so not so feasible to run an ssh server there and just use ssh to access the computer.
... well, "reverse ssh" ... I remember someone asking the question a *long* time ago, and ... someone else's excellent response. I don't recall much of the detail of the response anymore, but key bit I do recall of it, was using ssh port forwarding, so, effectively, the computer would do a "call home" to some Internet accessible server ... which would then cause port to be opened on that Internet accessible server - one could then ssh to the desired target computer via that port. E.g., have done bit of proof of concept testing on that, and essentially works fine.
On the target ("mom") computer, set up ssh server ... but configure it to only listen on localhost (127.0.0.1 and ::1). Set up account(s) as relevant (for the port forwarding, etc.), and generate ssh key pair on the target computer. Set up ~/.ssh/authorized_keys on (Internet accessible) server computer, with it configured using that public key and forced command and otherwise as restricted as feasible, so all it can do is set up the port forwarding - nothing else. Create suitable service(s) (e.g. crontab jobs or whatever) for the target computer to set up and generally maintain those port forwarded "phone home" ssh connections. The forwarding should be to localhost (127.0.0.1 and/or ::1) on the server computer.
I'm presuming the target ("mom") computer may not have the world's best passwords - so don't want its ssh server exposed to The Internet ... even if it might sometimes be out somewhere where it gets a directly accessible Internet IP address.
So ... I think folks can figure out the details from there ... essentially target creates tunnel, opening up some (>=1024) port on localhost of server, which is forwarded over ssh tunnel created by target, to port 22 of localhost on target. That's essentially it ... have tested it out fair bit, and works fine. So long as target can make outbound TCP connections to port 22 of server - even through NAT/SNAT - that will then generally allow ssh access to target via server. And, of course, once one can do that - and noting also that one can ssh into server, one then has means to ssh from The Internet into target (via server), and can also do, e.g. X11 forwarding, set up other software if/as needed on target or enter various commands/data (e.g. activate something to be able to view full desktop of target, etc.).
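For anyone who wants a starting point, here is a minimal sketch of the "phone home" side. The account and host tunnel@server.example.org, port 2222, and the function name are placeholders, not details from the setup described above. The function only prints the command, so it is safe to try.

```shell
# Build the command the target would run to open the tunnel.
# -N: run no remote command, just forward.
# -R: ask the server to listen on its own 127.0.0.1:PORT and forward
#     connections back to the target's sshd on 127.0.0.1:22.
# ExitOnForwardFailure makes ssh exit if the port cannot be opened, so a
# supervising cron job or service can retry.
reverse_tunnel_cmd() {
    server=${1:-tunnel@server.example.org}
    port=${2:-2222}
    echo "ssh -N -o ExitOnForwardFailure=yes -o ServerAliveInterval=60" \
         "-R 127.0.0.1:$port:127.0.0.1:22 $server"
}

# On the target (e.g. from cron or a respawning service):
#   eval "$(reverse_tunnel_cmd)"
# On the server, to reach the target through the tunnel:
#   ssh -p 2222 someuser@127.0.0.1
```

On the server side, the tunnel key's line in ~/.ssh/authorized_keys can be prefixed with options such as command="/bin/false",no-pty,no-agent-forwarding,no-X11-forwarding so the key is good for little besides the forwarding.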
Anyway, thought, e.g. Partimus, and others, if they'd not thought of or considered that before, that type of "reverse ssh" can be quite useful for remote support, where NAT/SNAT, dynamic IPs, firewall(s), etc., would otherwise make it unfeasible/impossible to access the target via ssh from The Internet.
Also, if target can access port 443 on The Internet, but not port 22, there are ways of dealing with that ... at least if target isn't being forced through proxy at HTTP/HTTPS layers. I forget the name of the software/package, but, can even be done where both HTTPS web server and ssh effectively listen on port 443 ... there's bit of differences in the protocols - notably whether client or server "talks" first ... and ... that's enough to wedge a little bit of software on port 443 between https and ssh servers - and it can then get the traffic appropriately to either, depending on whether client is ssh or https.
From: "Elizabeth K. Joseph" lyz@princessleia.com Subject: Re: [BALUG-Talk] Lubuntu 14.04 guest session login failure Date: Sat, 8 Jul 2017 20:10:12 -0700
On Sun, Jul 9, 2017 at 9:45 AM, Michael Paoli Michael.Paoli@cal.berkeley.edu wrote:
The bit I did find on Debian related to Debian detecting difference in Lubuntu specific patch not in Debian.
Thanks, this will be helpful for the bug report :)
Curious also, when the file was present, did you use fuser(1) or similar to see if anything still had the file open? Was the file also possibly non-empty in size, and possibly contained a PID or the like? And if so, did that PID still exist and what was it? Did fuser(1) or lsof(8) show any process having open for write/append, or having any lock on /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow, or any of the lock files thereof?
In spite of the timestamp of the lock file being April, I still confirmed with lsof that it was not being used by anything else. No other lock files were hanging around either.
More [meta] help ... been thinking of this for the "mom" computer, or similar for [great]grand{ma,dad}, etc. Notably where remote access - e.g. to troubleshoot/support, could be quite useful, but computer may typically be behind NAT/SNAT on The Internet and/or have dynamic IP(s), etc. - so not so feasible to run an ssh server there and just use ssh to access the computer.
Right, these systems are just on a shared network with an unsophisticated router controlling access, no easy point of access.
... well, "reverse ssh" ... I remember someone asking the question a *long* time ago, and ... someone else's excellent response. I don't recall much of the detail of the response anymore, but key bit I do recall of it, was using ssh port forwarding, so, effectively, the computer would do a "call home" to some Internet accessible server ... which would then cause port to be opened on that Internet accessible server - one could then ssh to the desired target computer via that port. E.g., have done bit of proof of concept testing on that, and essentially works fine.
Thanks, I'm familiar with this method and have used it before. I have considered it to avoid on-site visits in the case of an emergency, perhaps more feasible once we get our system in place to roll out custom Lubuntu ISOs on all the sites.
I will mention that in our current setup the emergency visits also serve as a good opportunity to check in on things otherwise; we had to swap out a mouse yesterday that we didn't know had been acting up. This doesn't reflect poorly upon them in any way; it's simply been my experience that unsophisticated users tend not to report problems. They work around them if they're not bad because they don't want to bother us, even when we tell them it's not a bother and that's what we're around for! Same thing happened at the schools we worked with, same thing happens with visiting relatives during holidays, "good grief, how long has it been doing this for?" ;)
Quoting Elizabeth K. Joseph (lyz@princessleia.com):
In case anyone was curious as to what happened with this, I finally had some time to sit down on site this evening and do some debugging.
Nice detective work.
Some background as to how the guest logins work in Lubuntu: A guest-XXXXX (random characters) user is created upon login, which is used throughout the session. It is then deleted when the user logs out.
After some red herrings in the auth logs (mostly PAM errors around KDE and Gnome keyrings), I did some digging in the lightdm logs. Eventually I noticed the UID of the guest account trying to be created was the same every time a login attempt was made: 999. Odd. So I looked in /etc/passwd and noticed that there were hundreds of guest-XXXXX accounts. That's no good!
Turns out, at some point the /etc/subgid.lock file got stuck in an existing state (wasn't deleted when the lock concluded), which meant the command to delete the user was not completing successfully upon logout. Users were piling up and never being deleted. Once the UIDs hit 999 it was failing to create new guest users, so the login would fail. A quick mv (rm didn't work) of the subgid.lock file and a script to delete all the guest accounts got us going again.
Next time you encounter that situation, I'd be curious what
rm -fv /etc/subgid.lock
reports. The '-f' is for force, which honestly won't help here, because all it does, IIRC, is force rm to omit error reporting if the target doesn't exist -- I think. The '-v' is probably more useful: verbose reporting of what rm encounters when it tries to take the requested action.
GNU rm is (again, IIRC) just a wrapper around the unlink(2) syscall, which removes a specified hardlink to an inode (/socket, FIFO, device), or in the case of a symlink, removes that. So, basically it's about the same as the unlink(1) command except a bit more featureful.
Ordinarily, I would expect 'rm' (or unlink) to fail only because either there's a read-only mount status in the way (obviously not the case, here), or hardware-level blocking (obviously not the case, here), or the immutable flag having been set (highly unlikely in this case), or ownership / rights issues. But I'm not going to hazard a guess, except, gremlins? ;-> I'm intrigued, anyway.
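A small sketch for gathering those facts in one go next time, assuming the usual ls, lsattr, and findmnt tools are available. The function name is made up, and lsattr may simply report an error on filesystems that don't support attributes.

```shell
# Print the facts that usually explain a failed unlink: permissions on the
# file and its parent directory, file attributes (an "i" flag means
# immutable), and the mount options of the filesystem holding the file.
why_no_unlink() {
    f=$1
    dir=$(dirname "$f")
    echo "file:   $(ls -ld "$f" 2>&1)"
    echo "parent: $(ls -ld "$dir" 2>&1)"
    # lsattr is not supported on every filesystem; its error is just shown.
    echo "attrs:  $(lsattr -d "$f" 2>&1)"
    if command -v findmnt >/dev/null 2>&1; then
        echo "mount:  $(findmnt -n -o TARGET,OPTIONS -T "$f" 2>&1)"
    fi
    return 0
}

# Example:
#   why_no_unlink /etc/subgid.lock
#   rm -fv /etc/subgid.lock; echo "rm exit status: $?"
```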
As you suggest, the real long-term fix is a bug report on someone's buggy code in useradd or in something calling adduser. I gather that the latter is a known problem: https://askubuntu.com/questions/459080/useradd-cannot-lock-etc-subuid-try-ag...
(Note someone's suggestion, in the cited case, that something might be running multiple instances of useradd simultaneously.)
However, that sort of contention over /etc/subgid.lock ought to show up in fuser / lsof, which you say doesn't check out -- so I'm back to being intrigued.
I'm considering my options to get us out of this reoccurring issue in the future. I'm thinking of just a cron job on each machine that checks for a subgid.lock file sticking around for more than a couple days and moving it out of the way, but I'll sleep on it. More clever suggestions welcome ;)
Well, not really. If the unlink syscall (basis of /bin/rm) isn't working, then I don't know of a different way of making the file completely go away. You might think of mv'ing it to a small filesystem (like a ramfs) and then blowing away and re-creating the filesystem -- but unfortunately /bin/mv uses rename() only when moving/renaming the file within the same filesystem. For a cross-filesystem mv, it instead copies the file to the destination and then does an unlink() on the source, so it would run into the same problem.
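Whether a given mv can be a plain rename() depends on both paths being on the same filesystem. A quick way to check, sketched with GNU stat (the function name is made up, and bind mounts can still be an exception):

```shell
# Succeeds if both paths report the same device number, i.e. they live on
# the same filesystem, which is the case where mv can use rename().
same_filesystem() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Example:
#   same_filesystem /etc/subgid.lock /etc && echo "mv within /etc is a rename"
```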
Not that you don't know this already, but a more-satisfactory solution would be to figure out what's bugging /bin/rm.