Gentoo Forums
Locked out of machine after upgrade
Sainted_Trousers
n00b


Joined: 27 Nov 2005
Posts: 42

Posted: Thu Dec 08, 2011 9:25 am    Post subject: Locked out of machine after upgrade

Pardon my long post, and bless your heart for reading it.

A few days ago, I upgraded (emerge -puD system and world), which included OpenRC from v0.8 to v0.9. I did the etc-update thing and kept most of the cfg changes. I'm not sure from where I got the idea now, but I checked my sysinit startup scripts. I found devfs service was started. I couldn't figure out why it was there, since devfs support in the kernel was supposed to be gone (maybe this is a different devfs?). I took it out of sysinit. Shut down...

Next day, the rc scripts throw yellow asterisks and complain about deprecated "opts" and the use of shell arrays in my conf.d files. One complaint was from net.eth0, the other from udev (I think). But the system booted fine, so I log in, fire up X, and everything works. I fix the complaints in the conf.d files. But my Eterm wouldn't give me a prompt - just something about "Task completed" in the title bar and "hit any key to exit..." inside the Eterm window. When I hit a key, the window disappears. I looked through some old forum posts about this problem from six or eight years ago, and after no joy re-installing Eterm and bash, etc-update reported nothing to update, and revdep-rebuild indicated no breakage, I followed the advice of the old posts and unmerged udev, moved /etc/udev to /etc/udev-bad, and reinstalled udev. I also checked /dev/null, and it had the correct permissions (0666). Added devfs back to sysinit. Still no joy, but I thought maybe a reboot may be necessary.

After the reboot, I can't login (I'm not using XDM/GDM/KDM or the sort - I manually start X with startx, and no policy kits or anything of the sort - I manage my own mounts). I can see that all the services have green asterisks - no yellow or red ones. But I can't login as a normal user, nor as root. I'm locked out. My attempt at single-user failed, since init still fires up six pseudo consoles + gettys asking for logins as usual. Maybe I didn't correctly change the kernel command line. I can't ssh, because I'm not running that service.

A (perhaps) useful insight is that after attempting root logins several times, the console says that the last time I logged in was just a few moments prior. That is, apparently it's logging me in, but kicking me off a few seconds later (apparently, it likes to get a good run at it). So it doesn't sound like a password problem (it temporarily logs me in), or a permissions problem in /dev (not even root can log in). This sounds like the Eterm/bash problem, above, but with my login/console shell.

So, you know... this sucks, right?

I haven't had this Gentoo install long enough to even change the oil. Please don't make me reconfigure a new kernel :(
sirlark
Guru


Joined: 25 Oct 2004
Posts: 306
Location: Limerick, Ireland

Posted: Thu Dec 08, 2011 1:35 pm

Is it possible that it's a permission problem on your console devices?

One way to check is to boot off a live CD and chroot into the system. That will be using the devices from the liveCD but the bash from your system. If you can log in and type something, then it's not bash, it's the devices. Whilst you're there, do yourself a favour and enable sshd ;)
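
A minimal sketch of that permission check, runnable from the live CD or from inside the chroot. The helper name show_modes is hypothetical, and the device list in the usage comment is only a guess at which nodes matter for a console login. It reports whatever ls -ld says and makes no judgement about what the modes ought to be.

```shell
# show_modes PATH...
# Print "mode owner:group path" for each path, or "missing path" if it
# does not exist. (Hypothetical helper, for illustration only.)
show_modes() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            # ls -ld fields: 1 = mode string, 3 = owner, 4 = group
            printf '%s %s\n' "$(ls -ld "$p" | awk '{ print $1, $3 ":" $4 }')" "$p"
        else
            printf 'missing %s\n' "$p"
        fi
    done
}

# Example (assumed device list):
#   show_modes /dev/console /dev/tty1 /dev/null
```

Running it once on the live CD's own /dev and once inside the chroot gives two listings that can be compared line by line.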

Let us know about the result
_________________
Adopt an unanswered post today
Sainted_Trousers
n00b


Joined: 27 Nov 2005
Posts: 42

Posted: Thu Dec 08, 2011 7:48 pm

Thanks for replying, and I apologize for the delay.

I booted a Gentoo minimum install CD, then followed the Install Guide for chrooting, to wit:

Code:
# mount /dev/<broken_root> /mnt/gentoo
# cp -L /etc/resolv.conf /mnt/gentoo/etc          # just in case
# mount -t proc none /mnt/gentoo/proc
# mount --rbind /dev /mnt/gentoo/dev
# chroot /mnt/gentoo /bin/bash
# ls /

< contents of install CD, not the broken root >


So the chroot failed. Could it be absent dev files, rather than bad permissions? Is it bash, then? I reinstalled bash before my last reboot (see my first post), and it didn't fix anything. And anyway, unless I can chroot into the broken system, my options are really limited. About all I can do, apart from the nuclear option, is edit text files and possibly fiddle with the services.
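
Since ls / inside the chroot showed the install CD rather than the broken root, one thing worth ruling out is that the first mount never took effect. A small sketch, assuming a Linux-style /proc/mounts; is_mounted is a made-up helper name, and /mnt/gentoo is the mount point from the commands above.

```shell
# is_mounted DIR [MOUNTS_FILE]
# Succeeds if DIR is listed as a mount point in MOUNTS_FILE
# (default: /proc/mounts). Hypothetical helper, for illustration only.
is_mounted() {
    dir=$1
    mounts=${2:-/proc/mounts}
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts"
}

# Example:
#   is_mounted /mnt/gentoo && echo "root is mounted" || echo "nothing mounted there"
```

If nothing is mounted there, the chroot result says little about the broken system either way.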

I don't know how to enable sshd - it's there, in the broken system's /etc/init.d, but how do I start it at boot? Do I put a link to the script somewhere?

I'm not sure what ssh buys me, anyway. Doesn't sshd run bash and pass commands to it and echo results back to me, like Eterm does? Shouldn't I expect the same failures?

Anyway, thanks again.
sirlark
Guru


Joined: 25 Oct 2004
Posts: 306
Location: Limerick, Ireland

Posted: Fri Dec 09, 2011 7:48 am

Sainted_Trousers wrote:
So I assume the chroot failed. Could it be absent dev files, rather than bad permissions?


No, not absent dev files, because in this case /dev in the chroot is populated from the liveCD's /dev. And the permissions would be inherited from the liveCD's /dev too, so it doesn't look like the issue is devfs, udev, or anything else /dev related.

Quote:
Is it bash, then? I reinstalled bash before my last reboot (see my first post), so I'm not sure what else I can do to fix it. Bash configuration files, or something? Should I reinstall it again?

I'm not sure I can enable sshd - I never installed it, so unless it's part of the stage3 tarball or suchlike, I would have to emerge it onto a broken system (yikes; one problem at a time). I'm not sure what ssh buys me, anyway. Does sshd not run a shell and pass commands to it and echo results back to me, like Eterm does? Shouldn't I expect the same failures?


Is it possible that your /etc/passwd file has been corrupted, especially the shell field? Also, can you try booting into the liveCD, mounting <broken root> and executing /mnt/gentoo/bin/bash and pasting the result.
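
A rough sketch of the passwd check, meant to be run from the live CD against the mounted root. check_shells is a hypothetical helper name. It only tests that each account's shell field points at something executable under the given root, and an absolute symlink would be resolved against the running system rather than that root, so treat a MISSING result as a hint, not a verdict.

```shell
# check_shells ROOT
# For every account in ROOT/etc/passwd, report whether its login shell
# (field 7) exists and is executable under ROOT.
check_shells() {
    root=${1%/}
    while IFS=: read -r user pw uid gid gecos home shell; do
        [ -n "$user" ] || continue
        # An empty shell field conventionally means /bin/sh.
        [ -n "$shell" ] || shell=/bin/sh
        if [ -x "$root$shell" ]; then
            printf 'ok      %s %s\n' "$user" "$shell"
        else
            printf 'MISSING %s %s\n' "$user" "$shell"
        fi
    done < "$root/etc/passwd"
}

# Example:
#   check_shells /mnt/gentoo
```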

As far as ssh is concerned, it should be installed as part of system iirc. However, if you can't chroot into your broken root, you can't run rc-update to enable it, so you'll have to sort it out manually: add a symlink pointing to "/etc/init.d/sshd" in "/mnt/gentoo/etc/runlevels/default". Given that chroot doesn't work, I don't expect ssh to work either, but it would be interesting to see.
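
The manual step can be sketched like this, assuming the broken root is mounted at /mnt/gentoo as above. enable_service is a made-up helper name; all it does is create the symlink by hand, with the link target written as the booted system will see it rather than as the live CD sees it.

```shell
# enable_service ROOT SERVICE [RUNLEVEL]
# Create ROOT/etc/runlevels/RUNLEVEL/SERVICE -> /etc/init.d/SERVICE.
# (Hypothetical helper; RUNLEVEL defaults to "default".)
enable_service() {
    root=${1%/}
    svc=$2
    level=${3:-default}
    [ -x "$root/etc/init.d/$svc" ] || { echo "no init script for $svc" >&2; return 1; }
    [ -d "$root/etc/runlevels/$level" ] || { echo "no runlevel $level" >&2; return 1; }
    # The target is relative to the installed system's own root, not the live CD's.
    ln -s "/etc/init.d/$svc" "$root/etc/runlevels/$level/$svc"
}

# Example:
#   enable_service /mnt/gentoo sshd
```

Seen from the live CD the new link may appear dangling or may point at the CD's own script; that is expected, since it only has to resolve once the installed system is the root.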

BTW, is there any info in /var/log/messages or /var/log/dmesg? You should be able to see it from the liveCD
Hu
Moderator


Joined: 06 Mar 2007
Posts: 21633

Posted: Sat Dec 10, 2011 1:42 am

If you have dev-util/strace available, it would be interesting to see the output of strace chroot /mnt/gentoo /bin/bash.
Sainted_Trousers
n00b


Joined: 27 Nov 2005
Posts: 42

Posted: Sat Dec 10, 2011 4:08 am

Sorry again for the delay - the fan sent poo flying everywhere. Perhaps the lavy would be a better option. :)

Looked at /mnt/gentoo/etc/passwd - not much there, and everything seemed in order. Both root and user account had /bin/bash for the shell, and /mnt/gentoo/etc/shadow had mangled passwords for both.
Code:
# /mnt/gentoo/bin/bash
# ps aux
< snip >
root   . . .   /mnt/gentoo/bin/bash
root   . . .   ps aux

So the broken system's bash works fine.

Re sshd: I checked, and it is in /mnt/gentoo/etc/init.d - I put a link to it in /mnt/gentoo/etc/runlevels/default, but the symbolic link specifies /etc/init.d/sshd, as it should. I expect the service to be there at next boot.

The contents of /mnt/gentoo/var/log/messages were interesting. Here are the pertinent lines close to the bottom.
Code:
Dec  8 01:33:50 torii start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  8 01:34:19 torii login[1861]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  8 01:34:19 torii login[1867]: ROOT LOGIN  on '/dev/tty1'
Dec  8 01:34:19 torii login[1861]: pam_unix(login:session): session closed for user root
Dec  8 01:34:45 torii shutdown[1869]: shutting down for system reboot     <-- I hit ctl-alt-del at this point

I know vaguely how pam works, but have never manually configured it before. One of those old posts I mentioned that recommended a reinstall of udev also recommended a reinstall of pam at the same time. I can't recall if my last system/world update included pam, but it's a fair bet. I etc-update'd like 17 configs at the time, and a lot of it is a blur - maybe pam was in there, and maybe not. BTW, user nobody has /bin/false for the shell entry in the passwd file.

I really don't trust myself to fiddle with pam config files, and though I could probably find the files, I wouldn't know where to look inside them, or what I'd be looking for.
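
As a possible starting point for where to look: the log lines above come from the session phase of the login service, so the session entries in the broken root's /etc/pam.d/login are one place to begin. This is only a sketch. pam_session_lines is a hypothetical helper, and the login file may pull in further files by name, which would then need the same treatment. It lists lines; it does not say which one, if any, is at fault.

```shell
# pam_session_lines ROOT SERVICE
# Print the session-type lines of ROOT/etc/pam.d/SERVICE, skipping
# comments and other module types. (Hypothetical helper.)
pam_session_lines() {
    root=${1%/}
    file=$root/etc/pam.d/$2
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # A leading "-" on the type is allowed by PAM, so accept it too.
    grep -E '^[[:space:]]*-?session[[:space:]]' "$file"
}

# Example:
#   pam_session_lines /mnt/gentoo login
```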

Watchman Hu: strace was not in /mnt/gentoo/[bin, usr/bin, sbin, usr/sbin], and it wasn't on the minimum install CD I'm using to rescue my system. I've only got a handful of distro disks on hand, and none of them are specifically rescue disks, so I doubt any of them will have such a specialty (developer/post-mortem) item. But I'll check, and if one does, I'll dump the output. Could you recommend a good rescue disk that would have it? Does Gentoo make one? I'll burn it regardless of what happens here. Thanks.

Which reminds me - I have two Gentoo minimum install CDs, one from Jan 2011 and one from July 2011, and neither of them recognize my USB keyboard (vanilla MS Natural Ergonomic 4000) until I unplug it and plug it back in. This means a) the USB plug is difficult to reach, and my joints aren't what they used to be, and b) I have to loadkeys my layout rather than selecting it from the startup nag. What do you think? Should I bug somebody (else) over in the Install forum, or am I just whining?
Sainted_Trousers
n00b


Joined: 27 Nov 2005
Posts: 42

Posted: Mon Dec 12, 2011 9:26 pm

Looking through the /var/log/messages, I find fishy goings-on with PAM and user 'nobody', who has not appeared anywhere else in my logs going back three weeks. Following is a brief post-mortem timeline. BTW, I don't have an initramfs disk or anything of the sort.

Day 1:
Day after system update, and after a reboot. Module pam_unix starts a session for user nobody less than a microsecond after the kernel starts... long before pamd is started, but the pam_unix module is taking credit for it at the behest of start-stop-daemon (invoked from /etc/init.d scripts, and configured in /etc/pam.d). After kernel initialization, init takes over and another nobody session is started. Two persistent root logins are permitted, and a normal login, not shown. User nobody is never logged out.

Code:
Dec  5 16:51:12 circus kernel: [    0.000000]   4 disabled
Dec  5 16:51:12 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  5 16:51:12 circus kernel: [    0.000000]   5 disabled
--- init starts ---
Dec  5 16:51:17 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  5 16:51:17 circus cron[1847]: (CRON) STARTUP (V5.0)
Dec  5 16:51:22 circus kernel: [   20.762005] eth0: no IPv6 routers present
Dec  5 16:52:04 circus login[1860]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  5 16:52:04 circus login[1866]: ROOT LOGIN  on '/dev/tty1'
Dec  5 16:53:30 circus login[1861]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  5 16:53:30 circus login[1870]: ROOT LOGIN  on '/dev/tty2'

Day 2:
Another session for nobody, again early in kernel initialization, before pamd starts. No root logins attempted, just a normal user - the login was successful and persistent.
Code:
Dec  6 23:58:53 circus kernel: [    0.082541] pnp 00:04: Plug and Play ACPI device, IDs PNP0b00 (active)
Dec  6 23:58:53 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  6 23:58:53 circus kernel: [    0.082549] pnp 00:05: [io  0x0061]
--- init starts ---
Dec  6 23:58:58 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  6 23:59:04 circus kernel: [   20.610004] eth0: no IPv6 routers present
Dec  6 23:59:26 circus login[1845]: pam_unix(login:session): session opened for user trousers by LOGIN(uid=0)

Day 3:
Another session for nobody, before pamd starts. Now the headaches begin. Normal user login (trousers) sessions opened and immediately closed. Root login sessions opened and immediately closed. I'm locked out.
Code:
Dec  7 23:57:12 circus kernel: [    0.081787] pnp 00:00: [io  0x1c00-0x1c80 window]
Dec  7 23:57:12 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  7 23:57:12 circus kernel: [    0.081858] pnp 00:00: Plug and Play ACPI device, IDs PNP0a08 PNP0a03 (active)
--- init starts ---
Dec  7 23:57:17 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  7 23:57:17 circus cron[1868]: (CRON) STARTUP (V5.0)
Dec  7 23:57:22 circus kernel: [   20.066006] eth0: no IPv6 routers present
Dec  7 23:58:47 circus login[1881]: pam_unix(login:session): session opened for user trousers by LOGIN(uid=0)
Dec  7 23:58:48 circus login[1881]: pam_unix(login:session): session closed for user trousers
Dec  7 23:59:01 circus cron[1891]: (root) CMD (rm -f /var/spool/cron/lastrun/cron.hourly)
Dec  7 23:59:13 circus login[1889]: pam_unix(login:session): session opened for user trousers by LOGIN(uid=0)
Dec  7 23:59:13 circus login[1889]: pam_unix(login:session): session closed for user trousers
Dec  7 23:59:36 circus login[1894]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  7 23:59:36 circus login[1895]: ROOT LOGIN  on '/dev/tty1'
Dec  7 23:59:36 circus login[1894]: pam_unix(login:session): session closed for user root
Dec  8 00:00:01 circus cron[1898]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
--- more cron jobs ---
Dec  8 01:10:01 circus cron[1985]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Dec  8 01:13:43 circus login[1896]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  8 01:13:43 circus login[1996]: ROOT LOGIN  on '/dev/tty1'
Dec  8 01:13:43 circus login[1896]: pam_unix(login:session): session closed for user root
Dec  8 01:14:00 circus login[1997]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  8 01:14:00 circus login[1998]: ROOT LOGIN  on '/dev/tty1'
Dec  8 01:14:00 circus login[1997]: pam_unix(login:session): session closed for user root
Dec  8 01:14:18 circus login[1999]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  8 01:14:18 circus login[2000]: ROOT LOGIN  on '/dev/tty1'
Dec  8 01:14:18 circus login[1999]: pam_unix(login:session): session closed for user root
Dec  8 01:20:01 circus cron[2003]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Dec  8 01:30:01 circus cron[2015]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Dec  8 01:32:15 circus shutdown[2026]: shutting down for system reboot
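
The open/close pairs in the excerpt above can be picked out mechanically. A rough sketch, assuming the syslog line layout shown in this post (month, day, time, host, login[PID]:, message). short_sessions is a made-up name, the 2-second default threshold is arbitrary, and sessions that cross midnight are simply ignored.

```shell
# short_sessions LOGFILE [MAX_SECONDS]
# Print login sessions that were closed within MAX_SECONDS (default 2)
# of being opened, as: month day close-time user duration.
short_sessions() {
    log=$1
    max=${2:-2}
    awk -v max="$max" '
        # Convert an HH:MM:SS field to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        /pam_unix\(login:session\): session opened for user/ {
            # $5 is "login[PID]:", which ties the open line to its close line.
            opened[$5] = secs($3); day[$5] = $1 " " $2; next
        }
        /pam_unix\(login:session\): session closed for user/ {
            if (($5 in opened) && day[$5] == $1 " " $2) {
                d = secs($3) - opened[$5]
                if (d >= 0 && d <= max) print $1, $2, $3, $NF, d "s"
            }
            delete opened[$5]
        }
    ' "$log"
}

# Example:
#   short_sessions /mnt/gentoo/var/log/messages
```

Logins that stayed open, like the ones from Day 1 and Day 2, produce no output, so anything printed marks a session that ended almost as soon as it began.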

Reboot same day, same observations. Skip to...

Day 4:
Only notable difference is that user nobody has three sessions opened; one during kernel initialization, and two more after init starts.

Code:
Dec  9 20:52:20 circus kernel: [    0.000000]   4 disabled
Dec  9 20:52:20 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  9 20:52:20 circus kernel: [    0.000000]   5 disabled
--- init starts ---
Dec  9 20:52:27 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  9 20:52:27 circus sshd[1860]: Server listening on 0.0.0.0 port 22.
Dec  9 20:52:27 circus sshd[1860]: Server listening on :: port 22.
Dec  9 20:52:27 circus start-stop-daemon: pam_unix(start-stop-daemon:session): session opened for user nobody by (uid=0)
Dec  9 20:52:27 circus cron[1885]: (CRON) STARTUP (V5.0)
Dec  9 20:52:30 circus kernel: [   18.978013] eth0: no IPv6 routers present
Dec  9 20:53:26 circus login[1898]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Dec  9 20:53:26 circus login[1904]: ROOT LOGIN  on '/dev/tty1'
Dec  9 20:53:26 circus login[1898]: pam_unix(login:session): session closed for user root
Dec  9 20:59:01 circus cron[1907]: (root) CMD (rm -f /var/spool/cron/lastrun/cron.hourly)

Okay, the upshot of all this: GRUB is apparently pulling in pam_unix and /etc/security/start-stop-daemon along with the kernel, since the kernel hasn't even recognized the SATA controllers at the point when 'nobody' is granted a session. My logs are either correct/accurate or incorrect/inaccurate. If correct, grub/kernel/init/pam are seriously screwed up. If incorrect, either someone is messing with the logs (I'm owned) or the logging daemon/init/pam are whacked (nobody logins have not occurred until now, and even if they are legitimately happening during OpenRC system initialization, the logs of those events are getting mixed up with kernel initialization).
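
One reading that nobody in this thread confirms: a syslog daemon can only read the kernel's boot messages once it is running, so early kernel lines may be written out with the time they were read rather than the time they were generated. If that is what happened here, start-stop-daemon lines landing between [0.000000] lines would be ordinary interleaving. A sketch to test the idea, assuming the line layout shown above (boot_estimates is a made-up name): for each kernel line it subtracts the bracketed uptime from the syslog time, which should give roughly the same boot time on every line if the stamps were taken live.

```shell
# boot_estimates LOGFILE
# For each kernel line carrying an "[ uptime ]" stamp, print the syslog
# time, the kernel uptime, and syslog time minus uptime.
boot_estimates() {
    awk '
        # Convert an HH:MM:SS field to seconds since midnight.
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        $5 == "kernel:" && match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets; leading blanks vanish in the numeric conversion.
            up = substr($0, RSTART + 1, RLENGTH - 2) + 0
            est = int(secs($3) - up)
            if (est < 0) est += 86400   # boot happened before midnight
            printf "%s uptime=%.6f boot~%02d:%02d:%02d\n", $3, up, int(est / 3600), int((est % 3600) / 60), est % 60
        }
    ' "$1"
}

# Example:
#   boot_estimates /mnt/gentoo/var/log/messages
```

On the Day 1 excerpt, the eth0 line (16:51:22 at uptime 20.762005) puts boot at about 16:51:01, while the [0.000000] lines are stamped 16:51:12, some 11 seconds later. That is consistent with the early lines having been written out late, though it proves nothing about the login failures themselves.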

I see a lot of WTF moments here, and IMO, the system is beyond repair before a reasonable deadline, if at all. However, if someone feels there is a forensic value to the pursuit, I will keep the system in limbo and respond to inquiries and suggestions. But if I don't hear from anybody in the next day or two, I'll nuke it and do a clean install (ack).

Thank you sirlark and Watchman Hu, in any event.
sirlark
Guru


Joined: 25 Oct 2004
Posts: 306
Location: Limerick, Ireland

Posted: Tue Dec 13, 2011 7:15 am

I've just skimmed through my logs for something similar, and the only nobody references are for SAMBA, FTP, and SSH. My logs span 2 or 3 reboots at least. So this is definitely weird behavior. Maybe keep a copy of your logs and do a fresh install. I do think there might be useful info in your logs, and it should likely be reported to gentoo security as a bug, but I'm afraid it's way past my ability to help you. Sorry