Gentoo Forums
[SOLVED?] EVMS: not a valid root device; Start udevd w/ldap
Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Mon Feb 27, 2006 5:37 pm    Post subject: [SOLVED?] EVMS: not a valid root device; Start udevd w/ldap

Hello,

I installed Gentoo on an Intel EM64T machine using the 2005.1-r1 AMD64 Universal CD with the default 2.6.12-r10 kernel.
I compiled it with genkernel --menuconfig --evms2 all
I selected ramdisk support, raid and device mapper.
Grub.conf contained:
title=INF-BL07 64bit EM64T NOCONA 2.6.12-r10 SCSI EVMS RAID1
root (hd0,0)
kernel /kernel-genkernel-x86_64-2.6.12-gentoo-r10 root=/dev/ram0 init=/linuxrc mem=4096M ramdisk=8192 doscsi vga=0 real_root=/dev/evms/root udev doevms2
initrd /initramfs-genkernel-x86_64-2.6.12-gentoo-r10

This system booted fine.

However, I made the "mistake?" of updating the whole system:
emerge --update --deep --newuse system
emerge --update --deep --newuse world

When I rebooted this system with the 2.6.12-r10 kernel, it "hung" on "Starting udev...".
If I pressed CTRL-C, the init process resumed but "hung again" on "Cleaning /tmp...".
So I commented out some lines in /etc/init.d/bootmisc, in particular:
mkdir -p /tmp/.{ICE,X11}-unix
chown 0:0 /tmp/.{ICE,X11}-unix
chmod 1777 /tmp/.{ICE,X11}-unix
[[ -x /sbin/restorecon ]] && restorecon /tmp/.{ICE,X11}-unix
as well as all the >/dev/null redirections. That allowed the system to boot, though I don't understand why.

I suspected that upgrading the whole system (baselayout 1.11.14-r5) without also building a more recent kernel could have caused udev to "hang", so I compiled the current 2.6.15-r5 kernel:
genkernel --menuconfig --evms2 all
I selected ramdisk support, raid and device mapper.
I updated grub.conf but when I rebooted I got these messages:
>> Activating udev OK
>> Activating EVMS OK
Determining root device...
Block device /dev/evms/root is not a valid root device
The root block device is unspecified or not detected.
Specify device to boot or "shell" for a shell.

So I "shelled" and noticed that even though evms_activate yields no errors/warnings, ls /dev/evms/ only lists "dm" and "dm/control".

Does anyone know why I can't see the /dev/evms/root or /dev/evms/.nodes/sda devices?

How could I debug? Any suggestions?


Last edited by Vieri on Tue Feb 28, 2006 6:53 pm; edited 1 time in total
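
One way to narrow this down from the rescue shell, sketched under the assumption that sysfs is mounted at /sys there: before suspecting EVMS itself, check whether the kernel has registered any SCSI disks at all. The helper name list_scsi_disks is hypothetical, and the messages are only suggestions.

```shell
#!/bin/sh
# Hypothetical helper: print the SCSI disks (sd*) the kernel has registered.
# The first argument is the sysfs block directory; it defaults to /sys/block
# and is only meant to be overridden for testing.
list_scsi_disks() {
    sysfs_block="${1:-/sys/block}"
    found=1
    for dev in "$sysfs_block"/sd*; do
        # The glob stays literal when nothing matches, so test for existence.
        [ -e "$dev" ] || continue
        echo "${dev##*/}"
        found=0
    done
    # Exit status 0 if at least one disk was listed, 1 otherwise.
    return "$found"
}

if list_scsi_disks; then
    echo "SCSI disks are visible to the kernel; look at the EVMS setup next."
else
    echo "No sd* devices found; the SCSI controller driver may be missing."
fi
```

If no sd* device shows up, evms_activate has nothing to scan, which would match an empty /dev/evms/.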

jschellhaass
Guru
Joined: 20 Jan 2004
Posts: 341

Posted: Mon Feb 27, 2006 10:12 pm

What type of SCSI card are you using to boot?

I believe genkernel only puts SATA drivers in the initrd. You can try compiling the disk controller driver into the kernel instead of building it as a module.

jeff

Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Tue Feb 28, 2006 7:20 am

The SCSI cards are LSI 1020 Ultra320 (integrated, one channel).
Two SCSI disks are connected via a PERC 4/im RAID controller.
Will double-check whether the LSI is built into the kernel (actually, genkernel worked fine for 2.6.12 - the problem has arisen for 2.6.15, oddly).
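
A rough sketch of that double-check, assuming the kernel sources and their .config live under /usr/src/linux (the usual Gentoo location, but an assumption here). The option names CONFIG_FUSION and CONFIG_FUSION_SPI are my guess at the relevant Fusion MPT switches for this LSI controller, and kconfig_state is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: report how a kernel option is set in a .config file.
# Prints "built-in", "module", "disabled", or "unknown" if the file is unreadable.
kconfig_state() {
    option="$1"
    config="${2:-/usr/src/linux/.config}"   # assumed default location
    [ -r "$config" ] || { echo "unknown"; return 1; }
    if grep -q "^${option}=y" "$config"; then
        echo "built-in"
    elif grep -q "^${option}=m" "$config"; then
        echo "module"
    else
        # Covers both "# OPTION is not set" and options absent from the file.
        echo "disabled"
    fi
}

# Assumed option names for the Fusion MPT SCSI host drivers.
for opt in CONFIG_FUSION CONFIG_FUSION_SPI; do
    echo "$opt: $(kconfig_state "$opt")"
done
```

Anything reported as "module" or "disabled" will not be available to find the root disk unless the initramfs loads it, so comparing this output between the 2.6.12 and 2.6.15 trees is one way to spot a changed default.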

Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Tue Feb 28, 2006 9:46 am

I tried installing Gentoo with the new 2006.0 amd64 image on a Dell PowerEdge 1855 EM64T. This system only has a USB CD drive. 2006.0 and 2005.1 could not find/detect it. 2005.1-r1 detected it as /dev/sr0 and booted just fine.
It seems that the enhancements made to 2005.1-r1 were not propagated to 2006.0...

Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Tue Feb 28, 2006 12:36 pm

Just in case someone has the same problem, here's how I pinpointed mine (thanks to the evms mailing list).
- EVMS did not detect the disks because the SCSI adapter driver was not built into the kernel (I wrongly assumed 2.6.15 had more or less the same defaults as 2.6.12; if you have the same system, enable the FUSION drivers in the kernel).
- The endless "Starting udevd..." was due to my "special" configuration, and I suppose quite a few users may be in this situation. Authentication on my system is done via LDAP, so nsswitch.conf contains references to ldap. Somehow, the latest stable udev tries to resolve a tss user/group and udevd hangs on that lookup.

To fix this problem there are 2 or 3 quick solutions:
- Upgrade to an unstable udevd (untested and may not work, but the developers are aware of this problem).
- Edit /etc/nsswitch.conf and remove ldap. Of course that's not a permanent solution unless you change your authentication scheme, but at least you will be able to boot.
- Edit /etc/udev/rules.d/50-udev.rules and comment out the entry for the tss user/group (search for KERNEL=="tpm).

There's a Gentoo bug report on this issue: https://bugs.gentoo.org/show_bug.cgi?id=99564

I think the latest udev ebuild was marked stable too soon (LDAP environments weren't tested?).

[EDIT1]:
upgrading to an unstable udevd does not solve the issue (as of Feb 28th 2006)

[EDIT2]:
Commenting out the line
KERNEL=="tpm*", NAME="%k", OWNER="tss", GROUP="tss", MODE="0600"
is a quick solution to avoid udev eternal lookups.
However there's another step that also blocks the init process: "Cleaning /tmp"
/etc/init.d/bootmisc
on the line
chown 0:0 /tmp/.{ICE,X11}-unix

If I comment that line out, the system boots OK (not a definitive solution, though).

System is EM64T (amd64 iso), latest udev and latest baselayout.
nsswitch.conf needs ldap in my case.


Last edited by Vieri on Tue Feb 28, 2006 5:57 pm; edited 1 time in total
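
The rules-file workaround from EDIT2 can be scripted roughly as follows. This is only a sketch: disable_tpm_rule is a hypothetical helper, it assumes the rule begins at the start of a line exactly as quoted above, and it keeps a .bak copy because a later etc-update may replace the file anyway.

```shell
#!/bin/sh
# Hypothetical helper: comment out the active udev rule for tpm* devices.
# Already-commented lines are left alone, so running it twice is harmless.
disable_tpm_rule() {
    rules="${1:-/etc/udev/rules.d/50-udev.rules}"
    # Keep a backup copy and rewrite the rules file from it.
    cp "$rules" "$rules.bak" || return 1
    # In the pattern, \* matches the literal asterisk of KERNEL=="tpm*";
    # in the replacement, & stands for the matched text.
    sed 's/^KERNEL=="tpm\*"/#&/' "$rules.bak" > "$rules"
}

# Usage (as root):
#   disable_tpm_rule /etc/udev/rules.d/50-udev.rules
```

This only avoids the tss user/group lookup during udev startup; the chown in /etc/init.d/bootmisc described above is a separate lookup and is not touched by it.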

skyPhyr
Apprentice
Joined: 17 Sep 2004
Posts: 159
Location: London, UK

Posted: Tue Feb 28, 2006 5:15 pm

Hi Vieri,

Great timing - I've been battling with this exact same problem: https://forums.gentoo.org/viewtopic-p-3146322.html

I can confirm these fixes also work for me.

Cheers,

Alan.

Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Tue Feb 28, 2006 6:05 pm

Glad it could help someone.
Strangely, this udev "bug" I mentioned above was reported 6 months ago.
Hope they at least put a big ewarn for ldap users.

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Sat Mar 04, 2006 7:57 pm

This is weird... I have 2 machines... one is my new ldap test box.. the other is my 'production' box... Today (spring break, yay!) I booted up both. The 'test' box came up just fine, however, the 'production' box didn't. It hung at the udev thing. Your solution (commenting out the TPM device) did the trick.

Here's the kicker... Both boxes have ldap (as server) and have the entries in nsswitch.conf... Both have udev-85... (I did notice -86 is out, still gotta test.) But one box had a problem, and one didn't...

Funky.....

Vieri
l33t
Joined: 18 Dec 2005
Posts: 874

Posted: Sat Mar 04, 2006 9:41 pm

Do both boxes have the same sys-apps/baselayout version?

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Sun Mar 05, 2006 11:04 pm

Vieri wrote:
Do both boxes have the same sys-apps/baselayout version?


Actually, yes, they are the same:

'test' box:
1.12.0_pre16-r1

'production' box:
1.12.0_pre16-r1

Thought I should do updates.. there are updates to both udev and baselayout....

twam
Apprentice
Joined: 15 Feb 2005
Posts: 189
Location: Ammerbuch, Germany

Posted: Mon Mar 13, 2006 10:11 pm

Same problem here with sys-apps/baselayout-1.12.0_pre16-r3 on 2 machines: an EM64T and a Pentium M. :/

net
n00b
Joined: 18 Mar 2006
Posts: 5

Posted: Sat Mar 18, 2006 10:55 pm

:evil: The same problem here after the last emerge -uD world yesterday.
(system stable x86 : Linux sk-srv 2.6.14-hardened-r5 #1 PREEMPT Wed Feb 1 22:17:18 CET 2006 i686 Pentium II (Deschutes) GenuineIntel GNU/Linux)

As a workaround I removed ldap from nsswitch.conf.

Any idea about that?

It's not a big problem at this time, but I'm working on ldap, so it has to work in the future.

Regards

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Sun Mar 19, 2006 7:45 am

I've developed some other problems on my test box (yeah, the tpm bug finally appeared): now a few things lag on start, and I have problems when slapd starts. It tries to bind to itself, plus other nsswitch.conf-related stuff (looking up users). So I'm hesitant about upgrading my production box; however, I think I'm going to do it package by package and see what breaks it...

BernieKe
Tux's lil' helper
Joined: 02 Jul 2002
Posts: 130
Location: California/Bangalore/Belgium

Posted: Thu Mar 30, 2006 6:04 am

putting the following in /etc/ldap.conf fixed the udev problem for me:
Code:
bind_policy soft

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Fri Apr 07, 2006 2:27 am

OK, here's the thing: I upgraded my 'production' box slowly, and it isn't baselayout. It's sys-auth/nss_ldap.
The system was running 239-r1.
As soon as I upgraded to 249 I started having issues.
In my case, when slapd starts it tries to bind to itself (isn't this a bad thing?), and of course so do udev, apache, and other things that start before slapd does.
The only fix I've found, besides downgrading, was the 'bind_policy soft' thingy.
Granted, I don't know which version broke this, but at least we know what package it is. Maybe I should file a bug report? (Though I don't know what to report.)
_________________
Home Desktop: Ryzen 3900X 3.8ghz | 32G Ram | 2x 1TB NVMe
Previous 7 Year Build: Intel i5-2400 3.1ghz | 16G Ram | 1x 60G SSD, 1x 1TB HDD

Ausdonky
n00b
Joined: 12 May 2004
Posts: 15
Location: Brisbane, Oztralia :)

Posted: Tue Apr 18, 2006 10:38 am

Hi guys..

After having spent the last 4 hours thinking that my semi-production box had farked itself after a forced reboot (I was getting segfaults from udev?!), I found out that there was nothing wrong with it. It was ldap. I managed to boot the bugger and then re-enable ldap in the nsswitch.conf file, but of course this was just a temp fix. After re-enabling ldap I rebooted to see if it still had issues, and this time it just hung on udevd. In a fit of rage I gave the keyboard a good whack and then, out of habit, hit Ctrl-C, and to my amazement it booted! I assume this causes udev not to load devices after the point where I break, but it will get you to a shell to fix things if you need to (rather than having to boot a livecd or similar).

BTW, I applied the change described above to the udev rules file and it worked great. I also tried setting bind_policy to soft, but that didn't seem to work.

HTH

Andrew

cantao
Apprentice
Joined: 07 Jan 2004
Posts: 166

Posted: Tue Apr 18, 2006 1:53 pm

I've had the same problem, as described here:

https://forums.gentoo.org/viewtopic-t-448608-highlight-.html

and commenting out the appropriate line on /etc/udev/rules.d/50-udev.rules worked flawlessly. No need to mess with /etc/ldap.conf (yes, I'm using ldap also).

I know it's something that can be easily sent to oblivion by a bad etc-update, but nice hack anyway :)

Thanks a lot, Cantão!

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Wed Apr 19, 2006 4:12 am

cantao wrote:

and commenting out the appropriate line on /etc/udev/rules.d/50-udev.rules worked flawlessly. No need to mess with /etc/ldap.conf (yes, I'm using ldap also).


This works. However, I found that when other services start (like ldap itself, or apache, etc.) they try to bind to ldap, and since, for some reason, it's one of the last things to be started, that fails, so I needed the ldap.conf setting... I wish I knew exactly what caused this in the first place. It was working so fine until that one package update...

McManus
Apprentice
Joined: 10 Apr 2002
Posts: 176
Location: Austin, TX

Posted: Sun Jun 11, 2006 2:54 am

sedorox wrote:
cantao wrote:

and commenting out the appropriate line on /etc/udev/rules.d/50-udev.rules worked flawlessly. No need to mess with /etc/ldap.conf (yes, I'm using ldap also).


This works. However, I found that when other services start (like ldap itself, or apache, etc.) they try to bind to ldap, and since, for some reason, it's one of the last things to be started, that fails, so I needed the ldap.conf setting... I wish I knew exactly what caused this in the first place. It was working so fine until that one package update...


I am experiencing exactly the same thing. Any ideas, short of removing ldap support?
_________________
McManus
----
Linux user #267375 - http://counter.li.org

sedorox
Apprentice
Joined: 13 Feb 2004
Posts: 206

Posted: Fri Jun 23, 2006 7:39 pm

McManus wrote:

I am experiencing exactly the same thing. Any ideas, short of removing ldap support?


Sorry it took me a while to get back to you.... here is what I have changed in my ldap.conf that has seemed to work:

Code:

bind_policy soft
nss_reconnect_tries 3


I also still have the tpm device commented out in /etc/udev/rules.d/50-udev.rules
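
For anyone applying these two settings to several boxes, a small sketch that appends them only when they are not already present. The default path /etc/ldap.conf is the one mentioned earlier in the thread; set_ldap_option is a hypothetical helper name, and it deliberately leaves an existing value untouched rather than rewriting it.

```shell
#!/bin/sh
# Hypothetical helper: add "key value" to an ldap.conf-style file
# unless some value for that key is already configured.
set_ldap_option() {
    key="$1"
    value="$2"
    conf="${3:-/etc/ldap.conf}"
    if grep -q "^[[:space:]]*${key}[[:space:]]" "$conf" 2>/dev/null; then
        echo "$key is already set in $conf, leaving it alone"
    else
        printf '%s %s\n' "$key" "$value" >> "$conf"
    fi
}

# Usage (as root), with the values from the post above:
#   set_ldap_option bind_policy soft
#   set_ldap_option nss_reconnect_tries 3
```

As reported earlier in the thread, bind_policy soft did not help everyone, so treat this as one of the workarounds to try, not a guaranteed fix.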

MorpheuS.Ibis
Tux's lil' helper
Joined: 22 Apr 2006
Posts: 143

Posted: Sun Oct 08, 2006 5:41 pm

I am kind of a n00b at this, but I also use LDAP and udev...
What about making nsswitch.conf a symlink and changing it from the local initscript (/etc/conf.d/local.start)? local starts at the end of the boot process, so the network connection should be up, and so should the LDAP server (if you have it on that machine). Also, change the symlink back when stopping the system (/etc/conf.d/local.stop)...
This really needs the local initscript all to itself (so you don't mess with traffic shaping or something like that while experimenting with LDAP), so creating a separate initscript for it (a copied and slightly edited local should be sufficient) might be a good idea. One more thing... it's too simple to work, but why not give it a try? :wink:
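
The symlink idea could look roughly like this. It is an untested sketch of the proposal, not a known-good fix: the variant file names nsswitch.conf.files and nsswitch.conf.ldap and the helper switch_nsswitch are hypothetical, and both variants are assumed to have been prepared by hand beforehand.

```shell
#!/bin/sh
# Hypothetical helper: point nsswitch.conf at one of two prepared variants.
#   $1  variant name, e.g. "files" (no ldap) or "ldap"
#   $2  directory holding the files (defaults to /etc; override for testing)
switch_nsswitch() {
    variant="$1"
    etc_dir="${2:-/etc}"
    target="nsswitch.conf.$variant"
    if [ ! -f "$etc_dir/$target" ]; then
        echo "missing $etc_dir/$target" >&2
        return 1
    fi
    # -s creates a symlink, -f replaces an existing nsswitch.conf,
    # -n avoids following an existing symlink while replacing it.
    ln -sfn "$target" "$etc_dir/nsswitch.conf"
}

# In local.start, once the network and slapd are up:
#   switch_nsswitch ldap
# In local.stop, so the next boot starts without ldap lookups:
#   switch_nsswitch files
```

One caveat with this design: after an unclean shutdown local.stop never runs, so the next boot would still see the ldap variant.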

All times are GMT