solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Wed Oct 24, 2012 7:41 am Post subject: [SOLVED] Block device /dev/sda1 is not a valid root device |
|
|
Hello,
I have a Dell PowerEdge SC 1430 system, on which I was running the 2.6.29.6 Gentoo kernel. I decided to upgrade to a later kernel and downloaded the sources for 3.3.8 using emerge. Then I built the new kernel using genkernel. The build completed successfully, the initramfs was created and grub was updated properly. However, when I try to boot into the new kernel I get the following error message:
Code: |
>> Loading modules
>> Determining root device...
!! Block device /dev/sda1 is not a valid root device...
!! Could not find the root block device in .
Please specify another value or: press Enter for the same, type "shell" for a shell, or "q" to skip.
root block device() ::
|
Here is my grub.conf (note that the entries for the working kernel 2.6.29.6 and the broken kernel 3.3.8 are almost identical):
Code: |
title=Gentoo Linux (3.3.8-gentoo)
root (hd1,0)
kernel /boot/kernel-genkernel-x86_64-3.3.8-gentoo root=/dev/ram0 real_root=/dev/sda1 rootfstype=ext3 console=ttyS0,115200
initrd /boot/initramfs-genkernel-x86_64-3.3.8-gentoo
title Gentoo Linux 2.6.29 perfmon Genkernel
root (hd1,0)
kernel /boot/kernel-genkernel-x86_64-2.6.29.6 root=/dev/ram0 real_root=/dev/sda1 quiet console=ttyS0,115200
initrd /boot/initramfs-genkernel-x86_64-2.6.29.6
|
(Also tried without the rootfstype option -- still no luck).
The first suspicion is that I don't have the right SATA drivers compiled, but I think the drivers are all there. The output of lspci -k (when run on the working 2.6.29.6 kernel) is as follows:
Code: |
00:00.0 Host bridge: Intel Corporation 5000V Chipset Memory Controller Hub (rev 92)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 2 (rev 92)
Kernel driver in use: pcieport-driver
00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 3 (rev 92)
Kernel driver in use: pcieport-driver
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
Kernel modules: i5k_amb
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
Kernel modules: i5k_amb
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
Kernel modules: i5k_amb
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 92)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 92)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 92)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 92)
00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)
Kernel driver in use: pcieport-driver
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #4 (rev 09)
Kernel driver in use: uhci_hcd
Kernel modules: uhci-hcd
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
Kernel driver in use: ehci_hcd
Kernel modules: ehci-hcd
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
Kernel driver in use: PIIX_IDE
Kernel modules: ata_piix, pata_acpi
00:1f.2 IDE interface: Intel Corporation 631xESB/632xESB/3100 Chipset SATA IDE Controller (rev 09)
Kernel driver in use: ata_piix
Kernel modules: ata_piix, pata_acpi
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 21)
Kernel driver in use: tg3
Kernel modules: tg3
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
Kernel driver in use: pcieport-driver
02:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
03:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
Kernel driver in use: pcieport-driver
03:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)
Kernel driver in use: pcieport-driver
08:02.0 Serial controller: Lava Computer mfg Inc Lava DSerial-PCI Port A
Kernel driver in use: serial
08:02.1 Serial controller: Lava Computer mfg Inc Lava DSerial-PCI Port B
Kernel driver in use: serial
08:09.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
|
I need drivers ata_piix and pata_acpi for my SATA controller. Both are enabled in the 3.3.8 kernel configuration:
Code: |
CONFIG_ATA_PIIX=y
CONFIG_PATA_ACPI=y
|
Indeed, when the 3.3.8 kernel attempts to boot (and before it stops) it outputs the following messages, suggesting that sda1 is successfully recognized:
Code: |
sd 0:0:0:0: [sda] 156250000 512-byte logical blocks: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 < sda5 >
sd 0:0:0:0: [sda] Attached SCSI disk
|
However, when I drop into the shell and examine the contents of /dev, there is no sda there. All I see is:
Another hint that makes me think that this is not a driver problem is that when I try to boot the same kernel on QEMU (obviously with very different emulated hardware), I get the same error, while the 2.6.29.6 kernel boots on QEMU just fine.
I even tried rebuilding the kernel using the old config from 2.6.29.6 kernel, like this:
Code: |
sudo genkernel --kernel-config=/usr/src/linux-2.6.29.6/.config
|
I verified that the final config was actually based on the one from 2.6.29.6. No luck.
I also tried comparing the diff of the two configurations, the one from the working 2.6.29.6 kernel and the default one created by genkernel for 3.3.8, and enabling in 3.3.8 any options relating to ATA and SATA that were present in 2.6.29.6. No luck.
Here are the contents of my fstab (from the working 2.6.29.6 boot):
Code: |
# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
#/dev/BOOT /boot ext2 noauto,noatime 1 2
/dev/sda1 / ext3 noatime 0 1
/dev/sda2 none swap sw 0 0
/dev/cdrom /mnt/cdrom auto noauto,ro 0 0
|
[EDIT]:
I now have a little more information on what is going on. The system is not properly populating /dev during boot. When I interrupt the working boot sequence (of the 2.6.29 kernel), right after it reports "Loading modules" and attempts to mount root and drop into a shell, I see the following (truncated) contents of /dev. As you can see, it has lots of devices, including sda.
Code: |
0:0:0:0 sda1 tty34
1:0:0:0 sda2 tty35
console sda3 tty36
cpu_dma_latency sda5 tty37
event0 sdb tty38
full sdb1 tty39
hda sdb10 tty4
kmem sdb11 tty40
kmsg sdb5 tty41
loop0 sdb6 tty42
loop1 sdb7 tty43
loop2 sdb8 tty44
|
But when I look at the contents of /dev when the non-working boot is interrupted (also after Loading Modules and attempting to mount root), I see only these three devices in /dev! That's it! Where is everything else? I think the problem is hidden somewhere here.
The reason, I think, is that mdev is not being activated during boot. When I boot the working kernel, I see the following messages on boot:
Code: |
>> Loading modules
:: Scanning for imm... module not found.
>> Activating mdev
>> Determining root device...
|
When I boot the broken kernel, all I see is:
Code: |
>> Loading modules
>> Determining root device...
|
So there is no message stating that mdev was activated!!! Does anybody have any idea how to shed more light on what's going on?
I would sincerely appreciate any help you might offer.
Last edited by solange666 on Fri Oct 26, 2012 12:55 am; edited 3 times in total |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Wed Oct 24, 2012 7:54 am Post subject: |
|
|
Look at the "root" command in your grub.conf: root (hd1,0).
That would be the first partition on the second hard drive detected by
grub, /dev/sdb1, not /dev/sda1. sda1 would be (hd0,0)?
_________________ TIA |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Wed Oct 24, 2012 5:01 pm Post subject: |
|
|
Unfortunately that does not work. I get stuck in the very beginning of the boot sequence. You see, the kernel that boots successfully also uses (hd1, 0). Plus, when I attempt to boot the system on QEMU, with identical grub settings as for the working kernel (but different than those used on a real system), I get the same error!
But thank you very much for your reply. I appreciate your attention. |
|
DONAHUE Watchman
Joined: 09 Dec 2006 Posts: 7651 Location: Goose Creek SC
|
Posted: Thu Oct 25, 2012 3:27 am Post subject: |
|
|
in menuconfig?
Quote: | Device Drivers --->
Generic Driver Options --->
(/sbin/hotplug) path to uevent helper
[*] Maintain a devtmpfs filesystem to mount at /dev |
_________________ Defund the FCC. |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Thu Oct 25, 2012 5:55 am Post subject: |
|
|
Had tried this earlier today. Had high hopes for it, but no luck. Sigh. |
|
jrussia Tux's lil' helper
Joined: 29 Aug 2012 Posts: 89 Location: Chicago
|
Posted: Thu Oct 25, 2012 6:25 am Post subject: |
|
|
In Device Drivers -> SCSI device support, do you have
SCSI disk support
SCSI generic support
? |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Thu Oct 25, 2012 7:13 am Post subject: |
|
|
Jrussia,
Yes, unfortunately I do have those set. |
|
jrussia Tux's lil' helper
Joined: 29 Aug 2012 Posts: 89 Location: Chicago
|
Posted: Thu Oct 25, 2012 7:56 am Post subject: |
|
|
Have you tried using ata_piix instead of PIIX_IDE here?
Code: |
IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
Kernel driver in use: PIIX_IDE
Kernel modules: ata_piix, pata_acpi |
|
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Thu Oct 25, 2012 8:45 am Post subject: |
|
|
I haven't, but you see, I don't think this would be right, because that lspci output was from the working kernel 2.6.29, so apparently that driver works just fine there... Plus I have the very same problem on QEMU, with very different hardware. All devices appear to be identified correctly (judging by console messages during boot and comparing to a successful boot), but then /dev/hda (it's not sda on QEMU) is not found. I suspect that initramfs is not creating /dev/sda (or hda) when it should. Will verify tomorrow.
Good night and thank you for your suggestions. |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Thu Oct 25, 2012 9:13 am Post subject: |
|
|
What version of grub is this? Device numbering changes in grub2.
Assuming it is not grub2, what are the contents of /boot/grub/device.map?
(In grub1, (hd0) is by default the first hard drive. I did not see a second
hard drive in your lspci output, so I am wondering what grub is going
to think (hd1) refers to.)
You could have your root filesystem on an sdb (hd1 to grub1) if one was
actually installed, but your fstab would be different than if the root
partition was on the first hard drive.
A plugged-in usb stick should not change this (if you have a hard
drive and a usb stick attached, the hard drive comes up as sda
and the usb device as sdb, i.e. hd0 and hd1 to grub1.)
_________________ TIA |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Thu Oct 25, 2012 4:25 pm Post subject: |
|
|
Thank you for your suggestion. I doubt that device naming could be the problem. The 2.6.29 kernel boots on the same system under the same grub just fine off the /dev/sda1 device. 3.3.8 (or 3.6.3) does not. When I try to boot on QEMU, which uses /dev/hda1 as the boot device, I see the same situation: 2.6.29 boots, but the 3.* kernels do not.
To answer your questions, the grub version is: grub (GNU GRUB 0.97)
The contents of device.map are:
Code: |
(fd0) /dev/fd0
(hd0) /dev/sda
(hd1) /dev/sdb
|
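For anyone following the (hd1,0) discussion, here is a minimal sketch of how GRUB legacy notation lines up with Linux device names, given a device.map like the one above. The function name grub_to_linux is made up for illustration. The sketch only reads device.map, which records GRUB's guess at the drive order, so the order the BIOS actually presents at boot may differ.

```shell
# Hypothetical helper (not part of GRUB): translate a GRUB legacy
# drive/partition pair into a Linux device name using device.map.
# GRUB legacy counts partitions from 0, Linux counts them from 1.
grub_to_linux() {
    # $1: path to device.map, $2: GRUB drive (e.g. hd1), $3: GRUB partition
    disk=$(awk -v d="($2)" '$1 == d { print $2 }' "$1")
    [ -n "$disk" ] || return 1      # drive not listed in device.map
    echo "${disk}$(( $3 + 1 ))"
}
```

With the device.map shown above, `grub_to_linux /boot/grub/device.map hd1 0` prints /dev/sdb1 and `grub_to_linux /boot/grub/device.map hd0 0` prints /dev/sda1.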
I now have a little more information on what is going on. The system is not properly populating /dev during boot. When I interrupt the working boot sequence (of the 2.6.29 kernel), right after it reports "Loading modules" and attempts to mount root and drop into a shell, I see the following (truncated) contents of /dev. As you can see, it has lots of devices, including sda.
Code: |
0:0:0:0 sda1 tty34
1:0:0:0 sda2 tty35
console sda3 tty36
cpu_dma_latency sda5 tty37
event0 sdb tty38
full sdb1 tty39
hda sdb10 tty4
kmem sdb11 tty40
kmsg sdb5 tty41
loop0 sdb6 tty42
loop1 sdb7 tty43
loop2 sdb8 tty44
|
But when I look at the contents of /dev when the non-working boot is interrupted (also after Loading Modules and attempting to mount root), I see only these three devices in /dev! That's it! Where is everything else? I think the problem is hidden somewhere here.
|
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Thu Oct 25, 2012 11:34 pm Post subject: |
|
|
New update: the reason why /dev is not being populated properly, I think, is that mdev is not being activated during boot. When I boot the working kernel, I see the following messages on boot:
Code: |
>> Loading modules
:: Scanning for imm... module not found.
>> Activating mdev
>> Determining root device...
|
When I boot the broken kernel, all I see is:
Code: |
>> Loading modules
>> Determining root device...
|
So no message stating that mdev was activated!!!
I made sure that DEVTMPFS, SYSFS and HOTPLUG are enabled:
Code: |
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_SYSFS=y
CONFIG_HOTPLUG=y
|
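A quick way to repeat this kind of check is sketched below. The function name check_kconfig is hypothetical, and the config path in the usage note is an assumption (the usual Gentoo symlink), so adjust it to wherever your kernel config actually lives.

```shell
# Hypothetical helper: report every listed option that is not built in
# (=y) in the given kernel config file. Returns non-zero if any is missing.
check_kconfig() {
    cfg=$1; shift
    rc=0
    for opt in "$@"; do
        grep -q "^CONFIG_${opt}=y\$" "$cfg" || {
            echo "not built in: CONFIG_${opt}"
            rc=1
        }
    done
    return $rc
}
```

Usage, assuming the config is at /usr/src/linux/.config: `check_kconfig /usr/src/linux/.config DEVTMPFS DEVTMPFS_MOUNT SYSFS HOTPLUG`. It prints nothing when all four options are set to y.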
Does anybody have any idea how to shed more light on what's going on? Perhaps how I could add more debugging messages to the init script packaged into the initramfs to see why mdev is not activated?
Thanks! |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Fri Oct 26, 2012 12:35 am Post subject: |
|
|
I am pretty sure I know what the problem is. I dug into why mdev is not being activated. It should be activated from the init script (which I found by decompressing the initramfs created by genkernel), by the following call:
Code: |
# Start device manager
start_dev_mgr
|
The function is defined in /etc/initrd.scripts in the initramfs. I checked this script, and here is how the function is defined:
Code: |
start_dev_mgr() {
if [ "${KV_2_6_OR_GREATER}" ]
then
cd /sys
[ "${DO_slowusb}" ] && sdelay
check_slowusb
[ "${FORCE_slowusb}" ] && sdelay
good_msg 'Activating mdev'
runmdev
[ "${DO_slowusb}" ] || \
[ "${FORCE_slowusb}" ] && sdelay
cd /
fi
}
|
So my first question is: does this if-statement get triggered? So I went and looked at how KV_2_6_OR_GREATER gets set. And, here comes the best part, here is how:
Code: |
if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
then
KV_2_6_OR_GREATER="yes"
fi
|
So, if we are running version 2.6, 2.7, 2.8 or 2.9, i.e. anything where the major version is 2 and the minor version is at least 6, we'll be fine. BUT as soon as the major version goes above 2, which is what happens with my 3.* kernels, KV_2_6_OR_GREATER does not get set! So mdev does not get activated, and probably a lot of other bad things happen too, and the boot fails as a result. Apparently the smart person who wrote the script never thought we'd go to 3 as the major version, but here we are.
I have not yet verified that fixing this will solve the problem, but it looks pretty likely that it will. Now, my question is: after I fix the buggy script, how do I get it back into initramfs? I decompressed it, how do I compress it back? (I probably need to search online for that). I don't want to create initramfs myself from scratch, I'd rather use the genkernel's version, so I just need to know how to re-compress the initramfs, which was originally created by genkernel, and which I decompressed. |
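The check quoted above can be exercised outside the initramfs. The sketch below wraps the quoted condition in a function and puts one possible correction next to it. The function names kv_check_old and kv_check_fixed are made up for illustration, and the corrected condition is an assumption about what a fix could look like, not necessarily what any genkernel release ships.

```shell
# The condition as quoted from the initramfs, wrapped for testing.
# Prints "yes" when KV_2_6_OR_GREATER would be set, "no" otherwise.
kv_check_old() {
    KMAJOR=$1; KMINOR=$2
    if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
    then echo yes; else echo no; fi
}

# One possible correction (an assumption): accept any major version
# of 3 or more, and keep the old rule for 2.x kernels.
kv_check_fixed() {
    KMAJOR=$1; KMINOR=$2
    if [ "${KMAJOR}" -ge 3 ] || { [ "${KMAJOR}" -eq 2 ] && [ "${KMINOR}" -ge 6 ]; }
    then echo yes; else echo no; fi
}
```

`kv_check_old 2 6` prints yes, while `kv_check_old 3 3` prints no, which matches the missing "Activating mdev" message on the 3.x kernels. `kv_check_fixed 3 3` prints yes.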
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Fri Oct 26, 2012 1:07 am Post subject: |
|
|
YES!!! That did it! Here is what I had to do:
1. Decompressed the initramfs (as described in this post):
Code: |
$ mkdir -p /tmp/initramfs
$ cd /tmp/initramfs
$ cp /boot/initramfs-genkernel-x86_64-3.6.3 initramfs.gz
$ gzip -d initramfs.gz
$ sudo cpio -i < initramfs
|
2. Fixed the bug by modifying the following lines in /etc/initrd.defaults:
Code: |
if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
then
KV_2_6_OR_GREATER="yes"
fi
|
so that the check also succeeds for a major version of 3 or greater.
3. Re-compressed initramfs like so (from the directory containing the decompressed initramfs contents):
Code: |
sudo find . | cpio --quiet -H newc -o | gzip -9 -n > initramfs-genkernel-x86_64-3.6.3
|
4. Copied the new initramfs archive to /boot, updated grub to point to the new archive and rebooted.
WOW, this is so amazing how such a small silly bug could cause someone to lose 3 days over this! Hope others find this helpful.
The reason why I ran into this bug could be because my version of genkernel is pretty old. I'll try to upgrade and see if the problem is fixed there. |
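A note for anyone repeating the steps above: if run exactly as listed, the `find .` in step 3 would also pick up the decompressed `initramfs` archive left in /tmp/initramfs, along with the output file being written there. Below is a minimal sketch that keeps the archives outside the extracted tree. The function names unpack_initramfs and repack_initramfs are hypothetical, and the sketch assumes a gzip-compressed newc cpio archive and GNU cpio, as in the commands above. Extracting as a non-root user cannot recreate device nodes, so run it as root if the image contains any.

```shell
# Hypothetical helpers for a gzip-compressed cpio initramfs.

# $1: initramfs archive, $2: directory to extract into (created if missing)
unpack_initramfs() {
    mkdir -p "$2" || return 1
    # -i extract, -d create leading directories as needed
    gzip -dc "$1" | (cd "$2" && cpio -id --quiet)
}

# $1: extracted tree, $2: output archive (keep it outside the tree)
repack_initramfs() {
    (cd "$1" && find . -print | cpio --quiet -H newc -o) | gzip -9 -n > "$2"
}
```

Usage would look like `unpack_initramfs /boot/initramfs-genkernel-x86_64-3.6.3 /tmp/initramfs/tree`, then edit etc/initrd.defaults under that tree, then `repack_initramfs /tmp/initramfs/tree /tmp/initramfs/new.gz` and copy the result to /boot.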
|
PacGyver n00b
Joined: 13 Jun 2006 Posts: 6
|
Posted: Sun Nov 04, 2012 11:45 am Post subject: |
|
|
You made my day! I had already spent many days on this issue...
I could fix the problem by upgrading genkernel-3.4.10.907 -> genkernel-3.4.24_p2
initrd.defaults generated with new genkernel version:
http://pastebin.com/G3wPNpDa
if [ "${KMAJOR}" -ge 3 ] || [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -eq '6' ]
then
KV_2_6_OR_GREATER="yes"
fi |
|
solange666 n00b
Joined: 24 Oct 2012 Posts: 12
|
Posted: Sun Nov 04, 2012 6:50 pm Post subject: |
|
|
You know, to be able to complete the upgrade from 2.6 kernel to 3.6, I also had to upgrade openrc and baselayout as described here. After the mdev issue described in this thread was solved, the system wouldn't boot anyway because the rc script would try to mount the /proc file system twice and bail if it found /proc already mounted. That issue was solved in the newer rc.
The bottom line is that the tools have to match the kernel version, so it's best to upgrade all the system components together. Unfortunately portage won't tell you if the kernel sources you are emerging do not match the genkernel or the openrc version. But, oh well.... One step at a time. |
|