Gentoo Forums
[SOLVED] Block device /dev/sda1 is not a valid root device
solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Wed Oct 24, 2012 7:41 am    Post subject: [SOLVED] Block device /dev/sda1 is not a valid root device

Hello,

I have a Dell PowerEdge SC 1430 system, on which I was running the 2.6.29.6 Gentoo kernel. I decided to upgrade to a later kernel and downloaded the sources for 3.3.8 using emerge. Then I built the new kernel using genkernel. The build completed successfully, the initramfs was created, and grub was updated properly. However, when I try to boot into the new kernel I get the following error message:

Code:
 >> Loading modules
 >> Determining root device...
 !! Block device /dev/sda1 is not a valid root device...
 !! Could not find the root block device in .
  Please specify another value or: press Enter for the same, type "shell" for a shell, or "q" to skip.
root block device() ::


Here is my grub.conf (note that the entries for the working kernel 2.6.29.6 and the broken kernel 3.3.8 are almost identical):

Code:

title=Gentoo Linux (3.3.8-gentoo)
root (hd1,0)
kernel /boot/kernel-genkernel-x86_64-3.3.8-gentoo root=/dev/ram0 real_root=/dev/sda1 rootfstype=ext3  console=ttyS0,115200
initrd /boot/initramfs-genkernel-x86_64-3.3.8-gentoo

title Gentoo Linux 2.6.29 perfmon Genkernel
root (hd1,0)
kernel /boot/kernel-genkernel-x86_64-2.6.29.6 root=/dev/ram0 real_root=/dev/sda1 quiet console=ttyS0,115200
initrd /boot/initramfs-genkernel-x86_64-2.6.29.6



(Also tried without the rootfstype option -- still no luck).

The first suspicion is that I don't have the right SATA drivers compiled, but I think the drivers are all there. The output of lspci -k (when run on the working 2.6.29.6 kernel) is as follows:
Code:
 
00:00.0 Host bridge: Intel Corporation 5000V Chipset Memory Controller Hub (rev 92)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 2 (rev 92)
        Kernel driver in use: pcieport-driver
00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 3 (rev 92)
        Kernel driver in use: pcieport-driver
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
        Kernel modules: i5k_amb
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
        Kernel modules: i5k_amb
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 92)
        Kernel modules: i5k_amb
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 92)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 92)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 92)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 92)
00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)
        Kernel driver in use: pcieport-driver
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci-hcd
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci-hcd
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci-hcd
00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #4 (rev 09)
        Kernel driver in use: uhci_hcd
        Kernel modules: uhci-hcd
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
        Kernel driver in use: ehci_hcd
        Kernel modules: ehci-hcd
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
        Kernel driver in use: PIIX_IDE
        Kernel modules: ata_piix, pata_acpi
00:1f.2 IDE interface: Intel Corporation 631xESB/632xESB/3100 Chipset SATA IDE Controller (rev 09)
        Kernel driver in use: ata_piix
        Kernel modules: ata_piix, pata_acpi
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 21)
        Kernel driver in use: tg3
        Kernel modules: tg3
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
        Kernel driver in use: pcieport-driver
02:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
03:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
        Kernel driver in use: pcieport-driver
03:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)
        Kernel driver in use: pcieport-driver
08:02.0 Serial controller: Lava Computer mfg Inc Lava DSerial-PCI Port A
        Kernel driver in use: serial
08:02.1 Serial controller: Lava Computer mfg Inc Lava DSerial-PCI Port B
        Kernel driver in use: serial
08:09.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)


I need drivers ata_piix and pata_acpi for my SATA controller. Both are enabled in the 3.3.8 kernel configuration:

Code:

CONFIG_ATA_PIIX=y
CONFIG_PATA_ACPI=y


Indeed, when the 3.3.8 kernel attempts to boot (and before it stops) it outputs the following messages, suggesting that sda1 is successfully recognized:

Code:

sd 0:0:0:0: [sda] 156250000 512-byte logical blocks: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 < sda5 >
sd 0:0:0:0: [sda] Attached SCSI disk


However, when I drop into a shell and examine the contents of /dev, there is no sda there. All I see is:

Code:

console  null     tty1
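
A quick way to confirm from the rescue shell that the kernel itself sees the disk, even though /dev is nearly empty, is to read /proc/partitions, which the kernel fills in independently of mdev or udev. The sketch below turns that table into `mknod` commands. It only prints them, it assumes the usual four-column layout (major, minor, #blocks, name), it assumes `awk` is available in the initramfs, and the function name `print_mknod_cmds` is made up for illustration.

```shell
# Print one mknod command per block device listed in a /proc/partitions-style
# table read from stdin. Hypothetical helper, not part of genkernel.
print_mknod_cmds() {
    # Skip the header and the blank line: keep rows whose first field is numeric.
    awk '$1 ~ /^[0-9]+$/ && NF == 4 { printf "mknod /dev/%s b %s %s\n", $4, $1, $2 }'
}

# Usage inside the initramfs shell (needs /proc mounted):
#   print_mknod_cmds < /proc/partitions
```

Running the printed commands by hand would create the missing nodes, which may serve as a stopgap to test whether the rest of the boot proceeds. It does not explain why the nodes were not created in the first place.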


Another hint that makes me think that this is not a driver problem is that when I try to boot the same kernel on QEMU (obviously with very different emulated hardware), I get the same error, while the 2.6.29.6 kernel boots on QEMU just fine.

I even tried rebuilding the kernel using the old config from 2.6.29.6 kernel, like this:

Code:

sudo genkernel --kernel-config=/usr/src/linux-2.6.29.6/.config


I verified that the final config was actually based on the one from 2.6.29.6. No luck.

I also tried comparing the diff of the two configurations, the one from the working 2.6.29.6 kernel and the default one created by genkernel for 3.3.8, and enabling in 3.3.8 any options relating to ATA and SATA that were present in 2.6.29.6. No luck.
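
A comparison like this can be narrowed to storage-related options with a few standard tools. What follows is only a sketch: the function name `missing_opts` and the option prefixes in the default pattern are assumptions chosen for illustration, not an exhaustive list of what a SATA root disk needs.

```shell
# List options matching a prefix pattern that are enabled (=y or =m) in the
# old config but not enabled in the new one. Hypothetical helper.
missing_opts() {
    old_cfg=$1
    new_cfg=$2
    pattern=${3:-'^CONFIG_(ATA|SATA|PATA|SCSI|BLK_DEV)'}
    tmp_old=$(mktemp) && tmp_new=$(mktemp) || return 1
    # Keep only the option names, sorted so that comm can compare them.
    grep -E "$pattern" "$old_cfg" | grep -E '=(y|m)$' | cut -d= -f1 | sort -u > "$tmp_old"
    grep -E "$pattern" "$new_cfg" | grep -E '=(y|m)$' | cut -d= -f1 | sort -u > "$tmp_new"
    # comm -23 prints the lines that appear only in the first file.
    comm -23 "$tmp_old" "$tmp_new"
    rm -f "$tmp_old" "$tmp_new"
}

# Usage (the second path is an example, adjust to the actual source tree):
#   missing_opts /usr/src/linux-2.6.29.6/.config /usr/src/linux-3.3.8-gentoo/.config
```

An empty result suggests that the storage options of the old kernel are all present in the new one, which points away from a missing driver.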

Here are the contents of my fstab (from the working 2.6.29.6 boot):

Code:

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
#/dev/BOOT              /boot           ext2            noauto,noatime  1 2
/dev/sda1               /               ext3            noatime         0 1
/dev/sda2               none            swap            sw              0 0
/dev/cdrom              /mnt/cdrom      auto            noauto,ro       0 0


[EDIT]:

I now have a little more information on what is going on. The system is not properly populating /dev during boot. When I interrupt the working boot sequence (of the 2.6.29 kernel), right after it reports "Loading modules" and attempts to mount root and drop into a shell, I see the following (truncated) contents of /dev. As you can see, it has lots of devices, including sda.


Code:

0:0:0:0             sda1                tty34
1:0:0:0             sda2                tty35
console             sda3                tty36
cpu_dma_latency     sda5                tty37
event0              sdb                 tty38
full                sdb1                tty39
hda                 sdb10               tty4
kmem                sdb11               tty40
kmsg                sdb5                tty41
loop0               sdb6                tty42
loop1               sdb7                tty43
loop2               sdb8                tty44


But when I look at the contents of /dev when the non-working boot is interrupted (also after Loading Modules and attempting to mount root), I see only these three devices in /dev! That's it! Where is everything else? I think the problem is hidden somewhere here.

Code:

console  null     tty1


The reason, I think, is that mdev is not being activated during boot. When I boot the working kernel, I see the following messages on boot:

Code:

>> Loading modules
   :: Scanning for imm... module not found.
>> Activating mdev
>> Determining root device...


When I boot the broken kernel, all I see is:

Code:

>> Loading modules
>> Determining root device...


So there is no message stating that mdev was activated!!! Does anybody have any idea how to shed more light on what's going on?
I would sincerely appreciate any help you might offer.


Last edited by solange666 on Fri Oct 26, 2012 12:55 am; edited 3 times in total

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Wed Oct 24, 2012 7:54 am

Look at the "root" command in your grub.conf:
Code:

root (hd1,0)


That would be the first partition on the second hard drive detected by
grub, /dev/sdb1, not /dev/sda1. sda1 would be
Code:

root (hd0,0)


?
_________________
TIA

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Wed Oct 24, 2012 5:01 pm

Unfortunately that does not work. I get stuck in the very beginning of the boot sequence. You see, the kernel that boots successfully also uses (hd1, 0). Plus, when I attempt to boot the system on QEMU, with identical grub settings as for the working kernel (but different than those used on a real system), I get the same error!

But thank you very much for your reply. I appreciate your attention.

DONAHUE
Watchman

Joined: 09 Dec 2006
Posts: 7651
Location: Goose Creek SC

Posted: Thu Oct 25, 2012 3:27 am

in menuconfig?
Quote:
Device Drivers --->
Generic Driver Options --->
(/sbin/hotplug) path to uevent helper
[*] Maintain a devtmpfs filesystem to mount at /dev

_________________
Defund the FCC.

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Thu Oct 25, 2012 5:55 am

Had tried this earlier today. Had high hopes for it, but no luck. Sigh.

jrussia
Tux's lil' helper

Joined: 29 Aug 2012
Posts: 89
Location: Chicago

Posted: Thu Oct 25, 2012 6:25 am

In Device Drivers -> SCSI device support, do you have
SCSI disk support
SCSI generic support

?

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Thu Oct 25, 2012 7:13 am

Jrussia,

Yes, unfortunately I do have those set. :(

jrussia
Tux's lil' helper

Joined: 29 Aug 2012
Posts: 89
Location: Chicago

Posted: Thu Oct 25, 2012 7:56 am

Have you tried using ata_piix instead of PIIX_IDE here:

Code:
IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
        Kernel driver in use: PIIX_IDE
        Kernel modules: ata_piix, pata_acpi

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Thu Oct 25, 2012 8:45 am

I haven't, but you see, I don't think this would be right, because that lspci output was from the working kernel 2.6.29, so apparently that driver works just fine there... Plus I have the very same problem on QEMU, with very different hardware. All devices appear to be identified correctly (judging by console messages during boot and comparing to a successful boot), but then /dev/hda (it's not sda on QEMU) is not found. I suspect that initramfs is not creating /dev/sda (or hda) when it should. Will verify tomorrow.

Good night and thank you for your suggestions.

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Thu Oct 25, 2012 9:13 am

What version of grub is this? Device numbering changes in grub2.

Assuming it is not grub2, what are the contents of /boot/grub/device.map?

(In grub1, (hd0) is by default the first hard drive. I did not see a second
hard drive in your lspci output, so I am wondering what grub is going
to think (hd1) refers to.)

You could have your root filesystem on an sdb (hd1 to grub1) if one was
actually installed, but your fstab would be different than if the root
partition was on the first hard drive.

A plugged-in usb stick should not change this (if you have a hard
drive and a usb stick attached, the hard drive comes up as sda
and the usb device as sdb, i.e. hd0 and hd1 to grub1).
_________________
TIA
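
The naming rule discussed in this post can be written down as a small translation. The sketch below assumes the default GRUB legacy mapping, where (hdN) is the N-th disk /dev/sd<letter> and partitions are counted from 0. The function name `grub_to_dev` is invented for this example, and a real system can deviate from this mapping if the BIOS orders the disks differently or device.map says otherwise.

```shell
# Translate a GRUB legacy partition spec such as "(hd1,0)" into the Linux
# device name it would have under the default mapping. Hypothetical helper.
grub_to_dev() {
    # Strip the parentheses and the "hd" prefix: "(hd1,0)" -> "1,0"
    spec=$(printf '%s' "$1" | tr -d '()' | sed 's/^hd//')
    disk=${spec%%,*}
    part=${spec##*,}
    # Disk 0 -> a, disk 1 -> b, and so on.
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz' | cut -c $((disk + 1)))
    # GRUB legacy counts partitions from 0, Linux counts from 1.
    printf '/dev/sd%s%s\n' "$letter" $((part + 1))
}

grub_to_dev '(hd1,0)'   # prints /dev/sdb1
grub_to_dev '(hd0,0)'   # prints /dev/sda1
```

Note that in GRUB legacy, `root (hdN,M)` only names the partition from which GRUB loads the kernel and initramfs, while `real_root=` on the kernel line is what the genkernel initramfs uses for the root filesystem, so the two need not refer to the same disk.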

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Thu Oct 25, 2012 4:25 pm

Thank you for your suggestion. I doubt that device naming could be the problem. The 2.6.29 kernel boots on the same system under the same grub just fine off the /dev/sda1 device. 3.3.8 (or 3.6.3) does not. When I try to boot on QEMU, which uses /dev/hda1 boot device, same situation: 2.6.29 boots, but the 3.* kernels do not.

To answer your questions, the grub version is: grub (GNU GRUB 0.97)

The contents of device.map are:

Code:

(fd0)   /dev/fd0
(hd0)   /dev/sda
(hd1)   /dev/sdb


I now have a little more information on what is going on. The system is not properly populating /dev during boot. When I interrupt the working boot sequence (of the 2.6.29 kernel), right after it reports "Loading modules" and attempts to mount root and drop into a shell, I see the following (truncated) contents of /dev. As you can see, it has lots of devices, including sda.


Code:

0:0:0:0             sda1                tty34
1:0:0:0             sda2                tty35
console             sda3                tty36
cpu_dma_latency     sda5                tty37
event0              sdb                 tty38
full                sdb1                tty39
hda                 sdb10               tty4
kmem                sdb11               tty40
kmsg                sdb5                tty41
loop0               sdb6                tty42
loop1               sdb7                tty43
loop2               sdb8                tty44


But when I look at the contents of /dev when the non-working boot is interrupted (also after Loading Modules and attempting to mount root), I see only these three devices in /dev! That's it! Where is everything else? I think the problem is hidden somewhere here.

Code:

console  null     tty1

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Thu Oct 25, 2012 11:34 pm

New update: the reason why /dev is not being populated properly, I think, is that mdev is not being activated during boot. When I boot the working kernel, I see the following messages on boot:

Code:

>> Loading modules
   :: Scanning for imm... module not found.
>> Activating mdev
>> Determining root device...


When I boot the broken kernel, all I see is:

Code:

>> Loading modules
>> Determining root device...


So no message stating that mdev was activated!!!

I made sure that DEVTMPFS, SYSFS and HOTPLUG are enabled:

Code:

CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_SYSFS=y
CONFIG_HOTPLUG=y


Does anybody have any idea how to shed more light on what's going on? Perhaps a way to add more debugging messages to the init script packaged into the initramfs, to see why mdev is not activated?

Thanks!
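
On the question of adding debugging messages: since the init script in the initramfs is an ordinary shell script, one simple approach is a small tracing helper dropped near the top of the script. This is a generic sketch, not something genkernel provides. The name `dbg`, the variable `DEBUG_OUT`, and the message prefix are all made up, and /dev/console is used as the default target because it is one of the few nodes present in the broken boot.

```shell
# Hypothetical tracing helper for an initramfs init script.
# Messages go to the console by default; override DEBUG_OUT to log elsewhere.
DEBUG_OUT=${DEBUG_OUT:-/dev/console}

dbg() {
    # Append so that several messages can be collected in a regular file too.
    printf ' ** [init-debug] %s\n' "$*" >> "$DEBUG_OUT"
}

# Example calls to place around the step under suspicion:
#   dbg "before device manager start"
#   dbg "kernel release: $(uname -r)"
```

Adding `set -x` at the top of the script is a blunter alternative: the shell then prints every command before running it, which can be noisy but needs no changes elsewhere.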

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Fri Oct 26, 2012 12:35 am

I am pretty sure I know what the problem is. I went digging into why mdev is not being activated. It should be activated from the init script (which I found by decompressing the initramfs created by genkernel), by the following call:

Code:

# Start device manager
start_dev_mgr


which is defined in /etc/initrd.scripts in the initramfs. I checked this script, and here is how the function is defined:

Code:

start_dev_mgr() {
        if [ "${KV_2_6_OR_GREATER}" ]
        then
                cd /sys
                [ "${DO_slowusb}" ] && sdelay
                check_slowusb
                [ "${FORCE_slowusb}" ] && sdelay
                good_msg 'Activating mdev'
                runmdev
                [ "${DO_slowusb}" ] || \
                [ "${FORCE_slowusb}" ] && sdelay
                cd /
        fi
}


So my first question is: does this if-statement get triggered? So I went and looked at how KV_2_6_OR_GREATER gets set. And, here comes the best part, here is how:

Code:

if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
then
        KV_2_6_OR_GREATER="yes"
fi


So, if we are running 2.6, 2.7, 2.8 or 2.9, that is, major version 2 with a minor version of at least 6, we'll be fine. BUT as soon as the major version goes above 2, which is what I am doing with my 3.* kernels, KV_2_6_OR_GREATER does not get set! So mdev does not get activated, probably a lot of other bad things happen too, and the boot fails as a result. Apparently whoever wrote the script never expected the major version to reach 3, but here we are.
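
The effect of this test can be reproduced in any shell without rebooting. The sketch below assumes that KMAJOR and KMINOR are the first two dot-separated fields of the kernel release string (that derivation is not shown in this thread, so treat it as an assumption), and `old_check` is a made-up wrapper name.

```shell
# Reproduce the version test from the initramfs for an arbitrary kernel
# release string. Hypothetical wrapper written for illustration.
old_check() {
    KV=$1
    # Assumed derivation: first and second dot-separated fields.
    KMAJOR=$(echo "$KV" | cut -f1 -d.)
    KMINOR=$(echo "$KV" | cut -f2 -d.)
    if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
    then
        echo "yes"
    else
        echo "unset"
    fi
}

old_check 2.6.29.6       # prints yes
old_check 3.3.8-gentoo   # prints unset
```

For any 3.x release the first comparison is already false, so the minor version never matters and the flag stays empty.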

I have not yet verified that fixing this will solve the problem, but it looks pretty likely that it will. Now, my question is: after I fix the buggy script, how do I get it back into initramfs? I decompressed it, how do I compress it back? (I probably need to search online for that). I don't want to create initramfs myself from scratch, I'd rather use the genkernel's version, so I just need to know how to re-compress the initramfs, which was originally created by genkernel, and which I decompressed.

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Fri Oct 26, 2012 1:07 am

YES!!! That did it! Here is what I had to do:

1. Decompress initramfs (as described in this post):

Code:

$ mkdir -p /tmp/initramfs
$ cd /tmp/initramfs
$ cp /boot/initramfs-genkernel-x86_64-3.6.3 initramfs.gz
$ gzip -d initramfs.gz
$ sudo cpio -i < initramfs


2. Fixed the bug by modifying the following lines in /etc/initrd.defaults

Code:

if [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge '6' ]
then
        KV_2_6_OR_GREATER="yes"
fi


to work in case we have a major version of 3 and greater.

3. Re-compressed initramfs like so (from the directory containing the decompressed initramfs contents):

Code:

# Write the archive outside the tree so that find does not pick up the output file itself
sudo find . | cpio --quiet -H newc -o | gzip -9 -n > ../initramfs-genkernel-x86_64-3.6.3


4. Copied the new initramfs archive to /boot, updated grub to point to the new archive and rebooted.
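
Step 2 does not show the edited lines. One possible form of the change is sketched below. It is an assumption about what such an edit could look like, not a copy of the author's file, and `set_kv_flag` is a wrapper invented so the condition can be tried outside the initramfs.

```shell
# Hypothetical rewrite of the version test: sets KV_2_6_OR_GREATER for 2.6
# and later 2.x kernels as well as for any major version above 2.
set_kv_flag() {
    KMAJOR=$1
    KMINOR=$2
    KV_2_6_OR_GREATER=""
    if [ "${KMAJOR}" -ge 3 ] || [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -ge 6 ]
    then
        KV_2_6_OR_GREATER="yes"
    fi
}

set_kv_flag 3 6; echo "3.6 -> ${KV_2_6_OR_GREATER:-unset}"   # prints 3.6 -> yes
set_kv_flag 2 4; echo "2.4 -> ${KV_2_6_OR_GREATER:-unset}"   # prints 2.4 -> unset
```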

WOW, this is so amazing how such a small silly bug could cause someone to lose 3 days over this! Hope others find this helpful.

The reason why I ran into this bug could be because my version of genkernel is pretty old. I'll try to upgrade and see if the problem is fixed there.
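
Before rebooting on a hand-rebuilt archive, it may be worth checking that it still unpacks and contains the files that were edited. The sketch below assumes the archive is a gzip-compressed cpio image with `init` and `etc/initrd.defaults` at its top level, as in the steps above. `check_initramfs` is a hypothetical helper name.

```shell
# Sanity-check a gzip-compressed cpio initramfs. Hypothetical helper.
check_initramfs() {
    archive=$1
    # gzip -t verifies the compressed stream without extracting it.
    gzip -t "$archive" || { echo "not a valid gzip file"; return 1; }
    # cpio -it lists member names; strip the "./" prefix that "find ." adds.
    listing=$(gzip -dc "$archive" | cpio -it 2>/dev/null | sed 's|^\./||')
    for f in init etc/initrd.defaults
    do
        printf '%s\n' "$listing" | grep -qxF "$f" || { echo "missing: $f"; return 1; }
    done
    echo "ok"
}

# Usage:
#   check_initramfs /boot/initramfs-genkernel-x86_64-3.6.3
```

This only checks the structure of the archive. Whether the edited script behaves as intended still has to be confirmed by booting.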

PacGyver
n00b

Joined: 13 Jun 2006
Posts: 6

Posted: Sun Nov 04, 2012 11:45 am

You made my day! I had already spent many days on this issue...
I could fix the problem by upgrading genkernel-3.4.10.907 -> genkernel-3.4.24_p2

initrd.defaults generated with new genkernel version:
http://pastebin.com/G3wPNpDa

if [ "${KMAJOR}" -ge 3 ] || [ "${KMAJOR}" -eq 2 -a "${KMINOR}" -eq '6' ]
then
        KV_2_6_OR_GREATER="yes"
fi

solange666
n00b

Joined: 24 Oct 2012
Posts: 12

Posted: Sun Nov 04, 2012 6:50 pm

You know, to be able to complete the upgrade from 2.6 kernel to 3.6, I also had to upgrade openrc and baselayout as described here. After the mdev issue described in this thread was solved, the system wouldn't boot anyway because the rc script would try to mount the /proc file system twice and bail if it found /proc already mounted. That issue was solved in the newer rc.

The bottom line is that the tools have to match the kernel version, so it's best to upgrade all the system components together. Unfortunately portage won't tell you if the kernel sources you are emerging do not match the genkernel or the openrc version. But, oh well.... One step at a time.

All times are GMT