Gentoo Forums
Kernel keeps renaming root device [UNSOLVABLE]
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Jul 20, 2013 11:34 pm    Post subject: Kernel keeps renaming root device [UNSOLVABLE] Reply with quote

I can no longer boot because the root device I specify is never correct. The root device is supposed to be /dev/sda5. However, it appears that the kernel is now giving it a different name every time I try to boot.

When I try to boot, I almost always get a kernel panic because the kernel has named the device "/dev/sde" or "/dev/sdg" or anything BUT /dev/sda.

If the kernel names the boot device /dev/sdc, I'll edit the line in grub's menu.lst to "root=/dev/sdc5"--but then the kernel switches the name on me again and names the boot device something else. There is no way for me to guess what the kernel is going to name the boot device and I can't find a way to make it stop.

I've tried specifying the root device by UUID in menu.lst. I found some instructions online for doing this. It doesn't work--grub doesn't seem to be able to do that sort of thing. (Then why are there instructions for it?)

I've tried creating udev rules to try to keep the device names from changing, but that didn't work either.

What can I do to keep the device names from changing so I can boot?


Last edited by exhausted on Thu Apr 03, 2014 1:48 am; edited 2 times in total
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Jul 20, 2013 11:36 pm    Post subject: Reply with quote

Specifying the UUID for / in /etc/fstab doesn't seem to help either.
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 31365
Location: 56N 3W

PostPosted: Sun Jul 21, 2013 12:18 am    Post subject: Reply with quote

exhausted,

Explain the storage devices attached to your system and how they are connected.

For finding root by UUID, you need an initrd. It works for me.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sun Jul 21, 2013 12:37 am    Post subject: Reply with quote

Thanks for the quick reply!

Gah! I forgot that the kernel can't interpret UUIDs passed directly to it via a bootloader. No wonder my attempt at getting Grub to work by specifying the UUID didn't work. Since I'm not using an initramfs and would prefer not to, maybe instead of trying to specify the root device by UUID, I should go back to my original plan: Use a rule to force udev to use the device names I specify.

The devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB. The root device is a SATA SSD.
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sun Jul 21, 2013 1:00 am    Post subject: Reply with quote

Okay, here's what I've done:

I've changed the relevant line in menu.lst back to specifying "root=/dev/sda5".

I created a udev rule file, /etc/udev/rules.d/20-persistent_disk_name.rules with the following rule:

Code:
SUBSYSTEM=="scsi", ATTRS{model}=="INTEL SSDSA2CW08", KERNEL=="sd*", SYMLINK+="sda%n"



I still can't boot because the kernel is still changing the name of the boot device. The udev rule doesn't appear to do anything.

Edit: Maybe the udev rules are useless for forcing a specific device name for a boot device? Is it not possible to force the correct root device name using a udev rule?
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sun Jul 21, 2013 1:08 am    Post subject: Reply with quote

I have also tried specifying the root device in /etc/fstab by UUID:

Code:
UUID=ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d      /      reiserfs   defaults,noatime,data=ordered,notail   1 2



This doesn't seem to work either.
VoidMage
Watchman
Watchman


Joined: 14 Oct 2006
Posts: 5270

PostPosted: Sun Jul 21, 2013 1:43 am    Post subject: Reply with quote

If your disk has a GPT partition table, you could boot by PARTUUID (well, if your kernel is recent enough, you could even do it with an MBR partition table, though it's a bit quirky there).
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sun Jul 21, 2013 1:54 am    Post subject: Reply with quote

I'm still using the old MBR partition scheme.

As for the kernel, I'm using 3.8.13-gentoo.

[rant]
This just seems patently insane. The way device names behaved in Linux has worked for many years. The device names were predictable. They didn't just change at fate's whimsy every time a system booted. The new behavior makes it impossible to know what the device names are going to be from one boot to another without--and this is my major gripe--providing an option to retain the old behavior and without providing a practical way to keep the names from changing. I'm all for progress and changes for the better, but this... this is insane! I can't even boot because I don't know what the name of my boot device is going to be!
[/rant]
PaulBredbury
Watchman
Watchman


Joined: 14 Jul 2005
Posts: 7310

PostPosted: Sun Jul 21, 2013 3:20 am    Post subject: Reply with quote

PARTUUID does not need an initrd. Works fine in syslinux:

Code:
LABEL Current
LINUX /boot/3.9.10-x86_64
APPEND root=PARTUUID=00020ed2-01 rootfstype=ext4 usbhid.mousepoll=2 apparmor=1 blah blah

The kernel shows the PARTUUID values on the right-hand side, during bootup.
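
For reference, on an MBR (MSDOS) disk the PARTUUID is built from the 32-bit disk signature and the partition number, written as eight hex digits, a dash, and two hex digits. The sketch below only illustrates that formatting; `mbr_partuuid` is a made-up helper name, and the signature is simply the one from the APPEND line above, not a value for any other disk.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): format an MBR-style PARTUUID
# from a disk signature and a partition number as <8 hex digits>-<2 hex digits>.
mbr_partuuid() {
    # $1 = disk signature as a hex constant, e.g. 0x00020ed2
    # $2 = partition number, e.g. 5 for a partition like sda5
    printf '%08x-%02x\n' "$1" "$2"
}

# The signature below is the one from the syslinux example; yours will differ.
mbr_partuuid 0x00020ed2 1
mbr_partuuid 0x00020ed2 5
```

On a running system, `blkid -s PARTUUID -o value /dev/sda5` should report the real value, assuming the installed util-linux is new enough to show PARTUUID for MBR disks.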

Edit: Hopefully removed confusion of UUID with PARTUUID.


Last edited by PaulBredbury on Sun Jul 21, 2013 4:15 am; edited 1 time in total
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sun Jul 21, 2013 3:34 am    Post subject: Reply with quote

Now I'm confused as all hell. I've read a lot of documentation that specifically states that you can't just pass the UUID of a partition to the Linux kernel as a boot parameter because the kernel can't interpret it. This explains why it doesn't work with grub.

I don't understand how you're getting it to work.

If I try to specify the root device in menu.lst by UUID, it does not work.

How are you getting it to work?

I suspect that you might be confusing UUID with PARTUUID.
The Doctor
Veteran
Veteran


Joined: 27 Jul 2010
Posts: 1266

PostPosted: Sun Jul 21, 2013 4:05 am    Post subject: Reply with quote

Observation: I don't think writing a udev rule is going to do anything because if your kernel can't mount your root partition udev and your rule will not even be loaded as they reside on your root partition. The only way udev will play any role in this is if you are using an initramfs with udev in which case you may as well mount your root partition directly.

Short term possibility: Disconnect your external drives to see if that helps. If the names are still switching randomly, at least you will have a 50% chance of booting.

Oh, and PaulBredbury is using syslinux instead of grub. It may be worth trying a different boot loader to see if that is the problem. Syslinux doesn't have as many features as the new grub, which I find to be a distinct advantage because it makes it much easier to use.
_________________
First things first, but not necessarily in that order.
PaulBredbury
Watchman
Watchman


Joined: 14 Jul 2005
Posts: 7310

PostPosted: Sun Jul 21, 2013 4:14 am    Post subject: Reply with quote

exhausted wrote:
confusing UUID with PARTUUID.

I suppose I am :oops:

So, why don't you forget about udev rules (which run too late to be helpful) and use PARTUUID ;)

I know that PARTUUID works, because my USB-connected phone steals the sda name if it's plugged in during boot 8O What is Linus thinking??
Hu
Watchman
Watchman


Joined: 06 Mar 2007
Posts: 8602

PostPosted: Sun Jul 21, 2013 4:21 am    Post subject: Reply with quote

exhausted wrote:
I'm still using the old MBR partition scheme.

As for the kernel, I'm using 3.8.13-gentoo.

[rant]
This just seems patently insane. The way device names behaved in Linux has worked for many years. The device names were predictable. They didn't just change at fate's whimsy every time a system booted. The new behavior makes it impossible to know what the device names are going to be from one boot to another without--and this is my major gripe--providing an option to retain the old behavior and without providing a practical way to keep the names from changing. I'm all for progress and changes for the better, but this... this is insane! I can't even boot because I don't know what the name of my boot device is going to be!
[/rant]
When did this break for you? I am not aware of any recent changes in the kernel rules for how to name SCSI/SATA devices. However, some systems have been known to exhibit a random discovery order, particularly when using external USB devices.
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 31365
Location: 56N 3W

PostPosted: Sun Jul 21, 2013 10:58 am    Post subject: Reply with quote

exhausted,

You didn't explain the storage devices attached to your system and how they are connected.

Your lspci output would be useful too.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
dwbowyer
Apprentice
Apprentice


Joined: 18 Apr 2008
Posts: 154

PostPosted: Sun Jul 21, 2013 10:32 pm    Post subject: Reply with quote

Not sure this helps OP, but might point in the right direction:

On some systems, Mixed PATA (legacy IDE) and SATA internal drives can exhibit this behavior too, if you have unplugged and replugged one of them. It's not random though, as the names just swap. I've had to unplug all drives and plug them back in, in the order I've wanted them named. It's also why it's not advised to mix CONFIG_IDE in the kernel along with the SATA drivers.
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Aug 03, 2013 9:24 pm    Post subject: Reply with quote

Everything was perfect until an update. I believe that it was either a kernel update or a udev update that caused the problem. The kernel is no longer assigning the name /dev/sda to the boot device. I must be able to specify the boot partition by device name; I can't use UUID or anything else. If the kernel doesn't assign the correct name to the boot device, I can't boot.

NeddySeagoon wrote:
You didn't explain the storage devices attached to your system and how they are connected.

My apologies. My storage devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB. The root device is a SATA SSD which has always been named sda.

Here's my lspci output:


Code:
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])
   Flags: bus master, 66MHz, medium devsel, latency 32
   Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
   I/O behind bridge: 0000a000-0000bfff
   Memory behind bridge: fa400000-fa5fffff
   Capabilities: [c0] HyperTransport: Slave or Primary Interface
   Capabilities: [f0] HyperTransport: Interrupt Discovery and Configuration

00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 LPC
   Flags: bus master, 66MHz, medium devsel, latency 0

00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03) (prog-if 8a [Master SecP PriP])
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 IDE
   Flags: medium devsel
   [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
   [virtual] Memory at 000003f0 (type 3, non-prefetchable)
   [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
   [virtual] Memory at 00000370 (type 3, non-prefetchable)
   I/O ports at ffa0 [size=16]

00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0
   Flags: medium devsel, IRQ 9
   I/O ports at c480 [size=32]

00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 ACPI
   Flags: medium devsel

00:07.5 Multimedia audio controller: Advanced Micro Devices [AMD] AMD-8111 AC97 Audio (rev 03)
   Subsystem: Tyan Computer Device 2885
   Flags: bus master, medium devsel, latency 32, IRQ 17
   I/O ports at c800 [size=256]
   I/O ports at cc00 [size=64]
   Kernel driver in use: snd_intel8x0

00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
   Flags: bus master, 66MHz, medium devsel, latency 32
   Bus: primary=00, secondary=02, subordinate=04, sec-latency=32
   Memory behind bridge: fa600000-fa8fffff
   Prefetchable memory behind bridge: 00000000ca000000-00000000ca1fffff
   Capabilities: [a0] PCI-X bridge device
   Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
   Capabilities: [c0] HyperTransport: Slave or Primary Interface

00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
   Subsystem: Advanced Micro Devices [AMD] Device 36c0
   Flags: bus master, medium devsel, latency 0
   Memory at fa9ff000 (64-bit, non-prefetchable) [size=4K]

00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
   Flags: bus master, 66MHz, medium devsel, latency 32
   Bus: primary=00, secondary=05, subordinate=05, sec-latency=32
   Capabilities: [a0] PCI-X bridge device
   Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration

00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
   Subsystem: Advanced Micro Devices [AMD] Device 36c0
   Flags: bus master, medium devsel, latency 0
   Memory at fa9fe000 (64-bit, non-prefetchable) [size=4K]

00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
   Flags: fast devsel
   Capabilities: [80] HyperTransport: Host or Secondary Interface
   Capabilities: [a0] HyperTransport: Host or Secondary Interface
   Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
   Flags: fast devsel

00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
   Flags: fast devsel

00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
   Flags: fast devsel
   Kernel driver in use: k8temp

01:00.0 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI
   Flags: bus master, medium devsel, latency 32, IRQ 19
   Memory at fa5fd000 (32-bit, non-prefetchable) [size=4K]
   Kernel driver in use: ohci_hcd

01:00.1 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])
   Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI
   Flags: bus master, medium devsel, latency 32, IRQ 19
   Memory at fa5fe000 (32-bit, non-prefetchable) [size=4K]
   Kernel driver in use: ohci_hcd

01:0a.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
   Subsystem: VIA Technologies Inc. M-Audio Delta 1010
   Flags: bus master, medium devsel, latency 32, IRQ 16
   I/O ports at b080 [size=32]
   I/O ports at b000 [size=16]
   I/O ports at ac00 [size=16]
   I/O ports at a880 [size=64]
   Capabilities: [80] Power Management version 1
   Kernel driver in use: snd_ice1712

01:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
   Subsystem: Silicon Image, Inc. SiI 3114 SATALink Controller
   Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
   I/O ports at bc00 [size=8]
   I/O ports at b880 [size=4]
   I/O ports at b800 [size=8]
   I/O ports at b480 [size=4]
   I/O ports at b400 [size=16]
   Memory at fa5ffc00 (32-bit, non-prefetchable) [size=1K]
   Expansion ROM at fa500000 [disabled] [size=512K]
   Capabilities: [60] Power Management version 2
   Kernel driver in use: sata_sil

02:07.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode) (rev 15) (prog-if 00 [Normal decode])
   Flags: bus master, medium devsel, latency 32
   Bus: primary=02, secondary=03, subordinate=03, sec-latency=32
   Memory behind bridge: fa600000-fa6fffff
   Capabilities: [80] Power Management version 2
   Capabilities: [90] CompactPCI hot-swap <?>
   Capabilities: [a0] Vital Product Data

02:08.0 PCI bridge: Pericom Semiconductor Device e111 (rev 02) (prog-if 00 [Normal decode])
   Flags: bus master, 66MHz, medium devsel, latency 32
   Bus: primary=02, secondary=04, subordinate=04, sec-latency=0
   Memory behind bridge: fa700000-fa7fffff
   Prefetchable memory behind bridge: 00000000ca000000-00000000ca0fffff
   Capabilities: [80] PCI-X bridge device
   Capabilities: [a8] Subsystem: Device 0000:0000
   Capabilities: [b0] Express PCI/PCI-X to PCI-Express Bridge, MSI 00
   Capabilities: [d8] Vital Product Data
   Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+

02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)
   Subsystem: Tyan Computer Device 2885
   Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 24
   Memory at fa8e0000 (64-bit, non-prefetchable) [size=64K]
   Expansion ROM at fa8b0000 [disabled] [size=64K]
   Capabilities: [40] PCI-X non-bridge device
   Capabilities: [48] Power Management version 2
   Capabilities: [50] Vital Product Data
   Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
   Kernel driver in use: tg3

03:00.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01) (prog-if 10 [OHCI])
   Subsystem: AFAVLAB Technology Inc Device 702a
   Flags: bus master, medium devsel, latency 32, IRQ 15
   Memory at fa6ff800 (32-bit, non-prefetchable) [size=2K]
   Memory at fa6f8000 (32-bit, non-prefetchable) [size=16K]
   Capabilities: [44] Power Management version 2

03:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])
   Subsystem: Siig Inc Device 131f
   Flags: bus master, medium devsel, latency 32, IRQ 27
   Memory at fa6fd000 (32-bit, non-prefetchable) [size=4K]
   Capabilities: [40] Power Management version 2
   Kernel driver in use: ohci_hcd

03:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])
   Subsystem: Siig Inc Device 131f
   Flags: bus master, medium devsel, latency 32, IRQ 24
   Memory at fa6fe000 (32-bit, non-prefetchable) [size=4K]
   Capabilities: [40] Power Management version 2
   Kernel driver in use: ohci_hcd

03:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04) (prog-if 20 [EHCI])
   Subsystem: Siig Inc Device 00e0
   Flags: bus master, medium devsel, latency 32, IRQ 25
   Memory at fa6ff400 (32-bit, non-prefetchable) [size=256]
   Capabilities: [40] Power Management version 2
   Kernel driver in use: ehci_hcd

04:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
   Subsystem: NEC Corporation uPD720200 USB 3.0 Host Controller
   Flags: bus master, fast devsel, latency 0, IRQ 27
   Memory at fa7fe000 (64-bit, non-prefetchable) [size=8K]
   Capabilities: [50] Power Management version 3
   Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
   Capabilities: [90] MSI-X: Enable- Count=8 Masked-
   Capabilities: [a0] Express Endpoint, MSI 00
   Kernel driver in use: xhci_hcd

06:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-8151 System Controller (rev 14)
   Subsystem: Advanced Micro Devices [AMD] AMD-8151 System Controller
   Flags: bus master, medium devsel, latency 0
   Memory at <ignored> (32-bit, prefetchable) [size=128M]
   Capabilities: [a0] AGP version 3.0
   Capabilities: [c0] HyperTransport: Slave or Primary Interface
   Kernel driver in use: agpgart-amd64

06:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8151 AGP Bridge (rev 14) (prog-if 00 [Normal decode])
   Flags: bus master, 66MHz, medium devsel, latency 32
   Bus: primary=06, secondary=07, subordinate=07, sec-latency=32
   Memory behind bridge: faa00000-feafffff
   Prefetchable memory behind bridge: ca300000-ea2fffff

07:00.0 VGA compatible controller: NVIDIA Corporation NV40 [GeForce 6800 Ultra] (rev a1) (prog-if 00 [VGA controller])
   Flags: bus master, 66MHz, medium devsel, latency 248, IRQ 16
   Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
   Memory at d0000000 (32-bit, prefetchable) [size=256M]
   Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
   [virtual] Expansion ROM at feae0000 [disabled] [size=128K]
   Capabilities: [60] Power Management version 2
   Capabilities: [44] AGP version 3.0
   Kernel driver in use: nvidia
   Kernel modules: nvidia
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Aug 03, 2013 9:33 pm    Post subject: Reply with quote

I've been working on this off and on for so long, I'm probably not far away from giving up. I suspect that there are several courses of action I could try:

  • Try downgrading udev and/or the kernel. This can't be a very good option. I can't hang on to some old version of udev and/or kernel forever.

  • Reinstall from scratch. I really don't want to do that. Even if I reinstall from scratch, what would prevent this exact same problem from happening again?

  • Try upgrading from a backup. I have a complete backup of my system. Unfortunately, that backup is a year old. (I'm currently running that backup at the moment.) Would it be practical to reload from a year-old backup and try updating it? There's also the risk that I'd run into the same problem after updating the kernel and/or udev, whichever it was that screwed up my system.

Do any of those options seem like a good idea?
The Doctor
Veteran
Veteran


Joined: 27 Jul 2010
Posts: 1266

PostPosted: Sat Aug 03, 2013 9:44 pm    Post subject: Reply with quote

Quote:
Do any of those options seem like a good idea?


Not really. You can't update a year old install. Installing from scratch probably won't do it since it won't fix the problem. Downgrading the kernel may help. As I pointed out before, udev isn't a player if you can't mount your root since it resides there.

Better: unplug your external drives and see if that helps.

Or: Play with using a PARTUUID for root. As PaulBredbury said, it works and you can boot with it. You can use UUID for everything else.
_________________
First things first, but not necessarily in that order.
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 31365
Location: 56N 3W

PostPosted: Sat Aug 03, 2013 9:45 pm    Post subject: Reply with quote

exhausted,

Nope, none of those are good ideas, for the reasons you listed.
Updating a one year old Gentoo is an interesting intellectual exercise but a reinstall would be faster.

Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ...

Also post the output of /sbin/blkid
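
When reading blkid output, the tags of interest are UUID (the filesystem's ID) and TYPE. A minimal sketch of pulling one tag out of a single output line follows; `blkid_tag` is a hypothetical helper, and the sample line is assembled from values quoted earlier in the thread, so the exact layout of real output may differ.

```shell
#!/bin/sh
# Hypothetical helper: extract one tag (e.g. UUID or TYPE) from a single
# line of blkid-style output such as: /dev/sda5: UUID="..." TYPE="reiserfs"
blkid_tag() {
    # $1 = tag name, $2 = one line of blkid output
    # The leading whitespace match keeps UUID from matching inside PARTUUID.
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p"
}

# Illustrative sample line (assumed layout), using the UUID from the fstab post.
line='/dev/sda5: UUID="ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d" TYPE="reiserfs"'
blkid_tag UUID "$line"
blkid_tag TYPE "$line"
```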
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ulenrich
Veteran
Veteran


Joined: 10 Oct 2010
Posts: 1122

PostPosted: Sat Aug 03, 2013 10:36 pm    Post subject: Reply with quote

Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it!

The cause could be a very minor mistake on your side: for example, if you used some backup method at the partition level and duplicated UUIDs and labels by restoring to another partition, or something like that ...
_________________
fun2gen2
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Aug 03, 2013 10:52 pm    Post subject: Reply with quote

The Doctor wrote:
unplug your external drives and see if that helps.

That's a great suggestion. I did try it, though, to no avail.

The Doctor wrote:
Play with using a PARTUUID for root.

If I understand correctly, I can try the following:

1. Recompile the 3.3.8 kernel (on my working year-old system, installed from backup) to support PARTUUIDs.

2. Boot the recompiled kernel to find out what the PARTUUIDs are.

3. Chroot into the broken installation, recompile the broken installation's kernel to support PARTUUIDs.

4. Edit the broken installation's fstab to specify / by PARTUUID. (This step isn't necessary at all, is it?)

5. Edit grub's menu.lst to pass
Code:
root=PARTUUID=whatever the PARTUUID turns out to be
to the kernel via GRUB.
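
Step 5 can be sketched as a one-line substitution. This is only an illustration under stated assumptions: `set_root_partuuid` is a made-up helper, the kernel path is invented, and `00020ed2-05` is a placeholder rather than this machine's real PARTUUID. It reads a kernel line on stdin and prints the rewritten line, without touching menu.lst itself.

```shell
#!/bin/sh
# Hypothetical helper: replace whatever root= argument is on a kernel line
# with a PARTUUID-based one. Reads stdin, writes stdout; edits nothing in place.
set_root_partuuid() {
    # $1 = PARTUUID value (placeholder in the example below)
    sed "s|root=[^[:space:]]*|root=PARTUUID=$1|"
}

# Example GRUB legacy kernel line; the kernel file name is an assumption.
echo 'kernel /boot/kernel-3.8.13-gentoo root=/dev/sda5' | set_root_partuuid 00020ed2-05
```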


I'll wait a bit to see if anybody sees any flaws in this plan and then I'll try it, probably tomorrow morning.



Quote:
Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ...

I must be really thick. I've stated that I have two SSDs attached via SATA and two external HDDs attached via USB. This isn't the information you're asking for, is it? (I greatly appreciate your patience. I'm actually a computing technology veteran, but I'm definitely not the sharpest tool in the shed.)
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Aug 03, 2013 10:56 pm    Post subject: Reply with quote

ulenrich wrote:
Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it!

Yes, I suspect that you are quite right--this problem might very well turn out to have a truly ridiculous cause.
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Sat Aug 03, 2013 11:03 pm    Post subject: Reply with quote

It's got to be the kernel. Obviously, (as The Doctor has already pointed out) udev has nothing to do with this. It's got everything to do with how the newer kernel deals with the hardware. There were no hardware or firmware changes. I am absolutely certain of that. The kernel just isn't behaving the same when it comes to assigning bus names.

UPDATE:

I chrooted into the broken installation and compiled a 3.3.8-gentoo kernel for it. I built and installed the kernels and modules. I was able to boot the broken installation using the 3.3.8 kernel!

Everything seemed perfectly fine until about five or six minutes later: The system spontaneously rebooted. ARGH!

I have verified that I can boot the broken installation using an older kernel. Older kernels assign the expected /dev/sda5 bus name to the root partition. However, the system is apparently unstable when booted using an older kernel. It will work for a few minutes, then spontaneously reboot.

I checked my /var/log/messages file (I'm using syslog-ng). Everything looks normal to me except for a machine check error. Here's the last several lines of the log:

Code:
Aug  4 00:13:02 amd64-at login[2833]: ROOT LOGIN  on '/dev/tty2'
Aug  4 00:14:21 amd64-at acpid: client connected from 2870[0:0]
Aug  4 00:14:21 amd64-at acpid: 1 client rule loaded
Aug  4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 4.ntp.bytestacker.com
Aug  4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 5.ticker.cis.sac.accd.edu
Aug  4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 6.sundial.cis.sac.accd.edu
Aug  4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 7.ntppub.tamu.edu
Aug  4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 8.chrono.cis.sac.accd.edu
Aug  4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 9.tick.jpunix.net
Aug  4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 10.ntp.tmc.edu
Aug  4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 11.ac-ntp1.net.cmu.edu
Aug  4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 12.ac-ntp0.net.cmu.edu
Aug  4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 13.ac-ntp2.net.cmu.edu
Aug  4 00:15:47 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug  4 00:15:58 amd64-at acpid: client 2870[0:0] has disconnected
Aug  4 00:15:58 amd64-at acpid: client connected from 2981[0:0]
Aug  4 00:15:58 amd64-at acpid: 1 client rule loaded
Aug  4 00:16:07 amd64-at ntpd_intres[2751]: parent died before we finished, exiting

Aug  4 00:17:26 amd64-at kernel: [Hardware Error]: Machine check events logged

Aug  4 00:18:15 amd64-at acpid: client 2981[0:0] has disconnected
Aug  4 00:18:15 amd64-at acpid: client connected from 3017[0:0]
Aug  4 00:18:15 amd64-at acpid: 1 client rule loaded
Aug  4 00:19:01 amd64-at acpid: client 3017[0:0] has disconnected
Aug  4 00:19:01 amd64-at acpid: client connected from 3044[0:0]
Aug  4 00:19:01 amd64-at acpid: 1 client rule loaded
Aug  4 00:19:38 amd64-at acpid: client 3044[0:0] has disconnected
Aug  4 00:19:38 amd64-at acpid: client connected from 3071[0:0]
Aug  4 00:19:38 amd64-at acpid: 1 client rule loaded
Aug  4 00:23:53 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug  4 00:24:00 amd64-at acpid: client 3071[0:0] has disconnected
Aug  4 00:24:00 amd64-at acpid: client connected from 3159[0:0]
Aug  4 00:24:00 amd64-at acpid: 1 client rule loaded
Aug  4 00:31:43 amd64-at acpid: client 3159[0:0] has disconnected
Aug  4 00:31:43 amd64-at acpid: client connected from 3186[0:0]
Aug  4 00:31:43 amd64-at acpid: 1 client rule loaded
Aug  4 00:32:50 amd64-at acpid: client 3186[0:0] has disconnected
Aug  4 00:32:50 amd64-at acpid: client connected from 3214[0:0]
Aug  4 00:32:50 amd64-at acpid: 1 client rule loaded


What the heck? A machine check exception? I'm inclined to believe that this is not actually a hardware fault. This system has run nonstop for about a week using my backup Gentoo installation with no sign of any hardware problems.
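
To see how often this shows up, the log can be filtered for the kernel's "[Hardware Error]" lines. A minimal sketch, assuming syslog-style lines like the ones above; `count_hw_errors` is a hypothetical helper name and the two sample lines are copied from the excerpt.

```shell
#!/bin/sh
# Hypothetical helper: count kernel "[Hardware Error]" lines in a syslog
# stream read from stdin.
count_hw_errors() {
    grep -c 'kernel: \[Hardware Error\]'
}

# Two lines copied from the log excerpt above; only the second one matches.
printf '%s\n' \
    'Aug  4 00:16:07 amd64-at ntpd_intres[2751]: parent died before we finished, exiting' \
    'Aug  4 00:17:26 amd64-at kernel: [Hardware Error]: Machine check events logged' \
    | count_hw_errors
```

Against the real file this would be run as `count_hw_errors < /var/log/messages`.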
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 31365
Location: 56N 3W

PostPosted: Sun Aug 04, 2013 11:04 am    Post subject: Reply with quote

exhausted,

Your PARTUUID plan is sound, provided that 3.3.8 supports PARTUUIDs for anything other than GPT.
It's fairly new for MSDOS partition tables.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
exhausted
n00b
n00b


Joined: 20 Jul 2013
Posts: 21
Location: gdk_pixbuf Hell

PostPosted: Thu Apr 03, 2014 1:47 am    Post subject: Reply with quote

After many months of trying to solve this problem, I deem this unsolvable.

It's truly a freakish problem. The Linux kernel appears to be assigning device names unpredictably and changing up the names with every boot. Every boot, it's essentially a roll of the dice. This only affects newer versions of the kernel.

I was forced to wipe the SSD and install Gentoo from scratch, which probably turned out to be a great idea. The previous installation was from 2005 and had built up a great deal of cruft. There were lots of configuration files that are no longer used, different files used for some things, the locations of some files have changed--there's just been a lot that's happened since 2005. Installing from scratch got me a much cleaner system.

The new installation uses the latest stable kernel from gentoo-sources with no problems. sda is always sda, sdb is always sdb, et cetera.

Many thanks to everybody for their help with this weird issue.