Gentoo Forums
LVM børked; how do I rebuild? [SOLVED]
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Thu Apr 26, 2012 5:49 pm

openrc is 0.9.9.2.

So, since the RAID5 array isn't loading right now, is it safe to assume that this is because mdadm and lvm aren't built static? If that's the case, I should be able to fix it by booting a liveCD, mounting everything and chrooting in, then simply emerging them as static, yes? Will I need to copy any binaries over to /sbin, or does emerge take care of that?

Cheers,

EE
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Thu Apr 26, 2012 6:19 pm

ExecutorElassus,

What version is udev?

If you type
Code:
mount -a
does everything mount?
Building busybox, lvm and mdadm with USE=static is not enough. You have to get the new binaries into your initrd too.
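
A quick way to confirm that a rebuilt binary really is static before copying it into the initrd is to look at what ldd says about it. This is only a sketch: is_static_ldd_output is a made-up helper name, and the two message strings are what glibc's ldd is assumed to print for a binary with no dynamic dependencies.

```shell
# Hypothetical helper: reads the output of 'ldd <binary>' on stdin and
# succeeds if ldd reported that the binary has no dynamic dependencies.
# The two strings below are assumed to be the messages glibc's ldd prints.
is_static_ldd_output() {
    grep -q -e 'not a dynamic executable' -e 'statically linked'
}

# Check the binaries that are meant to go into the initrd.
# Paths are the ones used in this thread; missing ones are skipped.
for bin in /bin/busybox /sbin/mdadm /sbin/lvm.static; do
    [ -e "$bin" ] || continue
    if ldd "$bin" 2>&1 | is_static_ldd_output; then
        echo "$bin: static"
    else
        echo "$bin: dynamic, its libraries would have to go into the initrd too"
    fi
done
```

A binary reported as dynamic would fail inside the initrd unless every library it links against is copied in alongside it.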
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Thu Apr 26, 2012 9:27 pm

Hey Neddy,

I'll have to wait until I get back to my home office Monday evening before I can try mounting everything (I have a music festival this weekend), but I'll report my results once I try.

udev is blocked as per your instructions past 181 (or whatever version starts treating mounting more strictly).

Cheers,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 11:24 am

Hi Neddy,

so, first: 'mount -a' mounts all the drives/partitions without a hitch. No errors in the console, nothing in dmesg. I have udev-171-r5 running.

Now that I have a pager, I can read the dmesg log. I don't see anything too strange. I see complaints about "invalid raid superblock magic on sdX4", which I just take to mean that it isn't v0.90 (and md then says it is consequently not importing the superblock). md126 (the root RAID1 mirror across the three sdX3 partitions) is listed as having an unknown partition table (which makes sense, since it's a logical partition, right?).

Then there is a long gap of info about USB, and udev then starts, before md127 is loaded. md127 appears to load fine (but its included partitions are not mounted).

One more thing: on shutdown ('init 0' or 'init 6') I get an error that md "cannot get exclusive access to md126 [the md array holding / ]," and that the array fails to stop. Could that be causing issues?

Anyway, should I now try to rebuild mdadm, lvm, and busybox as static, and make an initrd?

Thanks,

EE
PS, I notice that if I drop to runlevel 1 - thus shutting down the md arrays - and then reboot, sda4 gets punted off into a single-drive array - md125 - and md127 is stopped (meaning that I have to stop md125, restart md127 degraded, then add /dev/sda4 to it and sit back for a six-hour re-sync). Is there a reason for this? Is there a mismatched UUID somewhere that makes mdadm think the drives are in different arrays? If so, how do I fix it?
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Mon Apr 30, 2012 6:06 pm

ExecutorElassus,

I've been there - it all works when you do it by hand.
I *think*, but it's too difficult to prove, that your openrc is no longer tolerant of udev failures, which have existed for a long time but which are no longer retried.
/usr is not mounted when udev starts, lots of udev things fail, and the retries to pick up the pieces are no longer attempted.

The only way is forward, since you need an initrd anyway.
That's an improved howto over the one I posted earlier in this thread.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 6:21 pm

Hi Neddy,

actually, I think most of this started when I tried to use the earlymount script - I linked it earlier in the thread; it attempts to pre-mount RAID arrays without an initrd - and something *went wrong*.

Anyway, heaven help me, I'll start following your guide once the re-sync is done (in about an hour), and let you know what happens.

Cheers,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 8:41 pm

also, I note that my fstab is listing all my RAID partitions as "/dev/vg/[path]" and not "/dev/mapper/vg-[path-shorthand]"; might that be causing issues?

No matter: I'm replacing them all with UUIDs now, as per your guide. I'll report issues on that thread, and anything that appears to be my system acting schizoid on this one.

Cheers,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 9:58 pm

So, as an example of my system acting wonky, here's what I get when I run 'ldd /sbin/fsck.ext2':
Code:
# ldd /sbin/fsck.ext2
   linux-vdso.so.1 =>  (0x00007fff051ff000)
   libext2fs.so.2 => /lib64/libext2fs.so.2 (0x00007fe85174b000)
   libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007fe851547000)
   libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fe851320000)
   libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fe85111b000)
   libe2p.so.2 => /lib64/libe2p.so.2 (0x00007fe850f13000)
   libc.so.6 => /lib64/libc.so.6 (0x00007fe850b88000)
   libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe85096b000)
   /lib64/ld-linux-x86-64.so.2 (0x00007fe85198e000)
Note the first entry: it appears to be linking somewhere, but has no destination. I can't find the file anywhere, and e2fsprogs emerges fine, so I'm not sure what that is. Any ideas?
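
For what it's worth, linux-vdso.so.1 is not a file on disk: it is a small shared object that the kernel itself maps into every process, which is why ldd lists it without a path. When collecting libraries for an initrd it can simply be skipped. A minimal sketch of that filtering, where ldd_files is a made-up helper name:

```shell
# Hypothetical helper: reads 'ldd' output on stdin and prints only the
# entries that correspond to real files, one path per line.
#   "name => /path (addr)"  -> prints /path
#   "/path (addr)"          -> prints /path (the dynamic loader)
#   "linux-vdso.so.1 => (addr)" has no path and is skipped.
ldd_files() {
    awk '
        $2 == "=>" && $3 ~ /^\// { print $3 }
        $1 ~ /^\//               { print $1 }
    '
}
```

Something like `ldd /sbin/fsck.ext2 | ldd_files` would then give the list of files a dynamically linked fsck.ext2 needs to find at run time.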

thanks,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 10:17 pm

And more on the side of strangeness in setting up your script:

is there a reason why one of the arrays - specifically, the large one holding everything beyond / and /boot - would have a non-hex UUID? Check this out:

Code:
# blkid
--SNIP--
/dev/md126: UUID="74d54c6f-6a2d-47a6-acf3-5a902d13899f" TYPE="ext3"
/dev/md1: UUID="8d1b95b6-6e06-48a7-946a-3b739c8ee637" TYPE="ext2"
/dev/md127: UUID="P1IbQY-JpO7-uBWA-5Jyr-hnRj-jB9S-LbIdsZ" TYPE="LVM2_member"

Note that md127 uses a different numbering system for the UUID. Should I care? Does that have something to do with it being lvm2?

Cheers,

EE
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Mon Apr 30, 2012 10:28 pm

ExecutorElassus,

Code:
/dev/md127: UUID="P1IbQY-JpO7-uBWA-5Jyr-hnRj-jB9S-LbIdsZ" TYPE="LVM2_member"
It's a piece of an LVM2 physical volume.
You don't put that in /etc/fstab as you can't mount it.

Your md1 and md126 hold ext2 and ext3 filesystems. md127 is an LVM member, which will contain filesystems in the individual logical volumes.
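
The three kinds of UUID that turn up in this thread can be told apart by shape alone. The sketch below is a heuristic based only on the examples posted here (grouping and separators), not on anything the tools guarantee; uuid_kind is a made-up name.

```shell
# Hypothetical helper: guess what kind of UUID a string is from its layout.
#   md array   : four groups of 8 characters separated by colons (mdadm -E)
#   filesystem : 8-4-4-4-12 separated by dashes (blkid, ext2/ext3)
#   LVM PV     : 6-4-4-4-4-4-6 separated by dashes (blkid, LVM2_member)
uuid_kind() {
    case $1 in
        ????????:????????:????????:????????)    echo md-array ;;
        ????????-????-????-????-????????????)   echo filesystem ;;
        ??????-????-????-????-????-????-??????) echo lvm-pv ;;
        *)                                      echo unknown ;;
    esac
}

# The values below are copied from the outputs posted in this thread.
uuid_kind d42e5336:b75b0144:a502f2a0:178afc11
uuid_kind 74d54c6f-6a2d-47a6-acf3-5a902d13899f
uuid_kind P1IbQY-JpO7-uBWA-5Jyr-hnRj-jB9S-LbIdsZ
```

Only the filesystem kind belongs in /etc/fstab or on the kernel command line; the md-array kind is what mdadm --assemble wants.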
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 10:35 pm

so, in the example script, where you have:

Code:
# assemble the raid set(s) - they got renumbered from md1, md5 and md6
# /boot
/sbin/mdadm --assemble /dev/md125 --uuid=d678d02e-28ab-84e0-c44c-77eb7ee19756
# don't care if /boot fails to assemble

# /  (root)  I wimped out of root on lvm for this box
/sbin/mdadm --assemble /dev/md126 --uuid=ad5fe0cb-775d-38b4-7169-e221fc96089f || rescue_shell
# if root won't assemble, we are stuck

# LVM for everything else
/sbin/mdadm --assemble /dev/md127 --uuid=52be4797:edab2349:eb21497e:52035eaa || rescue_shell
# and if the LVM space won't assemble there is no /usr or /var so we are really in a mess
# TODO could auto cope with degraded raid operation
I would only modify the first two as appropriate, and comment out the third?

Cheers,

EE
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Mon Apr 30, 2012 10:47 pm

ExecutorElassus,

You need the UUID of the raid set, not of the LVM2 it carries, when you assemble the raid.
The safest way to get the UUID of the raid set is to ask mdadm

Code:
mdadm -E /dev/sda1
will show the UUID of the raid set that /dev/sda1 belongs to.
It's these UUIDs you need to feed to mdadm to assemble the raid sets, not the UUID of any filesystems or LVM2 physical volumes they may carry.
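
Pulling the raid UUID out of the mdadm -E text can be scripted. This is a sketch that assumes the two layouts seen in this thread: a "UUID :" line for 0.90 superblocks and an "Array UUID :" line for 1.2 superblocks. array_uuid is a made-up helper name.

```shell
# Hypothetical helper: reads 'mdadm -E <member>' output on stdin and prints
# the UUID of the raid set. It accepts "UUID : ..." (0.90 superblock) and
# "Array UUID : ..." (1.x superblock) and ignores "Device UUID : ...".
array_uuid() {
    awk -F' : ' '
        $1 ~ /^ *(Array )?UUID *$/ {
            split($2, f, " ")   # drop anything trailing after the UUID itself
            print f[1]
            exit
        }
    '
}
```

Usage would be along the lines of `mdadm -E /dev/sda1 | array_uuid`, with the result passed to `mdadm --assemble ... --uuid=`.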
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Mon Apr 30, 2012 11:00 pm

Hi Neddy,

ah, that's what confused me: the fourth partition on each member drive is a member of a large RAID5 array, which is what is made into /dev/md127. However, that array - as it's an LVM array - does not return a UUID to blkid. That might be useful to add to the guide, since I got confused.

Cheers,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Tue May 01, 2012 12:06 am

Hi Neddy,

I posted to your guide, but moved it here instead because it seems to have more to do with my system acting bizarre.

So, I failed to boot, and got dumped to a shell. mdraid is not starting, and consequently the root partition - or, for that matter, /boot - are not getting mounted, and everything stops. I'm getting error messages about being unable to find a boot disk.

I should note: my / is on a RAID1 array; mirrored across the three partitions. Does that make a difference for the UUID from your setup? I changed the kernel line in grub.conf to use the UUID for the md array on which / resides, but it doesn't seem to work (nor if I set it to /dev/md126) .

Any suggestions?

Cheers,

EE
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Tue May 01, 2012 7:37 pm

ExecutorElassus,

The mdadm --assemble calls need the UUID of the raid, as determined by mdadm -E /dev/<raid_member>
All members of the same raid set carry the same UUID, which is how mdadm finds them
Code:
# mdadm -E /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 9392926d:64086e7a:86638283:4138a597
...
# mdadm -E /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 9392926d:64086e7a:86638283:4138a597
...
Those are two elements of my four-element raid1 /boot.
Feed your corresponding UUID(s) to mdadm to get the raid set(s) assembled. Once the raid is assembled, those UUIDs are of no further interest.
blkid shows
Code:
/dev/md125: UUID="741183c2-1392-4022-a1d3-d0af8ba4a2a8" TYPE="ext2"

/dev/md125 is my /boot. It contains an ext2 filesystem with UUID 741183c2-1392-4022-a1d3-d0af8ba4a2a8, which is the UUID needed to mount /boot

root is similar, so the UUID you need in root=UUID= is that of the filesystem on the root block device, not the UUID of the underlying raid.

mdraid does not start in the initrd. It runs and exits each time it's called to assemble a raid set.
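
To make the root= handling concrete, here is a minimal sketch of how an init script can split the kernel command line into a kind and a value. root_spec is a made-up name and the script in the guide does this differently; the point is only that whatever follows root=UUID= is handed to a filesystem lookup (findfs or similar), so it has to be the filesystem UUID that blkid prints.

```shell
# Hypothetical helper: $1 is the kernel command line (e.g. the contents of
# /proc/cmdline). Prints "UUID <value>", "LABEL <value>" or "DEVICE <value>"
# for the root= argument.
root_spec() {
    for arg in $1; do            # unquoted on purpose: split on whitespace
        case $arg in
            root=UUID=*)  echo "UUID ${arg#root=UUID=}" ;;
            root=LABEL=*) echo "LABEL ${arg#root=LABEL=}" ;;
            root=*)       echo "DEVICE ${arg#root=}" ;;
        esac
    done
}

# Example with the filesystem UUID that blkid showed for /dev/md126.
root_spec "ro root=UUID=74d54c6f-6a2d-47a6-acf3-5a902d13899f"
```

The raid UUID from mdadm -E never appears on the kernel command line in this scheme; it only appears in the mdadm --assemble lines of the init script.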
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Tue May 01, 2012 8:24 pm

Hi Neddy,

I was feeding the bootloader the UUID for the / md filesystem as returned by blkid (that is, what the UUID for "/dev/md126" was showing), with no luck.

Also, is there a reason why the first three partitions return a differently-formatted output for "mdadm -E" than the fourth? The three other partitions all return a table at the bottom listing all three devices, major and minor number, raid device, and state; but 'mdadm -E /dev/sdX4' returns the lines for Device Role and Array State as the last two lines. That may not mean anything (or maybe it's because they're in an lvm group? Or it's a v1.2 superblock?). The sdX4 are also not showing up on the blkid list while in the busybox shell.

'mdadm -A' seems to require the UUID of the array, not of the constituent devices. Is there a flag I use to specify them? None of the devices for sdX3 (those which would constitute / on /dev/md126) have an array UUID returned by 'mdadm -E' so the only UUID I know for this array is that of the filesystem. Is there another way to get to it?

thanks,

EE
UPDATE: I assembled it just using the device nodes (ie, 'mdadm -A /dev/md126 /dev/sda3 /dev/sdb3 /dev/sdc3'), and now it's active. However, the UUID it shows is exactly the one I have in the bootloader command line. Any guess why it wouldn't be working?
UPDATE 2: okay, I've rebooted, and I can assemble all my md arrays in the busybox shell. here's what happens when I run 'blkid':
Code:
#blkid
/dev/md126: UUID="74d54c6f-6a2d-47a6-acf3-5a902d13899f" TYPE="ext3"
/dev/md1: UUID="8d1b95b6-6e06-48a7-946a-3b739c8ee637" TYPE="ext2"
/dev/sdc1: UUID …
Even though md127 is assembled, it does not show in blkid. Is that because it's an lvm device?

Last edited by ExecutorElassus on Tue May 01, 2012 10:09 pm; edited 3 times in total
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Tue May 01, 2012 8:36 pm

ExecutorElassus,

Show me the output of blkid and mdadm -E /dev/sda[1234]
Sight of your initscript would also be good.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Tue May 01, 2012 9:46 pm

Hi Neddy,

blkid output (from within busybox, after assembling the three md arrays) is in "UPDATE 2" above.

'mdadm -E /dev/sda1' (copied manually, please kill me):
Code:
/dev/sda1
         Magic : a92b4efc
       Version : 0.90.00
         UUID  : UUID=707f4cba:12af970b:cb201669:f728008a
Creation Time : Mon Apr 25 18:48:43 2011
   Raid Level : raid1
Used Dev Size: 97536 (95.27 MiB 99.88 MB)
    Array size: 97536 (95.27 MiB 99.88 MB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1

Update Time : Mon Apr 30 23:27:56 2012
          State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : 8bf8aa1f - correct
Events : 67

       Number       Major       Minor           RaidDevice  State
this    0                 8             1                     0            active sync    /dev/sda1

0       0                  8            1                      0           active sync    /dev/sda1
1       1                  8            33                    1           active sync    /dev/sdc1
2       2                  8            17                    2           active sync    /dev/sdb1

'mdadm -E /dev/sda3' (sdX2 are all swap partitions, and not in raid arrays; this array is where / resides):
Code:
/dev/sda3
         Magic : a92b4efc
       Version : 0.90.00
         UUID  : UUID=23a73541:b1ad3343:cb201669:f728008a
Creation Time : Mon Apr 25 18:48:43 2011
   Raid Level : raid1
Used Dev Size: 9765504 (9.31 GiB 10.00 GB)
    Array size: 9765504 (9.31 GiB 10.00 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 126

Update Time : Mon Apr 30 23:28:01 2012
          State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : deb1da64 - correct
Events : 4840

       Number       Major       Minor           RaidDevice  State
this    0                 8             3                     0            active sync    /dev/sda3

0       0                  8            3                      0           active sync    /dev/sda3
1       1                  8            35                    1           active sync    /dev/sdc3
2       2                  8            19                    2           active sync    /dev/sdb3

'mdadm -E /dev/sda4' (this is the lvm for everything else):
Code:
/dev/sda4
         Magic : a92b4efc
       Version : 1.2
Feature Map : 0x0
Array UUID  : d42e5336:b75b0144:a502f2a0:178afc11
         Name : domo-kun:carrier
Creation Time : Mon Apr 25 18:48:43 2011
   Raid Level : raid5
raid Devices : 3

Avail Device Size : 19318413841 ( 921.17GiB 989.10 GB)
    Array size: 3863681024 (1842.35 GiB 1978.20 GB)
Used Dev Size: 1931840512 (921.17 GiB 989.10 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
          State: clean
Device UUID : f7f1d49b:a0272bc3:c46251a2:e0502319

Update Time : Mon Apr 30 23:27:56 2012
   Checksum : 383d5a61 - correct
        Events : 27481

       Layout : left-symmetric
Chunk Size :  512K

Device Role : Active Device 0
Array State : AAA ('A' == active, '.' == missing)

That's the mdadm -E info for all partitions that are raid members on sda.

init:
Code:
#!/bin/busybox sh

rescue_shell() {
    echo "$@"
    echo "Something went wrong. Dropping you to a shell."
    /bin/busybox --install -s
    exec /bin/sh
}

# allow the use of UUIDs or filesystem labels
uuidlabel_root() {
    for cmd in $(cat /proc/cmdline) ; do
         case $cmd in
         root=*)
              type=$(echo $cmd | cut -d= -f2)
              echo "Mounting rootfs"
              if [ "$type" = "LABEL" ] || [ "$type" = "UUID" ] ; then
                 uuid=$(echo $cmd | cut -d= -f3)
                 mount -o ro $(findfs "$type=$uuid")  /mnt/root
              else
                 mount -o ro $(echo $cmd | cut -d= -f2) /mnt/root
              fi
              ;;
         esac
    done
}

check_filesystem() {
    # most of code coming from /etc/init.d/fsck

    local fsck_opts= check_extra= RC_UNAME=$(uname -s)

    #FIXME : get_bootparam forcefsck
    if [ -e /forcefsck ]; then
         fsck_opts="$fsck_opts -f"
         check_extra="(check forced)"
    fi

   echo "Checking local filesystem $check_extra : $1"

   if [ "$RC_UNAME" = Linux ]; then
          fsck_opts="$fsck_opts -C0 -T"
   fi

   trap : INT QUIT

   # using our fsck, not the builtin one from busybox
   /sbin/fsck -p $fsck_opts $1

   ret_val=$?
   case $ret_val in
           0)          return 0;;
           1)          echo "Filesystem repaired"; return 0;;
           2|3)       if [  "$RC_UNAME" = Linux ]; then
                                echo "Filesystem repaired, but reboot needed"
                                reboot -f
                         else
                                rescue_shell "Filesystem still have errors; manual fsck required"
                         fi;;
          4)            if [ "$RC_UNAME" = Linux ]; then
                                rescue_shell "filesystem errors left uncorrected, aborting"
                         else
                                echo "Filesystem repaired, but reboot needed"
                                reboot
                         fi;;
         8)             echo "Operational error"; return 0;;
         16)           echo "Use or Syntax Error"; return 16;;
         32)           echo "fsck interrupted";;
         127)         echo "Shared Library Error"; sleep 20; return 0;;
         *)             echo $ret_val; echo "Some random fsck error - continuing anyway"; sleep 20; return 0;;


      esac
# rescue_shell can't find tty so its broken
      rescue_shell
}

# start for real here

# temporarily mount proc and sys
mount -t proc none /proc
mount -t sysfs none /sys
mount -t devtmpfs none /dev

# disable kernel messages from popping onto the screen
###echo 0 > /proc/sys/kernel/printk
# clear the screen
###clear

# assemble the raid set(s) - they got renumbered from md1, md5 and md6
#/boot
/sbin/mdadm --assemble /dev/md1 --uuid=8d1b95b6-6e06-48a7-946a-3b739c8ee637
# don't care if /boot fails to assemble

# / (root) I wimped out of root on lvm for this box
/sbin/mdadm --assemble /dev/md126 --uuid=74d54c6f-6a2d-47a6-acf3-5a902d13899f || rescue_shell
# if root won't assemble, we are stuck

# LVM for everything else
/sbin/mdadm --assemble /dev/md127 --uuid=d42e5336:b75b0144:a502f2a0:178afc11 || rescue_shell
# and if the LVM space won't assemble there is no /usr or /var so we are really in a mess
# TODO could auto cope with degraded raid operation

# lvm runs as whatever its called as and we need vgchange
ln -s /sbin/lvm.static /sbin/vgchange

# start the vg volume group - we only have one volume group
/sbin/vgchange -ay vg || rescue_shell
# if this failed we have no /usr or /var

# get here with raid sets assembled and logical volumes available

# mounting rootfs on /mnt/root
uuidlabel_root || rescue_shell "Error with uuidlabel_root"

# space separated list of mountpoints that …
mountpoints="/usr /usr/portage /usr/portage/distfiles /var /var/tmp /home /opt /tmp"

# … we want to find in /etc/fstab …
ln -s /mnt/root/etc/fstab /etc/fstab

# … to check filesystems and mount our devices.
for m in $mountpoints ; do

#echo $m

   check_filesystem $m

   echo "Mounting $m"
   # mount the device and …
   mount $m || rescue_shell "Error while mounting $m"

   # … move the tree to its final location
   mount --move $m "/mnt/root"$m || rescue_shell "Error while moving $m"
done

echo "All done. Switching to real root"

# clean up. The init process will remount proc sys and dev later
umount /proc
umount /sys
umount /dev

# switch to the real root and execute init
exec switch_root /mnt/root /sbin/init


See anything wrong?

Cheers,

EE
PS: Within the rescue_shell, I see that /proc /sys and /dev are mounted, which means that the init script has made it at least to that point. I'm assuming the failure is in assembling the / array (/dev/md126), since it doesn't care if /boot fails. Is there any reason to worry about the difference in formatting between how blkid returns /dev/md126, and how it's listed in the init script? ie, that the latter uses colons?
EDIT: I mis-copied the mdadm --assemble line for md126. It's correct now, and matches what blkid returns and is listed in the bootloader.
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Wed May 02, 2012 8:32 pm

ExecutorElassus,

Distilling what you posted:
Code:
/dev/sda1
UUID  : UUID=707f4cba:12af970b:cb201669:f728008a
Preferred Minor : 1

shows that /dev/sda1 belongs to /dev/md1 and /dev/md1 has UUID=707f4cba:12af970b:cb201669:f728008a
Your init script says
Code:
/sbin/mdadm --assemble /dev/md1 --uuid=8d1b95b6-6e06-48a7-946a-3b739c8ee637

Your update 2 shows
Code:
 /dev/md1: UUID="8d1b95b6-6e06-48a7-946a-3b739c8ee637" TYPE="ext2"

It's clear from the above that you are using the UUID of the filesystem on md1 to attempt to assemble md1, not the UUID of the raid.

You have done the same thing for md126 too, so the initrd will not assemble your raid sets.
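
Going by the mdadm -E output posted earlier in the thread, the three assemble lines would presumably end up as printed below. This is a dry-run sketch: assemble_cmd is a made-up helper that only prints each command so it can be compared against the init script, and it runs nothing.

```shell
# Hypothetical dry-run helper: $1 = md device, $2 = raid UUID from 'mdadm -E'.
# Prints the assemble command instead of executing it.
assemble_cmd() {
    printf '/sbin/mdadm --assemble %s --uuid=%s\n' "$1" "$2"
}

# UUIDs copied from the 'mdadm -E' output for sda1, sda3 and sda4 above.
assemble_cmd /dev/md1   707f4cba:12af970b:cb201669:f728008a   # /boot
assemble_cmd /dev/md126 23a73541:b1ad3343:cb201669:f728008a   # /
assemble_cmd /dev/md127 d42e5336:b75b0144:a502f2a0:178afc11   # LVM
```

The md127 line already matched in the posted init script; only the md1 and md126 lines were carrying filesystem UUIDs.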

Code:
/sbin/vgchange -ay vg || rescue_shell
Is your LVM volume group called vg?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Wed May 02, 2012 8:50 pm

'vg' is indeed the name of my volume group.

okay, so I have the UUIDs wrong in my init (and thus likely also in the bootloader). I can fix the former, but how do I fix the latter? Is there a means from within busybox to remake the initrd, or will I have to use a liveCD?

Cheers,

EE
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Wed May 02, 2012 10:57 pm

ExecutorElassus,

You need to use the liveCD to fix the initrd.

The root=UUID= needs to be the UUID of the root filesystem, not that of the underlying raid.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Wed May 02, 2012 11:34 pm

Hi Neddy,

alas! I was afraid you'd say that. It'll have to wait until tomorrow night, when I can spend the time opening up the box and plugging in the old IDE CD drive I keep around for this sole purpose.

Once I've booted into the live CD, I know how to start up the md arrays and mount everything. Do I just edit the init script I have stored to match the correct UUIDs, and then repeat the '/usr/src/linux/scripts/gen_initramfs_list.sh -o /boot/initrd.cpio.gz /root/initrd/initramfs_list ' command from your guide?

so, to be clear:
1) the UUID for md126 in the bootloader is for the filesystem, and is returned by 'blkid'
but
2) the UUID for that same array in the init script is for the raid array, and will thus not match, and is the UUID returned by mdadm -E /dev/sdXn

Is that correct?

Cheers,

EE
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Thu May 03, 2012 9:56 am

Hi Neddy,

so, I updated your script and now it gets past assembling the md arrays, hurrah!

here's my next problem: my mountpoints in the init script are:
Code:
mountpoints="/usr /usr/portage /usr/portage/distfiles /var /var/tmp /home /opt /tmp"

the script checks and mounts /usr fine but on checking /usr/portage, I get an error:
Code:
Checking local filesystem : /usr
/dev/mapper/vg-usr: clean, 748763/1310720 files, 3776000/5242880 blocks
Mounting /usr
kjournald starting. Commit interval 5 seconds
EXT3-fs (dm-0): using internal journal
EXT3-fs (dm-0): mounted filesystem with writeback data mode
Checking local filesystem : /usr/portage
/dev/mapper/vg-portage: clean, 184986/200704 files, 415453/2097152 blocks
Mounting /usr/portage
mount: mounting /dev/dm-1 on /usr/portage failed: No such file or directory
Error while mounting /usr/portage
Something went wrong. Dropping you to a shell


The relevant portion of my /etc/fstab is:
Code:
/etc/fstab:
UUID=7f880ef6-833c-4d19-96fa-524f78e822f8    /usr           ext3         noatime,noauto    1 0
UUID=422e349f-7f3b-4037-9621-1c786e16e48b /usr/portage ext2       noatime,noauto    1 0
UUID=b4335b32-6bcc-44c3-9f85-bf2c91eb400e   /usr/portage/distfiles ext2 noatime, noauto  1 0
which matches the values returned by blkid for those filesystems.

Any suggestions what's going on?
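
One possible explanation, which nobody in the thread confirms: the script mounts each filesystem at its path inside the initrd and then moves it under /mnt/root. Once /usr has been moved away, the initrd's own /usr is an empty directory again, so /usr/portage no longer exists there as a mount point, which would fit the "No such file or directory" message. A small sketch for checking that assumption from the rescue shell; missing_mountpoints is a made-up helper name.

```shell
# Hypothetical helper: $1 = directory to look under ("" for the initrd root),
# remaining arguments = mount points. Prints every mount point whose
# directory does not exist under $1.
missing_mountpoints() {
    root=$1
    shift
    for m in "$@"; do
        [ -d "$root$m" ] || printf '%s\n' "$m"
    done
}

# Compare the initrd root with the real root using the list from the script.
mountpoints="/usr /usr/portage /usr/portage/distfiles /var /var/tmp /home /opt /tmp"
missing_mountpoints ""        $mountpoints
missing_mountpoints /mnt/root $mountpoints
```

If the nested paths show up as missing only in the first listing, the nested filesystems lack a mount point inside the initrd itself.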

Cheers,

EE

UPDATE: since udev only really cares that /usr and /var are premounted, I dropped the rest from the init script, reverted their lines in /etc/fstab back to allow automounting and checking, and rebooted. hurrah! I have a root prompt, and can emerge stuff!

… and now I'm back to a gui. My beloved has been returned to me! omg omg omg

I'll keep you posted, but it looks like everything is in order. Only wonky thing I saw on boot was some errors about nonexistent /dev/vg nodes for some partitions, but they mounted anyway.

holy crap, this nightmare might finally be over. <3

PS: do I need to re-run the gen_initramfs_list.sh script each time I install a new kernel?
Back to top
View user's profile Send private message
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 29963
Location: 56N 3W

Posted: Thu May 03, 2012 7:34 pm

ExecutorElassus,

I was about to post to say only mount /usr and /var but went for a beer instead.
When I got back, you had already done it.

As the initrd does not contain anything kernel specific, there is no need to remake it for every kernel.
Indeed, it uses random binaries from your system; that's a good reason not to remake it unless you really need to.

If there are security updates for the packages in the initrd, do you care?
They cannot be exploited remotely as networking is not started until the initrd has done its thing and been discarded.

When you do update it, give the new initrd a new name. You really don't want to overwrite your only working initrd with a broken one, just like your kernel.
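
A minimal sketch of picking a fresh name so that an existing initrd is never overwritten. new_initrd_path and the numbered naming scheme are made up for illustration; any scheme that avoids existing files would do.

```shell
# Hypothetical helper: $1 = directory (e.g. /boot), $2 = base name.
# Prints the first "<dir>/<base>-<n>.cpio.gz" that does not exist yet.
new_initrd_path() {
    dir=$1
    base=$2
    n=1
    while [ -e "$dir/$base-$n.cpio.gz" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$dir/$base-$n.cpio.gz"
}
```

The result could be used as the -o argument of the gen_initramfs_list.sh command from the guide, for example `-o "$(new_initrd_path /boot initrd)"`, with a new bootloader entry pointing at it while the old entry keeps the known-good initrd.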

All the initrd really does is to appease >=udev-182 by mounting /usr and /var before the real init script starts udev.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
ExecutorElassus
l33t
l33t


Joined: 11 Mar 2004
Posts: 697
Location: Stuttgart, Germany

Posted: Fri May 04, 2012 1:25 am

All right, then this will be initrd-1.0. Hurrah for a feature-complete (ie, it boots!) release! Out of Beta and releasing on time, etc etc.

Well, if I could buy you a beer, I would. Thanks for all the (very patient) help you've provided. I'm (mostly) sure the system - at least as far as having a working lvm for recent udev releases - is functional. I'll mark this as solved. (now I just have to figure out why boinc, kdepimlibs, kgpg, and gnome-settings-daemon won't emerge, but that's more a portage/programming question).

Cheers, mate.

EE