wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Jul 22, 2011 4:50 pm Post subject: md devices being assembled before multipath [SOLVED] |
I've recently installed a SAS2 disk array, and I'm having some issues bringing it up (correctly) on boot.
The root filesystem is on a RAID1 (motherboard SATA). The new storage array is in a JBOD attached to two LSI HBAs. multipath is used to map the two paths to one device, those multipath devices are assembled into four 9-disk RAID6 volumes which are part of an LVM2 volume group. A single logical volume exists in this volume group and is formatted with XFS.
This works fine when assembled manually but doesn't come up correctly when rebooting. The problem is that the md devices start before the multipath devices are created; AFAICT multipath fails because the devices are already in use by md.
I've turned off md autodetect in the kernel, and I edited the "before" line in /etc/init.d/multipath to include mdraid. The root device is assembled with a "md=127,/dev/sda1,/dev/sdb1" kernel parameter in grub.conf.
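The "before" edit itself isn't shown; on OpenRC it lives in the depend() function of the init script. A sketch of what the modified /etc/init.d/multipath block might look like (the stock script's depend() contents vary by package version, so treat this as illustrative rather than the actual file):

```shell
# /etc/init.d/multipath (sketch, not the stock file): depend() controls
# service ordering; adding mdraid to "before" makes multipath start first
depend() {
    before checkfs fsck mdraid
}
```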
You can see that the md devices are already assembled when /etc/init.d/mdraid is run. Here's an excerpt from /var/log/rc.log: Code: | rc boot logging started at Fri Jul 22 19:50:01 2011
* Setting system clock using the hardware clock [UTC] ... [ ok ]
* Loading module dm_multipath ... [ ok ]
* Autoloaded 1 module(s)
* Activating Multipath devices ... [ ok ]
* Starting up RAID devices ...
mdadm: /dev/md10 is already in use.
mdadm: /dev/md11 is already in use.
mdadm: /dev/md12 is already in use.
mdadm: /dev/md/126 is already in use.
[ !! ]
* Setting up the Logical Volume Manager ... [ ok ]
* Checking local filesystems ...
/sbin/fsck.xfs: XFS file system.
/sbin/fsck.xfs: UUID=a5c7abf9-d2bc-4d30-bf08-df08215c48c1 does not exist
/sbin/fsck.xfs: XFS file system.
* Operational error
[ !! ]
(...continues) |
In dmesg I can see "md: bind<sd*>" lines interspersed between the SCSI discoveries. Excerpt: Code: | [ 17.230502] scsi 6:0:26:0: Direct-Access SEAGATE ST32000444SS 0006 PQ: 0 ANSI: 5
[ 17.230508] scsi 6:0:26:0: SSP: handle(0x0025), sas_addr(0x5000c50033f6f6ce), phy(14), device_name(0x00c50050cef6f633)
[ 17.230511] scsi 6:0:26:0: SSP: enclosure_logical_id(0x500304800000007f), slot(6)
[ 17.230515] scsi 6:0:26:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[ 17.231283] sd 6:0:23:0: [sdax] Attached SCSI disk
[ 17.231454] sd 6:0:25:0: [sdaz] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 17.232243] sday: unknown partition table
[ 17.232835] sd 6:0:26:0: [sdba] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[ 17.233159] sd 6:0:26:0: Attached scsi generic sg54 type 0
[ 17.234782] sd 6:0:26:0: [sdba] Write Protect is off
[ 17.234785] sd 6:0:26:0: [sdba] Mode Sense: d7 00 10 08
[ 17.235817] md: bind<sdi>
|
Later in the dmesg output md assembles its devices despite having autoassemble turned off both in the kernel and on the kernel boot line. What gives?
Last edited by wildbug on Fri Mar 01, 2013 5:39 pm; edited 1 time in total |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Jul 22, 2011 5:28 pm Post subject: |
udev's doing this, isn't it?
Code: | # rc-update show sysinit
devfs | sysinit
dmesg | sysinit
udev | sysinit
# find /lib64/udev -name "*md*"
/lib64/udev/rules.d/64-md-raid.rules |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54236 Location: 56N 3W
Posted: Fri Jul 22, 2011 5:34 pm Post subject: |
wildbug,
Kernel auto-assembly works with RAID superblock version 0.90 only. That hasn't been the default for about six months now.
That change has caused issues for a lot of people who were expecting RAID auto-assembly to just work.
What version RAID superblocks do you have?
Try Code: | $ sudo /sbin/mdadm -E /dev/sda1
Password:
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : 9392926d:64086e7a:86638283:4138a597
Creation Time : Sat Apr 11 16:34:40 2009
Raid Level : raid1
[snip] | on one of the volumes that you donate to the RAID. I would expect your version to be 1.2, in which case your RAID sets must be being assembled by mdadm somewhere.
Kernel auto assembly looks like this in dmesg
Code: | [ 2.380529] md: Waiting for all devices to be available before autodetect
[ 2.380874] md: If you don't use raid, use raid=noautodetect
[ 2.381485] md: Autodetecting RAID arrays.
[ 2.503208] md: Scanned 12 and added 12 devices.
[ 2.504567] md: autorun ...
[snip] |
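If there are many member devices, the Version lines can be filtered out rather than reading one full report at a time. A sketch, shown here against a captured report so it is self-contained (on the real machine you would feed it the output of mdadm -E for each member; the device name and values are taken from the example above):

```shell
# Abridged 'mdadm -E' report for one member, captured as text;
# on a live system you would use: report="$(mdadm -E /dev/sda1)"
report='/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
     Raid Level : raid1'

# Print only the superblock version
printf '%s\n' "$report" | awk -F' : ' '/Version/ {print $2}'
# prints: 0.90.00
```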
_________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Jul 22, 2011 6:03 pm Post subject: |
Neddy, thanks for the reply, but despite my erroneous reply (and subsequent retraction) in the other thread, I realize that autoassemble only works for 0.90 superblocks. In fact it was my experience described above that made me think that autoassemble was working despite non-0.90 superblocks. However, the root device DOES have v0.90 superblocks and has been autoassembling correctly for months now. But with the recent addition of devices that require multipath to be active before md, I've intentionally turned off autodetect and just added a kernel parameter to assemble the root device (as detailed in my OP).
I'm not trying to get arrays to autoassemble; I'm trying to STOP them from being assembled before multipath is activated. I turned off RAID autodetect in the kernel (and added the redundant "raid=noautodetect" kernel parameter, just in case); non-root arrays are being assembled between kernel boot and the boot runlevel, which is why I'm now wondering if udev is responsible.
(For the record, my root device is 0.90 and the other md devices are a mixture of 1.1 and 1.2. Not that it's relevant.)
Here's the root device being correctly assembled (from dmesg): Code: | [ 6.797329] md: Skipping autodetection of RAID arrays. (raid=autodetect will force)
[ 6.797975] md: Loading md127: /dev/sda1
[ 6.798894] md: bind<sda1>
[ 6.799383] md: bind<sdb1>
[ 6.800027] bio: create slab <bio-1> at 1
[ 6.800490] md/raid1:md127: active with 2 out of 2 mirrors
[ 6.800847] md127: detected capacity change from 0 to 224063848448
[ 6.801501] md127: unknown partition table
[ 6.803088] md127: unknown partition table
[ 6.829729] XFS (md127): Mounting Filesystem
[ 6.845067] usb 3-1: new low speed USB device number 2 using ohci_hcd
[ 6.859377] XFS (md127): Ending clean mount
[ 6.859741] VFS: Mounted root (xfs filesystem) readonly on device 9:127.
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54236 Location: 56N 3W
Posted: Fri Jul 22, 2011 7:15 pm Post subject: |
wildbug,
We have established that it's not the kernel auto-assembling your 1.1/1.2 RAID sets, which is a step in the right direction.
What does rc-update show produce?
mdadm should not be listed, or it will be started in its sequence by the startup scripts.
Just because a service is not listed in rc-update show does not mean it is not running. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Jul 22, 2011 7:43 pm Post subject: |
Code: | # rc-update show
bootmisc | boot
consolefont | boot
consolekit | default
cupsd | default
dbus | default
devfs | sysinit
device-mapper | boot
dmesg | sysinit
fsck | boot
hostname | boot
hwclock | boot
keymaps | boot
killprocs | shutdown
local | default nonetwork
localmount | boot
lvm | boot
mdadm | default
mdraid | boot
modules | boot
mount-ro | shutdown
mtab | boot
multipath | boot
multipathd | default
net.eth0 | default
net.eth1 | default
net.lo | boot
netmount | default
nfs | default
nfsmount | default
ntpd | boot
pbs_mom | default
pbs_sched | default
pbs_server | default
procfs | boot
root | boot
savecache | shutdown
sshd | default
swap | boot
sysctl | boot
termencoding | boot
udev | sysinit
udev-postmount | default
upsd | default
upsdrv | default
upsmon | default
urandom | boot
vgl | default
xdm | default
xinetd | default |
FYI, /etc/init.d/mdadm is the monitoring daemon; there is no assembly. /etc/init.d/mdraid is the one containing "mdadm -As".
But look at my OP again, specifically the first code block. That's output from the rc_logger for the boot runlevel. You can see the order of services being executed -- hwclock, modules, multipath, mdraid, etc. When it hits mdraid, it declares that the md devices are already assembled. That means that something is putting them together AFTER kernel autodetect and BEFORE mdraid.
When I first posted this, I didn't realize there was a sysinit runlevel before boot. There are three services in sysinit -- devfs, dmesg, and udev. It's possible that udev or devfs is triggering the md devices (which is what I was getting at in my second post). I'm not intimately familiar with either of those. |
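For what it's worth, the boot-runlevel ordering can be pulled straight out of the asterisk lines in rc.log, which confirms that multipath is scheduled before mdraid there and that the premature assembly must therefore happen earlier (i.e. in sysinit). A sketch over a captured fragment of the log quoted in the first post; on the real machine you would read /var/log/rc.log directly:

```shell
# Fragment of /var/log/rc.log as quoted earlier in the thread
rclog='* Setting system clock using the hardware clock [UTC] ... [ ok ]
* Loading module dm_multipath ... [ ok ]
* Activating Multipath devices ... [ ok ]
* Starting up RAID devices ...'

# Strip the "* " markers and trailing status to list the startup steps in order
printf '%s\n' "$rclog" | sed -e 's/^\* //' -e 's/ \.\.\..*//'
```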
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Jul 22, 2011 8:19 pm Post subject: |
Here's my complete dmesg: http://pastebin.com/raw.php?i=QGq7wit4
Here's an excerpt of the array assembly timeline: Code: | [ 6.778542] md/raid1:md127: active with 2 out of 2 mirrors
[ 7.618775] udev[2822]: starting version 164
[ 7.772809] md/raid1:md126: active with 2 out of 2 mirrors
[ 8.282715] md/raid:md10: raid level 6 active with 9 out of 9 devices, algorithm 2
[ 8.401377] md/raid:md11: raid level 6 active with 9 out of 9 devices, algorithm 2
[ 17.420690] md/raid:md125: raid level 5 active with 6 out of 6 devices, algorithm 2
[ 17.845425] md/raid:md12: raid level 6 active with 9 out of 9 devices, algorithm 2
[ 18.108825] md/raid:md13: raid level 6 active with 9 out of 9 devices, algorithm 2
[ 20.816985] device-mapper: multipath: version 1.3.0 loaded
[ 21.021549] device-mapper: table: 253:0: multipath: error getting device |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Mon Jul 25, 2011 8:05 pm Post subject: |
Yep, udev is the culprit. The rule /lib/udev/rules.d/64-md-raid.rules (supplied by sys-fs/mdadm) calls "mdadm --incremental" on the device. If I remove that file, md arrays are not automatically assembled.
Now I have to figure out how to fix this in an upgrade-friendly way. I'd like to make that rule ignore disks attached via the HBAs. Could I do that by creating a custom rule in /etc/udev/rules.d, without deleting/editing the "official" /lib/udev/rules.d/64-md-raid.rules? |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54236 Location: 56N 3W
Posted: Mon Jul 25, 2011 8:25 pm Post subject: |
wildbug,
If you create a rule in /etc/udev/rules.d in a file with a lower number than /lib/udev/rules.d/64-md-raid.rules,
say 03-md-raid.rules, that does nothing, it will be run before 64-md-raid.rules and will not be affected by updates either.
It must match the same thing(s) as 64-md-raid.rules does.
udev will trigger, execute your rule that does nothing, then you have full manual control. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Mon Jul 25, 2011 10:26 pm Post subject: |
I think I've solved this. I can't reboot to test right now as I currently have users running some long simulations, but udevadm test seems to produce correct results. I'll mark the thread as solved once I can reboot and confirm.
This is what I did:
The server in question has two LSI 9200-8e HBAs and an onboard SAS controller with the same chipset (LSI2008). As I only want to include devices connected to the HBAs, I used "udevadm info" to find differences between the device trees of drives attached to the motherboard and the HBAs. At one level I found differences -- ATTRS{subsystem_vendor} and ATTRS{subsystem_device}. I could now identify the correct devices.
The next part was overriding the array assembly. I finally realized that there was only one line in 64-md-raid.rules that I had to circumvent:
Code: | ENV{ID_FS_TYPE}=="linux_raid_member", ACTION=="add", RUN+="/sbin/mdadm --incremental $env{DEVNAME}" |
If either ENV{ID_FS_TYPE} or ACTION fails to match, then mdadm isn't executed. ENV variables are settable, so I created my own rule to unset ENV{ID_FS_TYPE} just before 64-md-raid.rules is consulted. It also has to come after /lib/udev/rules.d/60-persistent-storage.rules, as that is where the variable is originally set (as identified from udevadm test output).
Code: |
# /etc/udev/rules.d/63-lsi-9200-8e.rules
KERNEL=="sd*", ACTION=="add", DRIVERS=="mpt2sas", ATTRS{subsystem_vendor}=="0x1000", ATTRS{subsystem_device}=="0x3080", ENV{ID_FS_TYPE}="" |
Using udevadm test on devices not attached to the HBAs, I can see the "mdadm --incremental" line; HBA-attached devices no longer include it. That should mean success. |
dmitryilyin n00b
Joined: 08 Apr 2008 Posts: 27 Location: Netherlands
Posted: Tue Jul 26, 2011 7:09 pm Post subject: |
You'll have more luck with a better server distribution (assuming you are working with a server), something Debian- or RedHat-like.
They use an advanced initramfs and are much better suited for production servers.
Gentoo is good for learning Linux, development and experimenting. |
wildbug n00b
Joined: 07 Oct 2007 Posts: 73
Posted: Fri Mar 01, 2013 5:38 pm Post subject: |
So I finally got around to rebooting...
I think I have this sorted out. Custom udev rules are not necessary.
What happens is that udev runs in the sysinit runlevel; /lib/udev/rules.d/64-md-raid.rules is part of this process, and it uses mdadm in incremental mode to attempt to automatically assemble RAID devices as components are discovered. However, it does respect /etc/mdadm.conf, so that file can be used to control behavior during this step. Whitelisting devices using a DEVICE line wasn't sufficient; it seems only ARRAY lines are affected by this. An AUTO line set to blacklist all arrays, in conjunction with selectively whitelisting arrays in ARRAY lines with "auto=yes", will work. Setting "devices=/dev/mapper/*" in the ARRAY lines was also necessary.
/etc/mdadm.conf
Code: | AUTO -all
DEVICE /dev/sd[ab]*
DEVICE /dev/mapper/*
ARRAY /dev/md10 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx devices=/dev/mapper/* spares=1 spare-group=mp_spares
ARRAY /dev/md11 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx devices=/dev/mapper/* spares=1 spare-group=mp_spares
ARRAY /dev/md/root UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx auto=yes |
The dm-multipath kernel module must be loaded (if you've built it as a module) or /etc/init.d/multipath will fail to start.
/etc/conf.d/modules
Code: | modules="dm_multipath" |
And mdraid, which will be used to bring up the arrays listed in mdadm.conf, needs to be started after multipath.
/etc/conf.d/mdraid
Code: | rc_need="multipath" |