ahferroin7 n00b
Joined: 26 Nov 2014 Posts: 10
Posted: Wed Jul 06, 2016 2:56 pm Post subject: Issues with device-mapper in initramfs during boot.
For the past few months, I've been having persistent issues with something device-mapper related during boot. At some point during early boot in the initramfs (I've narrowed this down to somewhere between udev starting and it finishing processing all the events), a couple of dm-linear mappings sometimes spontaneously appear, mapping whole disks to devices of the form /dev/mapper/${DISK_MODEL}_${DISK_SERIAL}, where the disk model and serial are exactly as reported by smartctl and similar tools. Because the device-mapper module correctly marks the underlying disks as in-use, everything else refuses to touch them, which means that LVM processing of the PVs on those disks fails and a bunch of LVs come up degraded or missing. In turn, my home server always requires manual intervention to finish booting, and my laptop sometimes refuses to boot at all (it has only one disk, so when this happens, it hangs during boot).
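For reference, the manual intervention amounts to tearing the stray mappings down from the emergency shell and re-activating LVM. A rough sketch (the mapping name in the comment is made up; the real ones follow the model/serial pattern above):

```shell
#!/bin/sh
# Sketch of the manual recovery from the emergency shell:
# inspect the stray whole-disk mappings, remove them, retry LVM.
# Guarded so it is a no-op on systems without the tools installed.
if command -v dmsetup >/dev/null 2>&1; then
    dmsetup ls 2>/dev/null       # list current mappings by name
    dmsetup table 2>/dev/null    # show what each mapping maps
    # Removal + re-activation (name is illustrative, so left commented):
    # dmsetup remove SomeModel_SomeSerial
    # vgchange -ay
fi
status="recovery sketch done"
echo "$status"
```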
A bit of further background on the systems I'm seeing this on:
* I'm running a semi-custom kernel based on the mainline stable kernel (sources can be found here: http://github.com/Ferroin/linux; it's currently about 6 functional patches on top of mainline 4.6.3). I've not tested with a Gentoo kernel, but I have tested plain upstream kernels (without my patches) and still see this happening. I'm pretty convinced at this point that the kernel itself has nothing to do with it: booting without an initramfs never triggers it, and it happens with the exact same frequency regardless of kernel version.
* I'm using Dracut for the initramfs generation (which is indirectly why I've been able to figure out the time window in which this happens).
* I'm using regular sysv-init and OpenRC for init.
* I'm using a current ~amd64 version of eudev.
* Both systems have / on an LVM volume, and I'm not using the device mapper outside of LVM (dmtab is empty, both on / and in the initramfs). I still see this issue if I boot to a recovery partition that's not on LVM and tell the initramfs to not do any processing of LVM, DM, LUKS, or MD devices.
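In case anyone asks, this is roughly how I checked that the copy of dmtab inside the generated initramfs really is empty (the image path is illustrative; lsinitrd ships with dracut):

```shell
#!/bin/sh
# Verify the etc/dmtab inside the initramfs image is empty.
# The image path below is illustrative; adjust for your layout.
image="/boot/initramfs-$(uname -r).img"
if command -v lsinitrd >/dev/null 2>&1 && [ -f "$image" ]; then
    # Print the initramfs copy of etc/dmtab; no output == empty file.
    lsinitrd -f etc/dmtab "$image"
else
    echo "skipping: lsinitrd or image not available here"
fi
status="dmtab check done"
echo "$status"
```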
One of the two affected systems is a laptop with a single SATA-connected SSD; the other is a custom-built home server with 2 SSDs and 4 HDDs, all on a single SATA controller. As noted above, I've figured out (via various rd.break options) that the mappings get created after udev starts but before the post-trigger processing begins: they're never present when inspecting from rd.break=pre-udev, sometimes present from rd.break=pre-trigger, and always present from rd.break=pre-initqueue. Which disks get mapped seems to be random, but on the home server it appears to favor the HDDs over the SSDs (and it happens on the laptop with about the same frequency as on the server's SSDs).
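For anyone wanting to reproduce the narrowing-down: what I run from the dracut emergency shell at each rd.break point is essentially just this:

```shell
#!/bin/sh
# Inspection run from the dracut emergency shell at each break point
# (rd.break=pre-udev / pre-trigger / pre-initqueue) to see whether
# the stray mappings exist yet at that stage.
ls /dev/mapper 2>/dev/null || echo "no /dev/mapper here"
if command -v dmsetup >/dev/null 2>&1; then
    # The stray entries show up as whole-disk linear targets.
    dmsetup table 2>/dev/null
fi
status="inspection done"
echo "$status"
```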
I think there's a race condition somewhere that's making this more common with HDDs than SSDs, but I still have no idea what the root cause is. I'm hoping someone else might have some idea what's going on here. I'm happy to provide more info if needed.
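If it would help, here's the kind of uevent capture I could grab next time it happens, to see which device events land in the suspect window (the log path and the sleep are placeholders for the real boot timing):

```shell
#!/bin/sh
# Sketch: log uevents (with properties) during the window between
# udevd starting and the trigger finishing. The 1-second window and
# the log path are stand-ins for the real boot-time hook.
if command -v udevadm >/dev/null 2>&1; then
    udevadm monitor --kernel --udev --property > /tmp/uevents.log 2>&1 &
    pid=$!
    sleep 1
    kill "$pid" 2>/dev/null
fi
status="monitor sketch done"
echo "$status"
```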