dbishop Tux's lil' helper

Joined: 08 Dec 2007 Posts: 107
Posted: Mon Jul 02, 2012 11:30 pm Post subject: Hardware RAID5 volume rescanning problem
Hi All,
I have an unusual setup and a requirement to hotswap RAID5 volumes by plugging in already-formatted, known-good external multidisk JBODs via SAS SFF-8088 physical connections -- without rebooting each time. I have several JBOD boxes that have hardware RAID5 sets in them, all created via Areca 1882X cards, and formatted to NTFS (unfortunately it's "a must").
When I hotplug the JBOD's SAS interface into the 1882X for the first time after a fresh boot, whichever RAID5 volume is present gets recognized and can be mounted r/w. (I am using ntfs-3g and FUSE.) I can mount/umount the file system at will, and I can even unplug the SAS cable too -- provided that I plug the exact same RAID5 volume set back in.
And hence the issue: If I unplug the SAS cable and insert a new JBOD box with another RAID5 volume set, the swap-out is recognized by Linux (gentoo-sources 3.3.8 kernel) BUT when I try to mount the file system I get complaints that the RAID5 volume's NTFS signature is missing, and the mount command fails. If I simply reboot and try the volume that just gave the 'missing NTFS signature' error, it mounts just fine and r/w is without incident. Doesn't matter which volume I start with or end with -- I have four such RAID5's I need to hotplug -- it always fails the same way. If I return to the RAID5 volume I used first after (re)boot it will sometimes work again, but who knows why or why not, because I do this after some degree of experimentation to get the desired swapping to work.
I have tried just about every way to rescan the scsi bus (which itself works to a degree -- the various tools cause the proper volume name to show up). lsscsi seems to report normally, partprobe doesn't fix anything (reports no errors either), rescan-scsi-bus command doesn't help, and so on. Even various command line tricks like this are ineffective:
Code:
for i in $(ls /sys/class/scsi_host/); do echo "0 0 0" > /sys/class/scsi_host/$i/scan; done
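For comparison, the wildcard form of that rescan is sketched below. Writing "- - -" to a host's scan node asks it to probe every channel, target and LUN, whereas "0 0 0" only probes channel 0, target 0, LUN 0. The function name and the optional directory argument are my own illustration (the argument only exists so the loop can be dry-run against a scratch directory). I am not claiming this cures the NTFS signature problem.

```shell
#!/bin/sh
# Ask every SCSI host adapter to rescan all channels, targets and LUNs.
# "- - -" is the sysfs wildcard triple (channel, target id, LUN).
# The optional first argument is a hypothetical override of the sysfs
# directory, useful for a dry run; it defaults to the real path.
rescan_all_hosts() {
    scsi_host_root="${1:-/sys/class/scsi_host}"
    for host in "$scsi_host_root"/host*; do
        # Skip when the glob matched nothing or the host has no scan node
        [ -e "$host/scan" ] || continue
        printf '%s\n' '- - -' > "$host/scan"
    done
}
```

Run as root with no argument (rescan_all_hosts) to hit the real /sys/class/scsi_host entries.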
It's really problematic that I have to reboot just to change SAS RAID5 volumes, especially since I have to unplug the SAS cable each time (boot mayhem ensues if I don't). And heaven forbid if I plug the wrong one in after booting...
It seems that there is some genetic baggage that Linux is remembering somewhere and SCSI rescans are not clearing whatever it is out, nor is partprobe. And whatever it is, it seems a consequence of 'mount' rather than what the scsi bus is doing, since it's after the first successful 'mount' that things break. (There are no fstab entries that correspond to the RAID5's.) FUSE is compiled as a module, and it loads on first mount. rmmod'ing the fuse module leads to disaster.
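If the baggage is stale cached blocks or a stale SCSI device entry, one untested idea is to detach the volume explicitly before pulling the cable: flush the kernel's buffers for the device, then remove it through sysfs so the next rescan has to rediscover it from scratch. The sketch below assumes the file system is already unmounted. The function name, the device name in the usage line and the optional sysfs directory argument are all hypothetical. Whether this actually fixes the swap is unverified.

```shell
#!/bin/sh
# Detach one SCSI disk from the kernel before it is physically unplugged.
# $1: kernel device name without /dev/, e.g. sdb (hypothetical example)
# $2: optional override of the sysfs block directory, for dry runs only
detach_disk() {
    dev="$1"
    sys_block="${2:-/sys/block}"
    node="$sys_block/$dev/device/delete"
    if [ ! -e "$node" ]; then
        echo "no such device: $dev" >&2
        return 1
    fi
    # Flush the kernel's cached buffers, but only for a real block device
    if [ -b "/dev/$dev" ]; then
        blockdev --flushbufs "/dev/$dev" || return 1
    fi
    # Writing 1 to the delete node removes the device from the SCSI layer
    printf '1\n' > "$node"
}
```

Usage would be something like: umount the volume, run detach_disk sdb as root, swap the JBOD, then rescan the SCSI hosts.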
Any ideas?
As an aside, I have to plug in the JBOD RAID5's after POST because otherwise I get a kernel panic on boot because grub (or something) tries to find the rootfs on the RAID card. I have not found any combination of BIOS boot order settings that will prevent this from happening. Also this BIOS does not seem to have the option to disable the boot firmware loading on the RAID card, nor does the 1882X seem to have any place to tell it to not permit booting from itself. Any ideas on this would be appreciated too.
Always grateful for any help.