RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Fri Dec 07, 2012 8:20 am Post subject: RAID6: emergency help?
I created a raid6 array from six drives and copied all my old data onto it. I still have the old drives but ... setting them up and copying them over would be a rather large task.
The first time I booted the machine with the new raid6 array it worked perfectly.
Then I powered it off, put the cover back on, and rebooted, and one of the drives became faulty.
Then, while I was trying to re-add that drive so the array would rebuild, another drive went faulty.
Then one more mdadm -A and another drive went away, and now it looks like all the data is lost.
What am I doing wrong?
Here's what mdadm currently says:
Code: | server ~ # mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Array Size : 11720534016 (11177.57 GiB 12001.83 GB)
Used Dev Size : 2930133504 (2794.39 GiB 3000.46 GB)
Raid Devices : 6
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Fri Dec 7 00:10:39 2012
State : clean, FAILED
Active Devices : 3
Working Devices : 4
Failed Devices : 1
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : SparePC:soulstorage
UUID : bfc07787:5075d763:c70b65ac:687f7544
Events : 61
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
3 0 0 3 removed
4 0 0 4 removed
5 0 0 5 removed
3 8 97 - faulty spare /dev/sdg1
6 8 81 - spare /dev/sdf1
|
Here's what it said after the previous failure:
Code: | server ~ # mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Used Dev Size : -1
Raid Devices : 6
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Fri Dec 7 00:04:04 2012
State : active, degraded, Not Started
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : SparePC:soulstorage
UUID : bfc07787:5075d763:c70b65ac:687f7544
Events : 56
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
3 8 97 3 active sync /dev/sdg1
4 0 0 4 removed
5 0 0 5 removed
6 8 81 - spare /dev/sdf1
|
It is interesting to note that the failed drives are all connected to a RAID card (with its hardware RAID disabled) that had, until now, been working fine.
Can someone help me? I have no idea what's failing or why...
Update: It looks like they are hard errors....
Code: | sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 264200
sd 9:0:0:0: [sdh] Unhandled sense code
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 265224
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg] CDB:
end_request: I/O error, dev sdg, sector 264200
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdf1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh]
sd 9:0:0:0: [sdh] CDB:
end_request: I/O error, dev sdh, sector 265224
md/raid:md127: read error NOT corrected!! (sector 263176 on sdh1).
md/raid:md127: Disk failure on sdh1, disabling device.
md/raid:md127: read error not correctable (sector 263184 on sdh1).
md/raid:md127: read error not correctable (sector 263192 on sdh1).
md/raid:md127: read error not correctable (sector 263200 on sdh1).
md/raid:md127: read error not correctable (sector 263208 on sdh1).
md/raid:md127: read error not correctable (sector 263216 on sdh1).
md/raid:md127: read error not correctable (sector 263224 on sdh1).
md/raid:md127: read error not correctable (sector 263232 on sdh1).
md/raid:md127: read error not correctable (sector 263240 on sdh1).
md/raid:md127: read error not correctable (sector 263248 on sdh1).
md/raid:md127: read error not correctable (sector 263256 on sdh1).
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
disk 4, o:0, dev:sdh1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
nfsd: last server has exited, flushing export cache
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdb1>
md: export_rdev(sdb1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: kicking non-fresh sdh1 from array!
md: unbind<sdh1>
md: export_rdev(sdh1)
md/raid:md127: device sdb1 operational as raid disk 0
md/raid:md127: device sdg1 operational as raid disk 3
md/raid:md127: device sdd1 operational as raid disk 2
md/raid:md127: device sdc1 operational as raid disk 1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: kicking non-fresh sdh1 from array!
md: unbind<sdh1>
md: export_rdev(sdh1)
md/raid:md127: device sdb1 operational as raid disk 0
md/raid:md127: device sdg1 operational as raid disk 3
md/raid:md127: device sdd1 operational as raid disk 2
md/raid:md127: device sdc1 operational as raid disk 1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdf1
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg]
sd 8:0:0:0: [sdg] CDB:
end_request: I/O error, dev sdg, sector 264192
md/raid:md127: read error not correctable (sector 262144 on sdg1).
md/raid:md127: Disk failure on sdg1, disabling device.
md/raid:md127: read error not correctable (sector 262152 on sdg1).
md/raid:md127: read error not correctable (sector 262160 on sdg1).
md/raid:md127: read error not correctable (sector 262168 on sdg1).
md/raid:md127: read error not correctable (sector 262176 on sdg1).
md/raid:md127: read error not correctable (sector 262184 on sdg1).
md/raid:md127: read error not correctable (sector 262192 on sdg1).
md/raid:md127: read error not correctable (sector 262200 on sdg1).
md/raid:md127: read error not correctable (sector 262208 on sdg1).
md/raid:md127: read error not correctable (sector 262216 on sdg1).
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:0, dev:sdg1
disk 4, o:1, dev:sdf1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:0, dev:sdg1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
disk 3, o:0, dev:sdg1
disk 0, o:1, dev:sdb1
disk 1, o:1, dev:sdc1
disk 2, o:1, dev:sdd1
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
md: bind<sdc1>
md: bind<sdd1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdf1>
md: bind<sdb1>
md: unbind<sdb1>
md: export_rdev(sdb1)
md: unbind<sdf1>
md: export_rdev(sdf1)
md: unbind<sdh1>
md: export_rdev(sdh1)
md: unbind<sdg1>
md: export_rdev(sdg1)
md: unbind<sdd1>
md: export_rdev(sdd1)
md: unbind<sdc1>
md: export_rdev(sdc1)
|
_________________
Some day there will only be free software.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54258 Location: 56N 3W
Posted: Fri Dec 07, 2012 10:05 pm
RayDude,
Don't do anything that may involve writes. Post the output of Code: | mdadm -E /dev/sd[abcdef]1 |
What you hope to find is four members of the set with the same event count so you can assemble the raid in degraded mode.
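To compare event counts at a glance, the relevant lines can be pulled out of the examine output. A minimal sketch, run here against stand-in text rather than a live `mdadm -E` (the sample values echo the numbers seen later in this thread):

```shell
# Stand-in for `mdadm -E /dev/sd[bcdefg]1` output; only the lines that
# matter for this comparison are included.
sample='/dev/sdb1:
         Events : 63
/dev/sdc1:
         Events : 63
/dev/sdg1:
         Events : 57'

# Remember the last device header seen, then print it next to its Events count.
pairs=$(printf '%s\n' "$sample" | awk '
  /^\/dev\// { dev = $1; sub(":", "", dev) }
  /Events/   { print dev, $NF }')
printf '%s\n' "$pairs"
```

Against the live system the same idea is just `mdadm -E /dev/sd[bcdefg]1 | grep -E '^/dev/|Events'`; members whose counts agree can be assembled together.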
I've just been through this with my 4 spindle raid5.
It's also worth installing smartmontools and looking at the drives' internal error logs.
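Something like the following would read those logs with smartmontools; the commands are built as strings here rather than executed, since this sketch has no real drives behind it, and the device names are placeholders for whichever drives got kicked:

```shell
# -l error : the drive's own internal error log
# -A       : SMART attributes, e.g. Reallocated_Sector_Ct, Current_Pending_Sector
cmds=""
for dev in /dev/sdg /dev/sdh; do
  cmds="$cmds
smartctl -l error $dev
smartctl -A $dev"
done
printf '%s\n' "$cmds"
```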
If you saved dmesg with the error reports that showed why the drives were kicked out of the array, that would be good too.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:
those that do backups
those that have never had a hard drive fail.
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Sat Dec 08, 2012 12:51 am
Thanks Neddy!
I think the RAID controller I used for my external SATA box is not compatible with these drives, because the three drives that failed are all attached to it (two inside, one outside).
It looks like b, c, d, and f are all at the same event count, which means I might be able to recover this. I bought two new dual controllers (DGMS, Fry's sucks) to see if they will work with the drives. I think b, c, and d are okay because they are plugged into the motherboard.
It would be so awesome if I could get the array to rebuild itself. It takes days to copy 8 TB over gigabit.
Here's what mdadm said:
Code: | server ~ # mdadm -E /dev/sd[bcdfgh]1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 6b6e4f45:5d52d5ea:c6ff4d3a:ebf8b515
Update Time : Fri Dec 7 00:10:47 2012
Checksum : ed444042 - correct
Events : 63
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdc1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : dc35e656:81f9e617:e9a6eafe:00cf70d6
Update Time : Fri Dec 7 00:10:47 2012
Checksum : b1442c11 - correct
Events : 63
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 59a13e07:c83416e1:4c96063b:6ca6bbb3
Update Time : Fri Dec 7 00:10:47 2012
Checksum : 443299a5 - correct
Events : 63
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdf1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 374073a6:b52ff21f:01661c22:cc5f2acb
Update Time : Fri Dec 7 00:10:47 2012
Checksum : 20c7bc89 - correct
Events : 63
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdg1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0445633e:5bf226e9:50f860b0:4784965b
Update Time : Fri Dec 7 00:10:36 2012
Checksum : a0a13abe - correct
Events : 57
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAAA. ('A' == active, '.' == missing)
/dev/sdh1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 697e05c1:644d35c8:1cb2b136:97a14f39
Update Time : Thu Dec 6 23:58:39 2012
Checksum : 665ba374 - correct
Events : 48
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 4
Array State : AAAAA. ('A' == active, '.' == missing)
|
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
Posted: Sat Dec 08, 2012 1:37 am
If you have solved the drive failure problem (by hooking the drives up through some other card), and the drives then work reliably, you should be able to reassemble the RAID. By your output, the first four are good (same timestamp and event count), whereas the latter two are out of date. So you should assemble (using --force if you must) with only the first four drives, and then, once the array is up and running, re-add the other two. Since RAID 6 tolerates two drive failures, it will resync. As long as the first drives aren't bad, the sync should succeed and you are back in the game with no additional data loss since the md failure.
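Spelled out as commands, that sequence might look like the sketch below. It is echoed rather than executed, and the device names are only illustrative; match them against your own `mdadm -E` output first. (Also note `--re-add` may fall back to a plain add and a full resync when there is no write-intent bitmap.)

```shell
# Four members that agree on event count, and the two stale devices.
# These names are assumptions for illustration only.
good="/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1"
stale="/dev/sdg1 /dev/sdh1"

assemble="mdadm --assemble --force /dev/md127 $good"
echo "$assemble"

readds=""
for d in $stale; do
  readds="$readds
mdadm --manage /dev/md127 --re-add $d"
done
printf '%s\n' "$readds"
```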
Good luck.
If it does not work out, you may have to resort to your backup after all.
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Sat Dec 08, 2012 5:40 am
Thanks!
I removed the old SATA card and added the two new SIL3132 boards. Unfortunately, only one of them is recognized, and I don't know why. The one plugged into the x16 port is not initialized: no BIOS, no nothing. So I can only see five drives.
I had a four-port PCI SATA RAID card in my hand, but I couldn't remember whether this motherboard had a PCI slot, so I got the PCIe cards...
Now I'm stuck either buying a four-port card from Newegg, because Fry's didn't have any four-port PCIe cards in stock (PCI is probably too slow anyway), or biting the bullet: assemble with the four good drives and hope there are no write errors until I find a way to hook up the sixth drive.
Man, I wish I'd planned this better. I forgot this motherboard only has four SATA ports.
Update: Well, I let it rebuild the drives and the array went to active sync.
Then I added the last drive and it's currently rebuilding. So I'll have one redundant drive until I find a solution that gives me four more SATA ports.
Any suggestions for a good card that's less than a hundred?
Thanks again guys!
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54258 Location: 56N 3W
Posted: Sat Dec 08, 2012 5:45 pm
RayDude,
Write errors are actually fairly safe: the drive will realise the write failed and reallocate the failed sector.
It's read errors that are the problem. When a drive has trouble with a read but it still succeeds, the data will be moved to a spare sector.
When a read fails outright, the data is lost and can't be moved. That's a fairly simplistic explanation anyway.
Mad Merlin Veteran
Joined: 09 May 2005 Posts: 1155
Posted: Mon Dec 10, 2012 3:46 am
RayDude wrote: | Any suggestions for a good card that's less than a hundred? |
As far as I know, such a thing doesn't exist. You can grab an LSI 9211-4i for ~$200 or a 9211-8i for ~$250, which are 4-port (1x SFF-8087) and 8-port (2x SFF-8087), respectively. These are bare-bones HBA cards meant for passing the disks through to the host OS rather than doing any RAID themselves, and they work well.
You can often find rebrands and/or used cards for less, have a look at this list, the 9211 uses the SAS2008 chipset.
I can't really recommend anything less expensive than that; I've tried a couple of cheaper cards and been burned more than once.
_________________
Game! - Where the stick is mightier than the sword!
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
Posted: Mon Dec 10, 2012 1:24 pm
I'm not sure how expensive they are in $, but the Lian Li IB-01 or Dawicontrol DC-624e are around 80€, so they shouldn't be over $100.
The Lian Li is just a port multiplier, though, and the Dawicontrol may need some extras ( http://theangryangel.co.uk/blog/marvell-88se9172-sata3-under-linux-as-of-320 ), though it may work out of the box in newer kernels.
Since you mentioned you used internal and external ports on your card: there are lots of cards where the external port is shared, i.e. you can use either the external or the internal connector, but not both at the same time.
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Thu Dec 13, 2012 12:44 am
mdadm help again: I bought a cheap Marvell-based RAID III card from Newegg.
It worked from the start, and drive six of the array started rebuilding.
At some point, after about 20 hours, the new controller died and took three hard drives with it... again...
I guess the old adage "you get what you pay for" applies here.
Anywho, I bought a PCI RAID card, SIL3124-based, and it's up and running. All six drives read as clean, but three of them have smaller event counts. How do I rebuild this with minimal damage? Is it possible?
Code: | /dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 6b6e4f45:5d52d5ea:c6ff4d3a:ebf8b515
Update Time : Wed Dec 12 16:23:17 2012
Checksum : ed4c7e51 - correct
Events : 49889
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdc1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : dc35e656:81f9e617:e9a6eafe:00cf70d6
Update Time : Wed Dec 12 16:23:17 2012
Checksum : b14c6a20 - correct
Events : 49889
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 1
Array State : AAA... ('A' == active, '.' == missing)
/dev/sdd1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 59a13e07:c83416e1:4c96063b:6ca6bbb3
Update Time : Wed Dec 12 16:23:17 2012
Checksum : 443ad7b4 - correct
Events : 49889
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : AAA... ('A' == active, '.' == missing)
/dev/sde1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0445633e:5bf226e9:50f860b0:4784965b
Update Time : Wed Dec 12 12:34:15 2012
Checksum : a0b11323 - correct
Events : 49876
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdf1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x2
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Recovery Offset : 4761467048 sectors
State : clean
Device UUID : b74d0c97:082efe8c:f85ebd68:2e5d8734
Update Time : Wed Dec 12 12:34:15 2012
Checksum : 4a4bffb9 - correct
Events : 49876
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 5
Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdg1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bfc07787:5075d763:c70b65ac:687f7544
Name : SparePC:soulstorage
Creation Time : Thu Nov 29 09:03:33 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
Array Size : 23441068032 (11177.57 GiB 12001.83 GB)
Used Dev Size : 5860267008 (2794.39 GiB 3000.46 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 45527180:c644c76d:c1b4c0f2:d2f7ed4a
Update Time : Wed Dec 12 12:34:15 2012
Checksum : 9915d2c6 - correct
Events : 49876
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 4
Array State : AAAAAA ('A' == active, '.' == missing) |
_________________ Some day there will only be free software. |
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Thu Dec 13, 2012 5:31 am
Update: I just typed 'mdadm -A --force /dev/md127' and it assembled and began rebuilding disk 6 again.
Then I ran fsck.ext4 on /dev/md127 and let it delete a few bad inodes.
Unfortunately, as I suspected, the PCI card is too slow, and the rebuild of drive 6 hasn't moved a percent in several hours.
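(A tangent that won't fix a saturated PCI bus, but is worth ruling out first: md throttles resync between two sysctl limits, and a low floor makes a rebuild crawl whenever other I/O touches the array. Sketched as command strings rather than executed; the 50000 KB/s value is only an example.)

```shell
# Progress is visible in /proc/mdstat; the floor/ceiling are real sysctls,
# expressed in KB/s. Raising the floor prioritizes resync over other I/O.
checks="cat /proc/mdstat
sysctl dev.raid.speed_limit_min
sysctl -w dev.raid.speed_limit_min=50000"
printf '%s\n' "$checks"
```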
I'm still without a solution.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54258 Location: 56N 3W
Posted: Thu Dec 13, 2012 9:41 pm
RayDude,
If all your raid6 was doing was rebuilding - i.e. you were not writing anything to it - you might be lucky.
If there were files open for writing when your three drives went offline, you can expect those files to be in a mess.
If directory writes were in progress, the contents of those directories may be lost.
This wiki article works. The advantage of --create in degraded mode over --force is that you can try all the combinations to see if one degraded combination is better than another.
At first sight there is no reason why it should be, but the drives are not all written concurrently, so a failure such as you had will leave the drives in slightly different states. Degraded mode for you means four out of six drives. You need to think carefully before you mount the filesystem, even read-only, as journal replays and the resulting writes will still happen. I think you can avoid the journal replay if you want, but I don't know how.
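On avoiding the journal replay: for ext3/ext4 the `noload` mount option skips journal recovery, so a read-only inspection really does stay read-only. A sketch with the command built as a string (the mount point is a placeholder):

```shell
# ro alone still replays the journal on a dirty filesystem; ro,noload skips
# the replay entirely, at the cost of possibly seeing slightly stale metadata.
inspect="mount -o ro,noload /dev/md127 /mnt/inspect"
echo "$inspect"
```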
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
Posted: Thu Dec 13, 2012 10:17 pm
--create is also dangerous, though; if you get the command wrong, that's bye-bye to your data. Your top priority is the hardware issue, though. It just won't do, having three drives vanish in one go. If you have this many controllers failing, maybe you should check for short circuits in your PSU/case, or check any other cables for that matter. I've used cheap controllers myself and never had a failure, so your problems seem fishy to me somehow.
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Fri Dec 14, 2012 3:51 am
Thanks guys.
There were only three bad inodes after the fsck, so the array seems fine.
With the PCI card it's still rebuilding the final drive. It's at 94% and will hopefully be done by morning...
I have ordered an LSI Logic RAID card from Amazon; it will arrive tomorrow. Hopefully it will be reliable in my motherboard. Since I'll likely be able to boot my SSD off it, I'll connect four drives to the motherboard; that way, if I have problems, the most I will lose is two drives.
This sure has been an experience though, wow.
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Sat Dec 15, 2012 7:19 am
Just venting...
I bought a SAS controller from LSI.
It didn't appear to support 3 TB drives, so I upgraded the firmware from a Kubuntu boot USB drive.
It still doesn't appear to support 3 TB drives.
What do I have to do to get a working four-port SATA card?
*exasperated*
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54258 Location: 56N 3W
Posted: Sat Dec 15, 2012 1:15 pm
RayDude,
How do you mean RayDude wrote: | didn't appear to support 3TB drives | ?
What happens when you connect a 3 TB drive? The only difference is 48-bit LBA or not, and "or not" means you max out at 137 GB.
A lot of bolt-on goodies claim a 2 TB limit so that Windows users with MSDOS partition tables are not surprised when they hit it, but that is not hardware related.
Try it; I will be surprised if it doesn't 'just work'.
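The 137 GB and 2 TB figures mentioned here fall straight out of sector arithmetic, assuming 512-byte sectors:

```shell
# 28-bit LBA: the pre-48-bit ATA address width.
lba28=$(( (1 << 28) * 512 ))
# MSDOS/MBR partition tables: 32-bit sector fields.
mbr=$(( (1 << 32) * 512 ))
echo "28-bit LBA limit: $lba28 bytes"   # ~137 GB
echo "MBR limit:        $mbr bytes"     # ~2.2 TB
```

So a 3 TB drive needs 48-bit LBA and GPT end to end; either limitation alone still presents a truncated disk.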
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Sun Dec 16, 2012 4:30 am
Thanks Neddy.
I already have working 3 TB GPT partitions on the drives. When I connect them to the LSI card, they are reported as 2048 MB and the GPT partitions are not present when the machine is booted.
I've found some forums complaining about this problem with LSI, but I haven't found any solutions. I've emailed tech support.
I should point out that I removed the BD-ROM from the PCI RAID card and the RAID performance doubled to 135 MB/s... Pretty interesting.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54258 Location: 56N 3W
Posted: Sun Dec 16, 2012 2:22 pm
RayDude,
GPT keeps two copies of the partition table, one at the start of the drive and one at the end, and it gets really upset if the two don't match.
With a 2048 MB limit, the copy at the end of the drive can't be read.
dmesg probably has errors about attempting to read beyond the end of the device.
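A sketch of how one might confirm that diagnosis; the commands are built as strings here rather than executed, and the device name is a placeholder:

```shell
dev=/dev/sdb   # placeholder device
# sgdisk --verify checks the primary GPT header against the backup at the
# end of the disk; the dmesg grep looks for the kernel's truncated-read error.
checks="sgdisk --verify $dev
dmesg | grep -i 'attempt to access beyond end of device'"
printf '%s\n' "$checks"
```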
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Sun Dec 16, 2012 4:00 pm
Thanks, I'm sure that's why the partition table seems empty when I attempt to look at it. Now it's up to LSI.
I guess I'll have to break down and buy another RAID card. The reviews on the Promise and HighPoint cards look bad. I might just buy another Silicon Image board, but PCI Express this time instead of PCI... I don't want to, because the one I used for several years died...
RayDude Advocate
Joined: 29 May 2004 Posts: 2066 Location: San Jose, CA
Posted: Tue Jan 01, 2013 8:45 pm
Final update: My older card would not recognize 3 TB drives even with the IT firmware. I sent it back and ordered a Marvell SAS card. It works okay, but the performance is kind of random. For example, the SSD plugged into the mvsas card is about half as fast as it was on the motherboard. But it wasn't always this slow... I'm having trouble figuring out how to get full performance out of it.