Gentoo Forums
[SOLVED] linux raid failure
Vieri
Guru


Joined: 18 Dec 2005
Posts: 490

Posted: Sun May 19, 2013 12:37 pm    Post subject: [SOLVED] linux raid failure

Hi,

Before I start messing with mdadm, I'd like to make sure I do the right thing to check an array and eventually rebuild it without data loss.

/proc/mdstat shows the following:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md1 : active raid1 dm-8[1]
244485120 blocks [2/1] [_U]

md0 : active raid1 dm-1[1] dm-0[0]
104320 blocks [2/2] [UU]

unused devices: <none>

mdadm --examine for sda3 shows this:

/dev/sda3:
Magic : a92b4efc
Version : 00.90.00
UUID : fa73549f:fea4b6f2:4ea64123:b2cb5dc4
Creation Time : Sun Mar 26 13:32:20 2006
Raid Level : raid1
Used Dev Size : 244485120 (233.16 GiB 250.35 GB)
Array Size : 244485120 (233.16 GiB 250.35 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1

Update Time : Sun Apr 21 13:13:09 2013
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 47e266eb - correct
Events : 0.39013


      Number   Major   Minor   RaidDevice State
this     0     253        7        0      active sync   /dev/mapper/sda3

   0     0     253        7        0      active sync   /dev/mapper/sda3
   1     1     253        8        1      active sync   /dev/mapper/sdb3

mdadm --examine sdb3:

/dev/sdb3:
Magic : a92b4efc
Version : 00.90.00
UUID : fa73549f:fea4b6f2:4ea64123:b2cb5dc4
Creation Time : Sun Mar 26 13:32:20 2006
Raid Level : raid1
Used Dev Size : 244485120 (233.16 GiB 250.35 GB)
Array Size : 244485120 (233.16 GiB 250.35 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1

Update Time : Sun May 19 14:06:41 2013
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Checksum : 4807ff51 - correct
Events : 0.40352


      Number   Major   Minor   RaidDevice State
this     1     253        8        1      active sync   /dev/mapper/sdb3

   0     0       0        0        0      removed
   1     1     253        8        1      active sync   /dev/mapper/sdb3

dmesg doesn't reveal any errors regarding /dev/sda.

Also, /dev/sda1 seems to be ok:

/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 9845034b:0951e9ea:f7e4b334:39c9a41c
Creation Time : Sun Mar 26 13:21:50 2006
Raid Level : raid1
Used Dev Size : 104320 (101.89 MiB 106.82 MB)
Array Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0

Update Time : Sat Jan 26 18:49:43 2013
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 119cc0dd - correct
Events : 0.242


      Number   Major   Minor   RaidDevice State
this     0     253        0        0      active sync   /dev/mapper/sda1

   0     0     253        0        0      active sync   /dev/mapper/sda1
   1     1     253        1        1      active sync   /dev/mapper/sdb1

as well as /dev/sdb1:

/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 9845034b:0951e9ea:f7e4b334:39c9a41c
Creation Time : Sun Mar 26 13:21:50 2006
Raid Level : raid1
Used Dev Size : 104320 (101.89 MiB 106.82 MB)
Array Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0

Update Time : Sat Jan 26 18:49:43 2013
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 119cc0e0 - correct
Events : 0.242


      Number   Major   Minor   RaidDevice State
this     1     253        1        1      active sync   /dev/mapper/sdb1

   0     0     253        0        0      active sync   /dev/mapper/sda1
   1     1     253        1        1      active sync   /dev/mapper/sdb1

So, should I consider /dev/sda a failed disk and physically replace it before re-creating the array?
Or should I run further tests to determine whether /dev/sda3 is really faulty?
If so, how?

Thanks,

Vieri


Last edited by Vieri on Sun May 19, 2013 4:06 pm; edited 1 time in total
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 40979
Location: 56N 3W

Posted: Sun May 19, 2013 2:05 pm

Vieri,

sda is clearly not stone dead as /dev/sda1 is still a member of md0.

There would have been some valuable information in dmesg about why /dev/sda3 was dropped, but I suppose that's gone now.
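Even without dmesg, the --examine output posted above hints at when the member dropped out: /dev/sda3 was last updated on Apr 21 with Events 0.39013, while /dev/sdb3 carries May 19 and Events 0.40352. A rough sketch of that comparison follows. The function names are made up for illustration, and it assumes the 0.90 metadata display, where the event counter is printed as two dot-separated integers.

```shell
#!/bin/sh
# Print the "Events" value from saved `mdadm --examine` output.
# Usage: examine_events FILE
examine_events() {
    awk -F' : ' '$1 ~ /^[[:space:]]*Events$/ { print $2; exit }' "$1"
}

# Print the name of the member with the older event count, or "none" if equal.
# Usage: stale_member NAME_A EVENTS_A NAME_B EVENTS_B
# Assumption: events look like "0.39013" and compare left part first, then right.
stale_member() {
    a_hi=${2%%.*}; a_lo=${2#*.}
    b_hi=${4%%.*}; b_lo=${4#*.}
    if [ "$a_hi" -lt "$b_hi" ] ||
       { [ "$a_hi" -eq "$b_hi" ] && [ "$a_lo" -lt "$b_lo" ]; }; then
        echo "$1"
    elif [ "$a_hi" -eq "$b_hi" ] && [ "$a_lo" -eq "$b_lo" ]; then
        echo "none"
    else
        echo "$3"
    fi
}

# Example (as root), saving the output first:
#   mdadm --examine /dev/sda3 > sda3.txt
#   mdadm --examine /dev/sdb3 > sdb3.txt
#   stale_member sda3 "$(examine_events sda3.txt)" sdb3 "$(examine_events sdb3.txt)"
```

With the values from this thread the stale member is sda3, which matches the [_U] status of md1.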

You may as well add /dev/sda3 back to the raid set and see what happens. The add process will rebuild the raid which will involve a complete write to /dev/sda3. If this fails, dmesg will be informative.
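As a sketch of what that could look like: the helper below (a hypothetical name) lists arrays whose status shows fewer active members than configured, such as [2/1], and the commented commands show the re-add itself. The device path is an assumption taken from the --examine output above, where the members appear as /dev/mapper/sda3 and /dev/mapper/sdb3; adjust it to whatever your system actually uses.

```shell
#!/bin/sh
# List md arrays running with fewer members than configured, by looking
# for a status such as "[2/1]" in mdstat-format text.
# Usage: degraded_arrays [FILE]   (FILE defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            # Strip the brackets and split "configured/active".
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (name != "" && n[2] + 0 < n[1] + 0) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Re-adding the dropped member (as root); the device path is an assumption:
#   mdadm /dev/md1 --add /dev/mapper/sda3
#   cat /proc/mdstat      # shows recovery progress while the mirror rebuilds
#   dmesg | tail          # look here if the rebuild aborts
```

Once the rebuild finishes, degraded_arrays should print nothing and md1 should show [2/2] [UU].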

You can also install smartmontools and run smartctl to look at the drive's internal error log. If you are going to add /dev/sda3 back to the raid, there is no point in running any of the smartctl tests - the rebuild will do a much better job of testing.
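For the smartctl part, something along these lines should do. The filter function is a made-up helper, and it assumes the usual ATA attribute table printed by smartctl -A, with the attribute name in column 2 and the raw value in column 10; some drives report different attribute names.

```shell
#!/bin/sh
# Print the reallocation-related attributes and their raw values from
# `smartctl -A` output read on stdin.
# Assumption: attribute name is field 2, raw value is field 10.
realloc_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Reallocated_Event_Count" {
             print $2, $10
         }'
}

# Typical use (as root, with smartmontools installed):
#   smartctl -l error /dev/sda              # the drive's internal error log
#   smartctl -A /dev/sda | realloc_counts   # reallocation counters only
```

Saving the counters with a date each time makes it easier to see whether they are climbing.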

It's worth running
Code:
echo repair > /sys/block/mdX/md/sync_action
regularly to compare both drives and rewrite any failed blocks on one drive by copying them from the other. Ideally, it will never have to do anything, but blocks do fail to read from time to time and this keeps your data in good shape.
Do keep an eye on the smartctl logs. A quickly rising reallocated event count is a cause for concern.
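
If you want to script the regular repair, here is a minimal sketch. The function names and the SYSFS_ROOT variable are made up for this example; sync_action and mismatch_cnt are the md sysfs files, and writing to sync_action needs root.

```shell
#!/bin/sh
# SYSFS_ROOT is overridable only so the functions can be tried against a
# scratch directory; on a real system leave it at /sys/block.
SYSFS_ROOT=${SYSFS_ROOT:-/sys/block}

# Start a scrub on one array. Usage: md_scrub md1 [check|repair]
# "check" only counts mismatches, "repair" also rewrites them.
md_scrub() {
    md=$1
    action=${2:-repair}
    case $action in
        check|repair) ;;
        *) echo "unsupported action: $action" >&2; return 2 ;;
    esac
    f="$SYSFS_ROOT/$md/md/sync_action"
    [ -w "$f" ] || { echo "cannot write $f" >&2; return 1; }
    echo "$action" > "$f"
}

# Print the mismatch count recorded by the last check/repair pass.
# Usage: md_mismatches md1
md_mismatches() {
    cat "$SYSFS_ROOT/$1/md/mismatch_cnt"
}
```

Calling md_scrub md1 from a weekly or monthly cron job would be one way to run it regularly; progress shows up in /proc/mdstat while the pass is running.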
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Vieri
Guru


Joined: 18 Dec 2005
Posts: 490

Posted: Sun May 19, 2013 4:07 pm

Thanks. Sync in progress...