Gentoo Forums
RAID recovered from a failure, but what about the failure?
ptbarnett
n00b


Joined: 24 Nov 2002
Posts: 25

Posted: Wed Apr 16, 2003 6:17 am    Post subject: RAID recovered from a failure, but what about the failure?

After running on a single IDE disk for about a month (as /dev/hda), I decided to add another disk for redundancy. After poking around the 'Net, I found the necessary steps to create a RAID device out of the new disk (on /dev/hdc), copy all data from /dev/hda to it, then use raidhotadd to tie both disks together.

The next day, I got the following failure in the syslog:

Code:
Apr 14 10:54:52 home kernel: hda: timeout waiting for DMA
Apr 14 10:54:52 home kernel: hda: status timeout: status=0xd0 { Busy }
Apr 14 10:54:52 home kernel: hda: drive not ready for command


followed by many:

Code:
Apr 14 10:54:52 home kernel: end_request: I/O error, dev 03:03 (hda), sector xxxx


for about 15 sectors. Then, I got the following every 30 minutes:

Code:
Apr 14 13:43:13 home kernel: end_request: I/O error, dev 03:00 (hda), sector 0
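The `dev 03:03` and `dev 03:00` fields in the lines above are major:minor device numbers. As a rough sketch for reading them: IDE major 3 covers hda and hdb, major 22 covers hdc and hdd, and each drive gets 64 minors, so `03:03` would be the partition /dev/hda3 and `03:00` the whole disk /dev/hda. The function name below is made up for illustration, and it assumes the kernel prints both numbers in hexadecimal, as 2.4-era kernels do.

```python
import re

# Classic Linux IDE block majors: two drives per major, 64 minors per drive.
IDE_MAJORS = {3: ("hda", "hdb"), 22: ("hdc", "hdd")}

# Matches the "dev MM:mm (name), sector N" part of an end_request line.
LINE_RE = re.compile(r"dev ([0-9a-fA-F]+):([0-9a-fA-F]+) \((\w+)\), sector (\d+)")


def decode_end_request(line):
    """Return (device_node, sector) for an end_request log line, or None.

    Hypothetical helper: assumes major and minor are printed in hex.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    major = int(m.group(1), 16)
    minor = int(m.group(2), 16)
    drives = IDE_MAJORS.get(major)
    if drives is None or minor >= 128:
        return None  # not one of the IDE majors handled by this sketch
    drive = drives[minor // 64]      # first or second drive on the channel
    partition = minor % 64           # 0 means the whole disk
    node = drive if partition == 0 else f"{drive}{partition}"
    return (f"/dev/{node}", int(m.group(4)))


if __name__ == "__main__":
    sample = ("Apr 14 13:43:13 home kernel: end_request: I/O error, "
              "dev 03:00 (hda), sector 0")
    print(decode_end_request(sample))
```

Read this way, the first burst of errors hit the hda3 partition, while the recurring error every 30 minutes was a read of sector 0 of the whole disk.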


Since I use the system as a server and don't even look at the console, I didn't see the messages telling me that the RAID layer had disabled one of the mirrors. About a day and a half after the failure, I happened to walk by the server, noticed continuous disk activity, and went to investigate.
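One way to avoid missing a disabled mirror on a headless box is to check the array status periodically instead of relying on console messages. The following is only a sketch: it parses text in the usual /proc/mdstat layout, where a healthy two-disk mirror shows `[UU]` and a degraded one shows an underscore such as `[_U]`. The sample text and the function name are invented for illustration and are not output from this system.

```python
import re

# Illustrative text in /proc/mdstat layout (made up, not from the real box).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 hdc2[1] hda2[0](F)
      4000000 blocks [2/1] [_U]
md3 : active raid1 hdc3[1] hda3[0]
      8000000 blocks [2/2] [UU]
unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+) :", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like "[2/1] [_U]"; "_" marks a failed or missing disk.
        status = re.search(r"\[\d+/\d+\] \[([U_]+)\]", line)
        if status and current is not None:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system the text would come from reading /proc/mdstat, and the check could run from cron and send mail whenever the returned list is not empty.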

I had enabled SMART on the drives, but was unable to get any information from the disabled drive. I finally rebooted and the system came up with no problem, with the failing mirror disabled. smartctl /dev/hda didn't show anything unusual: the drive itself hasn't logged any errors.

I've run raidhotadd /dev/md{2,3} /dev/hda{2,3} to restore the mirror on /dev/hda, and everything appears to be back to normal. But what should I be worried about? Did I have some sort of controller failure? Or an IDE driver failure?
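The braces in that command are shorthand. Typed literally into bash they would expand to four arguments in a single call, so presumably raidhotadd was run once per array. A small sketch that spells out the individual commands without running them, assuming (as in this post) that each md number matches its partition number; the helper name is hypothetical:

```python
def raidhotadd_commands(disk, partitions):
    """Build one raidhotadd command per array (nothing is executed).

    Assumes /dev/mdN is mirrored onto partition N of the given disk.
    """
    return [["raidhotadd", f"/dev/md{n}", f"/dev/{disk}{n}"]
            for n in partitions]


if __name__ == "__main__":
    for cmd in raidhotadd_commands("hda", [2, 3]):
        print(" ".join(cmd))
```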

It appears that RAID performed exactly as designed: it ran so well on the remaining drive that I didn't even notice. :) But I'd like to understand the failure and take the necessary steps to prevent it, in case the next one isn't so forgiving.
All times are GMT