Gentoo Forums :: Kernel & Hardware

SOLVED: Broken mirror - mdadm raid1 problem
Woolong
n00b

Joined: 03 Feb 2004
Posts: 62
Location: Hong Kong

PostPosted: Tue Oct 25, 2005 10:08 am    Post subject: SOLVED: Broken mirror - mdadm raid1 problem

Hi,

It's a simple two-disk RAID1 setup. Each disk has seven usable partitions, the same size on both disks; sda4 and sdb4 are the extended "container" partitions that hold logical partitions 5-8. I followed the Gentoo softraid guide and used mdadm to create the RAID1 arrays.
Code:

cat /etc/mdadm.conf
DEVICE /dev/sda* /dev/sdb*
ARRAY /dev/md0 level=raid1 num-devices=2 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/sda2,/dev/sdb2
ARRAY /dev/md2 level=raid1 num-devices=2 devices=/dev/sda3,/dev/sdb3
ARRAY /dev/md3 level=raid1 num-devices=2 devices=/dev/sda5,/dev/sdb5
ARRAY /dev/md4 level=raid1 num-devices=2 devices=/dev/sda6,/dev/sdb6
ARRAY /dev/md5 level=raid1 num-devices=2 devices=/dev/sda7,/dev/sdb7
ARRAY /dev/md6 level=raid1 num-devices=2 devices=/dev/sda8,/dev/sdb8
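For reference, each array would have been created along the lines of the guide's mdadm invocation. This is only a sketch of what that looks like for md0; the exact command used isn't in the post:
Code:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1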

The system boots and runs okay, but for some reason the sdb* partitions are not "up": every array below shows [2/1] [U_], meaning two devices expected but only one active.
Code:

cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[0]
      1003968 blocks [2/1] [U_]

md2 : active raid1 sda3[0]
      505920 blocks [2/1] [U_]

md3 : active raid1 sda5[0]
      60556864 blocks [2/1] [U_]

md4 : active raid1 sda6[0]
      8008256 blocks [2/1] [U_]

md5 : active raid1 sda7[0]
      2008000 blocks [2/1] [U_]

md6 : active raid1 sda8[0]
      449664 blocks [2/1] [U_]

md0 : active raid1 sda1[0]
      72192 blocks [2/1] [U_]

md6 is supposed to have both sda8 and sdb8, but at the bottom of the --detail output below it says "removed".
Code:

mdadm /dev/md6
/dev/md6: 439.13MiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.
/dev/md6: No md super block found, not an md component.

Code:

mdadm --detail /dev/md6
/dev/md6:
        Version : 00.90.01
  Creation Time : Tue Oct 25 16:36:04 2005
     Raid Level : raid1
     Array Size : 449664 (439.20 MiB 460.46 MB)
    Device Size : 449664 (439.20 MiB 460.46 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Wed Oct 26 17:00:45 2005
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : e4dd3202:3e737875:5914cfa4:71d8b905
         Events : 0.731

    Number   Major   Minor   RaidDevice State
       0       8        8        0      active sync   /dev/sda8
       1       0        0        -      removed

I checked the man page, and tried to get as much info as possible...
Code:

mdadm --examine /dev/sdb8
/dev/sdb8:
          Magic : a92b4efc
        Version : 00.90.01
           UUID : e4dd3202:3e737875:5914cfa4:71d8b905
  Creation Time : Tue Oct 25 16:36:04 2005
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6

    Update Time : Tue Oct 25 17:58:53 2005
          State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 1e2c541b - correct
         Events : 0.555


      Number   Major   Minor   RaidDevice State
this     2       8       24        2      spare   /dev/sdb8

   0     0       8        8        0      active sync   /dev/sda8
   1     1       0        0        1      faulty removed
   2     2       8       24        2      spare   /dev/sdb8

From dmesg, I realized that all the sdb* partitions are bound to their arrays at first, then get kicked out for being "non-fresh".
Code:

md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb8 ...
md:  adding sdb8 ...
md: sdb7 has different UUID to sdb8
md: sdb6 has different UUID to sdb8
md: sdb5 has different UUID to sdb8
md: sdb3 has different UUID to sdb8
md: sdb2 has different UUID to sdb8
md: sdb1 has different UUID to sdb8
md:  adding sda8 ...
md: sda7 has different UUID to sdb8
md: sda6 has different UUID to sdb8
md: sda5 has different UUID to sdb8
md: sda3 has different UUID to sdb8
md: sda2 has different UUID to sdb8
md: sda1 has different UUID to sdb8
md: created md6
md: bind<sda8>
md: bind<sdb8>
md: running: <sdb8><sda8>
md: kicking non-fresh sdb8 from array!
md: unbind<sdb8>
md: export_rdev(sdb8)
raid1: raid set md6 active with 1 out of 2 mirrors
md: considering sdb7 ...
md:  adding sdb7 ...
md: sdb6 has different UUID to sdb7
md: sdb5 has different UUID to sdb7
md: sdb3 has different UUID to sdb7
md: sdb2 has different UUID to sdb7
md: sdb1 has different UUID to sdb7
md:  adding sda7 ...
md: sda6 has different UUID to sdb7
md: sda5 has different UUID to sdb7
md: sda3 has different UUID to sdb7
md: sda2 has different UUID to sdb7
md: sda1 has different UUID to sdb7
md: created md5
md: bind<sda7>
md: bind<sdb7>
md: running: <sdb7><sda7>
md: kicking non-fresh sdb7 from array!
md: unbind<sdb7>
md: export_rdev(sdb7)
raid1: raid set md5 active with 1 out of 2 mirrors
md: considering sdb6 ...
md:  adding sdb6 ...
md: sdb5 has different UUID to sdb6
md: sdb3 has different UUID to sdb6
md: sdb2 has different UUID to sdb6
md: sdb1 has different UUID to sdb6
md:  adding sda6 ...
md: sda5 has different UUID to sdb6
md: sda3 has different UUID to sdb6
md: sda2 has different UUID to sdb6
md: sda1 has different UUID to sdb6
md: created md4
md: bind<sda6>
md: bind<sdb6>
md: running: <sdb6><sda6>
md: kicking non-fresh sdb6 from array!
md: unbind<sdb6>
md: export_rdev(sdb6)
raid1: raid set md4 active with 1 out of 2 mirrors
md: considering sdb5 ...
md:  adding sdb5 ...
md: sdb3 has different UUID to sdb5
md: sdb2 has different UUID to sdb5
md: sdb1 has different UUID to sdb5
md:  adding sda5 ...
md: sda3 has different UUID to sdb5
md: sda2 has different UUID to sdb5
md: sda1 has different UUID to sdb5
md: created md3
md: bind<sda5>
md: bind<sdb5>
md: running: <sdb5><sda5>
md: kicking non-fresh sdb5 from array!
md: unbind<sdb5>
md: export_rdev(sdb5)
raid1: raid set md3 active with 1 out of 2 mirrors
md: considering sdb3 ...
md:  adding sdb3 ...
md: sdb2 has different UUID to sdb3
md: sdb1 has different UUID to sdb3
md:  adding sda3 ...
md: sda2 has different UUID to sdb3
md: sda1 has different UUID to sdb3
md: created md2
md: bind<sda3>
md: bind<sdb3>
md: running: <sdb3><sda3>
md: kicking non-fresh sdb3 from array!
md: unbind<sdb3>
md: export_rdev(sdb3)
raid1: raid set md2 active with 1 out of 2 mirrors
md: considering sdb2 ...
md:  adding sdb2 ...
md: sdb1 has different UUID to sdb2
md:  adding sda2 ...
md: sda1 has different UUID to sdb2
md: created md1
md: bind<sda2>
md: bind<sdb2>
md: running: <sdb2><sda2>
md: kicking non-fresh sdb2 from array!
md: unbind<sdb2>
md: export_rdev(sdb2)
raid1: raid set md1 active with 1 out of 2 mirrors
md: considering sdb1 ...
md:  adding sdb1 ...
md:  adding sda1 ...
md: created md0
md: bind<sda1>
md: bind<sdb1>
md: running: <sdb1><sda1>
md: kicking non-fresh sdb1 from array!
md: unbind<sdb1>
md: export_rdev(sdb1)
raid1: raid set md0 active with 1 out of 2 mirrors
md: ... autorun DONE.

I tried commands like mdadm /dev/md6 -a /dev/sdb8, but it won't bind the partitions on sdb. :(
Any ideas? Please help!
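(For anyone hitting the same thing: "non-fresh" means the event counter in sdb8's superblock lags behind the array's, 0.555 versus 0.731 in the outputs above. A quick way to compare the two halves, as a sketch using the device names from this post:)
Code:

mdadm --examine /dev/sda8 | grep Events
mdadm --examine /dev/sdb8 | grep Events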


Last edited by Woolong on Sat Nov 05, 2005 4:31 am; edited 1 time in total
bLanark
Apprentice

Joined: 27 Aug 2002
Posts: 181
Location: Royal Berkshire, UK

PostPosted: Thu Nov 03, 2005 12:08 am    Post subject: Me too

I've just encountered this too. I thought I'd give this a bump and see if anyone notices it... :-)
bLanark
Apprentice

Joined: 27 Aug 2002
Posts: 181
Location: Royal Berkshire, UK

PostPosted: Thu Nov 03, 2005 12:19 am    Post subject: Solved?

I've just discovered this little snippet:

Code:

raidhotadd /dev/md0 /dev/hdg1


Of course, you'll need to use your own device names in this command.
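With mdadm instead of the old raidtools, the equivalent hot-add would be something along these lines (a sketch reusing the same device names):
Code:

mdadm /dev/md0 --add /dev/hdg1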

The rebuild takes some time; you can check progress:

Code:

# cat /proc/mdstat
Personalities : [raid1] [raid5] [raid6] [raid10]
md0 : active raid1 hdg1[2] hde1[0]
      245111616 blocks [2/1] [U_]
      [>....................]  recovery =  4.1% (10191616/245111616) finish=66.2min speed=59105K/sec


I guess I'll know in the morning how it went :-)
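To watch the rebuild tick over without re-running the command, something like this works (a sketch; assumes the watch utility from procps is installed):
Code:

watch -n 5 cat /proc/mdstat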

If more than one array has failed, then I'd advise you to rebuild your arrays one at a time, especially if they share physical drives.
bLanark
Apprentice

Joined: 27 Aug 2002
Posts: 181
Location: Royal Berkshire, UK

PostPosted: Fri Nov 04, 2005 11:09 pm    Post subject: All Cool and Froody

Well, that worked well, and now I have:

Code:

# cat /proc/mdstat
Personalities : [raid1] [raid5] [raid6] [raid10]
md0 : active raid1 hdg1[1] hde1[0]
      245111616 blocks [2/2] [UU]


Yay!

(Hopefully I'll survive the next reboot too)
Woolong
n00b

Joined: 03 Feb 2004
Posts: 62
Location: Hong Kong

PostPosted: Sat Nov 05, 2005 4:30 am

Thanks for the reply. After waiting a couple of days without getting any replies, I gave up on the forums and turned to the Software-RAID HOWTO: http://tldp.org/HOWTO/Software-RAID-HOWTO.html

I don't have raidtools; I only emerged mdadm. I think the command you listed above is pretty much the same as this one:
Code:

mdadm /dev/md6 --add /dev/sdb8

It failed to bind the second device to the mirror array.

I thought maybe there was something wrong with the superblock, so I tried to wipe the whole drive with fdisk: deleted all the partitions on the disk, repartitioned it the same way as the first disk, rebooted, and ran the command above again. It still wouldn't bind. To my surprise, when I mounted the partitions on sdb, some data was still there!

I thought the drive was wiped out when I deleted all the partitions with fdisk. Guess not.
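That makes sense: fdisk only rewrites the partition table, so the file system contents and the md superblock (which the 0.90 metadata format keeps near the end of each partition) survive repartitioning. Explicitly clearing the stale metadata might have let the partition join cleanly. A sketch, and note it destroys the md superblock on that partition:
Code:

mdadm --zero-superblock /dev/sdb8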

I'd been presuming the drive was good because it's brand new. Since I have some spare drives, it wouldn't hurt to give them a try. Upon replacing the drive, the mirror array finally accepted the second device. I haven't had time to verify whether it was a software config issue or a hardware one...

I'd like to point out that the Gentoo softraid guide http://www.gentoo.org/doc/en/gentoo-x86-tipsntricks.xml#software-raid does mention you need to install the MBR on both drives if you are mirroring, but the actual instructions aren't there (or I'm too ignorant to find them). Either way, here's what I learnt from the more complete Linux Software-RAID HOWTO:
Code:

grub
grub> device (hd0) /dev/sdb    (sdb is the second drive of my mirror)
grub> root (hd0,0)
grub> setup (hd0)
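The first drive gets the same treatment so that either disk stays bootable (a sketch; the device and partition numbers are specific to this box):
Code:

grub
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)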
bol
n00b

Joined: 27 Dec 2004
Posts: 26
Location: Stockholm, Sweden

PostPosted: Tue Aug 15, 2006 12:58 pm

Woolong wrote:

I don't have raidtools; I only emerged mdadm. I think the command you listed above is pretty much the same as this one:
Code:

mdadm /dev/md6 --add /dev/sdb8




I have exactly the same problem: the secondary disc isn't syncing. Well, one partition syncs, but none of the rest.
Wish me luck...

Thanks
bol
n00b

Joined: 27 Dec 2004
Posts: 26
Location: Stockholm, Sweden

PostPosted: Tue Sep 05, 2006 2:51 pm

Now I'm there again...
The discs are not syncing.

And smartctl gives this output:
Code:
tw0t ~ # smartctl -H /dev/hdc
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
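Following smartctl's own hint, retrying with -T permissive (and kicking off a short self-test) may still coax a health verdict out of the drive. A sketch using standard smartmontools options:
Code:

smartctl -H -T permissive /dev/hdc
smartctl -t short /dev/hdc
smartctl -l selftest /dev/hdc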


mdstat:
Code:
tw0t ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc1[2](F) hda1[0]
      56128 blocks [2/1] [U_]

md3 : active raid1 hdc3[2](F) hda3[0]
      5855616 blocks [2/1] [U_]

md5 : active raid1 hdc5[2](F) hda5[0]
      19534912 blocks [2/1] [U_]

md6 : active raid1 hdc6[2](F) hda6[0]
      39061952 blocks [2/1] [U_]

md7 : active raid1 hdc7[2](F) hda7[0]
      11679104 blocks [2/1] [U_]

md8 : active raid1 hdc8[2](F) hda8[0]
      979840 blocks [2/1] [U_]


And dmesg shows this:
Code:

end_request: I/O error, dev hdc, sector 112191
raid1: hdc1: rescheduling sector 112128
end_request: I/O error, dev hdc, sector 112193
raid1: hdc1: rescheduling sector 112130
end_request: I/O error, dev hdc, sector 112195
raid1: hdc1: rescheduling sector 112132
end_request: I/O error, dev hdc, sector 112197
raid1: hdc1: rescheduling sector 112134
end_request: I/O error, dev hdc, sector 112191
end_request: I/O error, dev hdc, sector 112191
raid1: Disk failure on hdc1, disabling device.
        Operation continuing on 1 devices
raid1: hda1: redirecting sector 112128 to another mirror
raid1: hda1: redirecting sector 112130 to another mirror
raid1: hda1: redirecting sector 112132 to another mirror
raid1: hda1: redirecting sector 112134 to another mirror
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:hda1
 disk 1, wo:1, o:0, dev:hdc1
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:hda1
end_request: I/O error, dev hdc, sector 112319
Buffer I/O error on device hdc1, logical block 14032
end_request: I/O error, dev hdc, sector 112319
Buffer I/O error on device hdc1, logical block 14032
end_request: I/O error, dev hdc, sector 112439
end_request: I/O error, dev hdc, sector 112439
printk: 2 messages suppressed.
Buffer I/O error on device hdc2, logical block 244960
end_request: I/O error, dev hdc, sector 2072135
end_request: I/O error, dev hdc, sector 2072367
end_request: I/O error, dev hdc, sector 2072367
printk: 3 messages suppressed.
Buffer I/O error on device hdc3, logical block 11711232
end_request: I/O error, dev hdc, sector 13783618
Buffer I/O error on device hdc3, logical block 11711233
end_request: I/O error, dev hdc, sector 13783619
Buffer I/O error on device hdc3, logical block 11711234
end_request: I/O error, dev hdc, sector 13783620
Buffer I/O error on device hdc3, logical block 11711235
end_request: I/O error, dev hdc, sector 13783621
Buffer I/O error on device hdc3, logical block 11711236
end_request: I/O error, dev hdc, sector 13783622
Buffer I/O error on device hdc3, logical block 11711237
end_request: I/O error, dev hdc, sector 13783623
Buffer I/O error on device hdc3, logical block 11711238
end_request: I/O error, dev hdc, sector 13783624
end_request: I/O error, dev hdc, sector 13783617
end_request: I/O error, dev hdc, sector 13783618
end_request: I/O error, dev hdc, sector 13783619
........


I guess the disk is f*cked, right?
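The (F) markers in mdstat and the stream of I/O errors certainly point that way. Before swapping the drive, the failed member would normally be removed from each array first; a sketch for one of them, to be repeated per array:
Code:

mdadm /dev/md1 --remove /dev/hdc1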


Last edited by bol on Wed Sep 06, 2006 12:05 pm; edited 1 time in total
bol
n00b

Joined: 27 Dec 2004
Posts: 26
Location: Stockholm, Sweden

PostPosted: Wed Sep 06, 2006 12:01 pm

*bump*