Gentoo Forums
Broke my dmraid? Help please!
E-Razor
n00b

Joined: 11 Jul 2004
Posts: 50

Posted: Mon Jul 08, 2013 4:46 pm    Post subject: Broke my dmraid? Help please!

Hi all,

I'm having quite a serious problem.

Today my server hung and I rebooted it - without any luck. Since it's a root server I started the recovery image and tried to mount my root-partition.

I am using dmraid, and unfortunately started with:
mdadm --create --level=1 --disk-count=2 /dev/md0 /dev/sda2 /dev/sdb2

It tried to resync, and I had to reboot. After the reboot I tried with:
mdadm --assemble /dev/md0 /dev/sda2 /dev/sdb2

The resync took quite a long time, and afterwards I was still not able to mount /dev/md0.

Kernel log is:
Code:

[  156.281879] md: md0 stopped.
[  156.282986] md: bind<sdb2>
[  156.283161] md: bind<sda2>
[  156.297031] md: raid1 personality registered for level 1
[  156.299897] md/raid1:md0: not clean -- starting background reconstruction
[  156.299904] md/raid1:md0: active with 2 out of 2 mirrors
[  156.299942] md0: detected capacity change from 0 to 248841306112
[  156.307937] md: resync of RAID array md0
[  156.307944] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  156.307947] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[  156.307952] md: using 128k window, over a total of 243009088k.
[  156.417063]  md0: unknown partition table
[  935.195018] SQUASHFS error: Can't find a SQUASHFS superblock on md0
[  935.195661] EXT4-fs (md0): VFS: Can't find ext4 filesystem
[  937.801446] EXT4-fs (md0): VFS: Can't find ext4 filesystem
[ 5023.803727] md: md0: resync done.
[ 5023.892662] RAID1 conf printout:
[ 5023.892665]  --- wd:2 rd:2
[ 5023.892668]  disk 0, wo:0, o:1, dev:sda2
[ 5023.892671]  disk 1, wo:0, o:1, dev:sdb2
[ 5360.859027] EXT4-fs (md0): VFS: Can't find ext4 filesystem
[ 5363.666290] SQUASHFS error: Can't find a SQUASHFS superblock on md0
[ 5363.666867] EXT4-fs (md0): VFS: Can't find ext4 filesystem



I'd appreciate any help.

Thanks a lot!

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 31359
Location: 56N 3W

Posted: Mon Jul 08, 2013 6:10 pm

E-Razor,

mdadm --create is a very bad thing to do. It writes new raid metadata, which, in effect, destroys your old raid. The sync won't have helped either.

However, all may not be lost. Creating new raid metadata is harmless *if* it's identical to the old metadata. User data is not harmed in the process.
The downside is that mdadm's defaults changed a few months ago, so if your original raid was a year or more old and you did not specify the parameters explicitly, you now have raid metadata version 1.2 but the old one was version 0.9.

So when did you create the old raid and how?

It gets slightly worse. Raid version 0.9 metadata is written at the end of the volume and the filesystem starts in the usual place, as if the volume is not a member of a raid set.
Raid version 1.2 metadata is written at the start of the volume and tramples over the primary extX filesystem superblock. That means you can no longer mount the filesystem using the primary superblock, which is what the standard invocation of mount does.
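
As a read-only check of which signatures a member currently carries, something like the following sketch may help. It assumes the standard on-disk offsets (ext2/3/4 magic 0xEF53 at byte 1080, md version 1.2 magic 0xa92b4efc at byte 4096, both stored little-endian). The function names `read_hex` and `probe` are made up for illustration, and the script only reads from the device or image it is given.

```shell
#!/bin/sh
# Read-only probe: report which on-disk signatures are present.
# Assumed offsets:
#   ext2/3/4: superblock at byte 1024, s_magic at +56 -> byte 1080, bytes 53 ef
#   md v1.2:  superblock at byte 4096, magic 0xa92b4efc -> bytes fc 4e 2b a9

# read_hex FILE OFFSET COUNT: print COUNT bytes at OFFSET as a hex string
read_hex() {
    od -An -v -tx1 -j "$2" -N "$3" "$1" | tr -d ' \n'
}

# probe FILE: print one line per signature found (hypothetical helper)
probe() {
    [ "$(read_hex "$1" 1080 2)" = "53ef" ] &&
        echo "ext superblock magic at byte 1080"
    [ "$(read_hex "$1" 4096 4)" = "fc4e2ba9" ] &&
        echo "md v1.2 superblock magic at byte 4096"
    return 0
}
```

Run it as root against a member, for example `probe /dev/sdb2`. `mdadm --examine` on the same member remains the authoritative source for the metadata version.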

If we know what you used to have, something may be recoverable.

Your partition table would also be useful but I suppose that is inside the raid set and no longer available.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

E-Razor
n00b

Joined: 11 Jul 2004
Posts: 50

Posted: Mon Jul 08, 2013 6:18 pm

I think the versions are the same since it's also the same rescue image.

I created it about 1 year ago.

The partition table looks like this:
Code:

root@grml ~ # fdisk -l

Disk /dev/sda: 250.1 GB, 250059350016 bytes
224 heads, 56 sectors/track, 38934 cylinders, total 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x52c44f76

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              56     2107391     1053668   82  Linux swap / Solaris
/dev/sda2   *     2107392   488388095   243140352   fd  Linux raid autodetect

Disk /dev/sdb: 250.1 GB, 250059350016 bytes
224 heads, 56 sectors/track, 38934 cylinders, total 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x9fa5628b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              56     2107391     1053668   82  Linux swap / Solaris
/dev/sdb2   *     2107392   488388095   243140352   fd  Linux raid autodetect



The good thing is that I was able to mount sda2 after I did:
# mdadm --stop /dev/md0
and
# e2fsck /dev/sda2

I did not finish e2fsck. It told me about a second block, which it used, and I answered "no" to all the other questions. Then I was able to mount again.

I'm going to back up as much as possible now.

The next step would be to enable md0 again. Maybe I can also fsck md0, which could fix my filesystem.

Do you think this would help?

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 31359
Location: 56N 3W

Posted: Mon Jul 08, 2013 8:25 pm

E-Razor,

If you allowed half the raid to mount rw, the two mirrors are now out of sync.
Recover what you can from /dev/sda2. It's very important that you do not write to a damaged raid/filesystem until you understand the damage.
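
One way to avoid writing to the damaged member at all is to work on a copy. The following is only a sketch: `image_member` is a made-up helper name, and the destination must be on a different disk with enough free space.

```shell
#!/bin/sh
# image_member SRC DEST: copy SRC to DEST, then verify the copy byte for byte.
# Hypothetical helper; SRC is only read, never written.
image_member() {
    dd if="$1" of="$2" bs=1024k 2>/dev/null && cmp -s "$1" "$2"
}
```

For example, `image_member /dev/sda2 /mnt/backup/sda2.img`, where `/mnt/backup` is a placeholder path. The image can then be checked with `e2fsck -n`, which opens the filesystem read-only and answers "no" to every question.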

I suspect you used to have raid superblock version 0.9 but now you have version 1.2.
What does
Code:
mdadm -E /dev/sdb2
show?
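
The sizes already posted in this thread point the same way. The sketch below uses the partition size from the fdisk output and the array size from the resync line in the kernel log. It assumes the version 0.9 layout rule that the superblock occupies the last 64 KiB-aligned 64 KiB block of the member. Reading the gap as space reserved by the newer metadata is an assumption; `mdadm -E` reports the actual data offset.

```shell
#!/bin/sh
# Sizes in KiB, copied from output posted earlier in this thread.
part_kib=243140352   # /dev/sda2 "Blocks" column from fdisk -l
md_kib=243009088     # "over a total of 243009088k" from the md resync log

# With 0.9 metadata the usable size is the member size rounded down
# to a multiple of 64 KiB, minus one 64 KiB block for the superblock.
v090_kib=$(( part_kib / 64 * 64 - 64 ))

# Space inside the partition that the current md0 does not expose.
missing_kib=$(( part_kib - md_kib ))

echo "array size expected with 0.9 metadata: ${v090_kib} KiB"
echo "array size reported by the kernel:     ${md_kib} KiB"
echo "partition space not part of md0:       ${missing_kib} KiB"
```

A 0.9 array on this partition would be 243140288 KiB, but the kernel reports 243009088 KiB. The missing 131264 KiB (about 128 MiB) is far more than the 64 KiB a 0.9 superblock needs, which fits a newer metadata format that reserves space ahead of the data.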

It sounds like fsck repaired your filesystem superblock, which was damaged as I described above when a raid 1.2 superblock was written in the middle of it.

What counts is not the rescue image you used but the versions of mdadm.

How did you start your raid?
With kernel raid auto assemble or some other way?

E-Razor
n00b

Joined: 11 Jul 2004
Posts: 50

Posted: Tue Jul 09, 2013 11:10 am

I finally got it to work again.

Thanks for your hints!

I had version 0.9 and ran "mdadm --create ...", which confused my system.

However, after running e2fsck on one of the partitions I was able to get my data back. I disconnected the working partition from the raid and formatted the raid again, then copied the old files into the empty raid partition.

My init was broken and I simply did a chroot and emerge which helped and the system is online now.

It seems I've had a lot of luck that the --create did not destroy the partition.

iandoug
Apprentice

Joined: 11 Feb 2005
Posts: 290
Location: Cape Town, South Africa

Posted: Sun Sep 22, 2013 1:45 pm

NeddySeagoon wrote:


It gets slightly worse. Raid version 0.9 metadata is written at the end of the volume and the filesystem starts in the usual place, as if the volume is not a member of a raid set.
Raid version 1.2 metadata is written at the start of the volume and tramples over the primary extX filesystem superblock. That means you can no longer mount the filesystem using the primary superblock, which is what the standard invocation of mount does.

If we know what you used to have, something may be recoverable.

Your partition table would also be useful but I suppose that is inside the raid set and no longer available.


I did a normal update and noticed portage wanted me to update mdadm.conf, which I did ... I accepted the new version as I did not notice anything unusual.

Now I can't see my drives... my /home is on them.

I get the message in dmesg about "invalid raid superblock magic" .... sdb1 and sdc1 do not have a valid 0.9 superblock and are not imported.

What you describe the new version doing sounds like the height of dumb to me unless there was a way to automagically deal with existing disks.

Installed version of mdadm is 3.2.6

/dev has md, md0 and md127, while fstab has /dev/md1

What can a desperate person do under these conditions? I need the box to work... :-)

thanks, Ian
_________________
Asus M3A78 64, X2 6000+, PX9800 GT, 4GB Ram | Asus M4A77TD PRO, X2 245, HD4350, 4GB RAM

iandoug
Apprentice

Joined: 11 Feb 2005
Posts: 290
Location: Cape Town, South Africa

Posted: Sun Sep 22, 2013 1:51 pm    Post subject: would downgrading help?

Would it help to downgrade back to mdadm 3.1.4?

Jaglover
Advocate

Joined: 29 May 2005
Posts: 4563
Location: Saint Amant, Acadiana

Posted: Sun Sep 22, 2013 2:21 pm

Quote:
/dev has md, md0 and md127, while fstab has /dev/md1


It is possible it is md0 or md127 now, did you look at those volumes?
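
A quick way to see which md devices the kernel has actually assembled, and from which members, is to read /proc/mdstat. The helper below is only a sketch, and the name `list_md_arrays` is made up. It prints one line per array and takes an optional file argument so it can be tried on a saved copy.

```shell
#!/bin/sh
# list_md_arrays [FILE]: print "/dev/mdX: state level members..." per array.
# Reads /proc/mdstat by default; hypothetical helper, read-only.
list_md_arrays() {
    awk '$1 ~ /^md/ && $2 == ":" {
        name = $1
        $1 = ""; $2 = ""          # drop the name and the colon
        sub(/^ +/, "")            # trim the gap they leave behind
        print "/dev/" name ": " $0
    }' "${1:-/proc/mdstat}"
}

list_md_arrays "$@"
```

If the members show up under md127 rather than md1, the array was assembled under a fallback name and fstab no longer matches it.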
_________________
Please learn how to denote units correctly!

iandoug
Apprentice

Joined: 11 Feb 2005
Posts: 290
Location: Cape Town, South Africa

Posted: Sun Sep 22, 2013 2:38 pm    Post subject: solved

iandoug wrote:

What can a desperate person do under these conditions? I need the box to work... :-)



Edit mdadm.conf and specify the DEVICEs and ARRAY and reboot ...
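
The actual lines were not posted. As a rough sketch of what such entries can look like, assuming the array is /dev/md1 (the name in fstab) and its members are sdb1 and sdc1 (the devices named in the dmesg message):

```
# mdadm.conf (sketch; names taken from this thread, not the real file)
DEVICE /dev/sdb1 /dev/sdc1
ARRAY /dev/md1 devices=/dev/sdb1,/dev/sdc1
```

Identifying the array by UUID instead of by device names is generally more robust, since device names can change between boots. `mdadm --examine --scan` prints ARRAY lines in that form.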

I guess the etc-update step changed those lines and I didn't notice ...

what a relief.

cheers, Ian