1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 1:53 am Post subject: Software raid woes, losing all my data... :( HELP |
|
|
The file /etc/raidtab was somehow modified by something other than me. Later, after a kernel recompile and reboot, the md1 2-disk raid1 is showing up as a 3-disk (1 missing, of course) raid0. I disabled raid0 support in the kernel just in case, and stopped everything to prevent any further damage. I haven't done any actual writing to the drives, so I'm sure the data and structure are safe; I just need help figuring out how to reconfigure the raid settings. Does anyone have any ideas or suggestions? I have both raidtools and mdadm installed, but I'm still fairly new to software raid and not sure what to do, and I'm cautious not to lose my data.
Any help would be appreciated.
Here is some more information:
Code: | /proc/mdstat
Personalities : [raid1]
unused devices: <none> |
Code: | /etc/raidtab
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdb1
raid-disk 0
device /dev/sdc1
raid-disk 1 |
Code: | Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 30401 244196001 fd Linux raid autodetect
Disk /dev/sdc: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 30401 244196001 fd Linux raid autodetect |
Code: | mdadm -Q --examine /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.02
UUID : 494993cb:057174cb:381b694e:5486ff19
Creation Time : Fri Dec 2 10:30:23 2005
Raid Level : raid0
Device Size : 244195904 (232.88 GiB 250.06 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Update Time : Fri Dec 2 10:30:23 2005
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 1a38356f - correct
Events : 0.1
Chunk Size : 32K
Number Major Minor RaidDevice State
this 0 8 113 0 active sync
0 0 8 113 0 active sync
1 1 8 161 1 active sync |
Code: | mdadm -Q --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.02
UUID : 494993cb:057174cb:381b694e:5486ff19
Creation Time : Fri Dec 2 10:30:23 2005
Raid Level : raid0
Device Size : 244195904 (232.88 GiB 250.06 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Update Time : Fri Dec 2 10:30:23 2005
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 1a3835a1 - correct
Events : 0.1
Chunk Size : 32K
Number Major Minor RaidDevice State
this 1 8 161 1 active sync
0 0 8 113 0 active sync
1 1 8 161 1 active sync |
raidstart /dev/md0 creates the following output in dmesg:
Code: | md: could not open unknown-block(8,113).
md: could not open unknown-block(8,161).
md: autorun ...
md: considering sdc1 ...
md: adding sdc1 ...
md: created md1
md: bind<sdc1>
md: running: <sdc1>
md: personality 2 is not loaded!
md: do_md_run() returned -22
md: md1 stopped.
md: unbind<sdc1>
md: export_rdev(sdc1)
md: ... autorun DONE. |
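A note on the first two dmesg lines: the numbers in unknown-block(8,113) and unknown-block(8,161) are the major and minor device numbers that also appear in the superblock tables above. Assuming the conventional Linux numbering for SCSI/SATA disks (major 8, 16 minors per disk, covering sda through sdp), they can be decoded with a small sketch like this one. The function name is made up for illustration.

```shell
# Hypothetical helper: map a minor number on major 8 to an sd device name.
# Assumes the conventional layout of 16 minors per disk (sda..sdp).
sd_name_from_minor() {
    minor=$1
    disk=$((minor / 16))      # 0 = sda, 1 = sdb, ...
    part=$((minor % 16))      # 0 = whole disk, 1..15 = partition number
    letter=$(echo abcdefghijklmnop | cut -c "$((disk + 1))")
    if [ "$part" -eq 0 ]; then
        printf '/dev/sd%s\n' "$letter"
    else
        printf '/dev/sd%s%s\n' "$letter" "$part"
    fi
}

sd_name_from_minor 113
sd_name_from_minor 161
sd_name_from_minor 17
```

By that convention 8,113 and 8,161 would be /dev/sdh1 and /dev/sdk1, whereas /dev/sdb1 and /dev/sdc1 are 8,17 and 8,33. That mismatch may be why the kernel cannot open the devices listed in the superblock.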
Last edited by 1U on Sun Dec 04, 2005 7:16 am; edited 4 times in total |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 2:54 am Post subject: |
|
|
add something like
Code: |
ARRAY /dev/md/1 devices=/dev/sdb1,/dev/sdc1
DEVICE /dev/sd*
|
to /etc/mdadm.conf and then run # mdadm -A -s |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 2:58 am Post subject: |
|
|
Thank you for your reply. Just 2 questions before I continue though. How would it know that this is a raid 1 instead of raid 0? And would it have to literally rebuild the array as in rewrite all that data or will it simply change their configuration? |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 3:15 am Post subject: |
|
|
The newer raid model writes the information about the raid disk on each member in a header. It is possible that your disks have the old style raid on them in which case everything has to be spelled out for them. mdadm --build (mdadm --build -h for info) is the old style. mdadm --assemble (or mdadm -A) is the new one. |
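For what it's worth, the --examine output earlier in the thread reports Version 00.90.02. As far as I know, that 0.90 format keeps its superblock in a 64 KiB-aligned reserved block at the end of each member rather than at the start, so the usable "Device Size" is the partition size rounded down to a 64 KiB boundary, minus one 64 KiB block. A rough sketch of that arithmetic, assuming sizes in KiB (the function name is only illustrative):

```shell
# Sketch: usable size of a 0.90-format md member, in KiB.
# Assumption: the superblock occupies the last 64 KiB-aligned 64 KiB block.
md090_usable_kib() {
    part_kib=$1
    reserved_kib=64
    # Round down to a multiple of 64 KiB, then subtract the reserved block.
    echo $(( part_kib / reserved_kib * reserved_kib - reserved_kib ))
}

# 244196001 is the "Blocks" figure fdisk printed for /dev/sdb1.
md090_usable_kib 244196001
```

For the 244196001-block partitions above this gives 244195904, which matches the Device Size that mdadm printed. If that layout holds, the data area of a raid1 member starts at the very beginning of the partition.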
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 3:19 am Post subject: |
|
|
Code: | persistent-superblock 1 |
I believe that this should have made a new-style raid disk (one with a superblock). If not, mdadm -A will not start the raid. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 3:24 am Post subject: |
|
|
There is an inconsistency in the raid levels:
Code: |
/proc/mdstat
Personalities : [raid1]
|
Code: |
md: personality 2 is not loaded!
|
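Put differently, the level recorded in the superblock has to match a personality the kernel has actually loaded. The --examine output above says raid0, while /proc/mdstat only lists raid1. That would be consistent with the "personality 2 is not loaded" message if personality 2 is raid0, which is an assumption on my part. A small sketch of that comparison on saved text, with function names invented for illustration:

```shell
# Print the "Raid Level" value from `mdadm --examine` output on stdin.
superblock_level() {
    sed -n 's/^[[:space:]]*Raid Level[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Succeed if personality $1 appears on the Personalities line of
# /proc/mdstat-style text on stdin.
personality_loaded() {
    grep '^Personalities' | grep -F -q "[$1]"
}

# Sample lines copied from the outputs posted in this thread.
level=$(printf '     Raid Level : raid0\n' | superblock_level)
if printf 'Personalities : [raid1]\n' | personality_loaded "$level"; then
    echo "$level personality is loaded"
else
    echo "$level personality is NOT loaded"
fi
```

On a live system the same functions could be fed from mdadm --examine /dev/sdb1 and from /proc/mdstat instead of the sample lines.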
|
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 3:27 am Post subject: |
|
|
What driver/device are the disks connected to and which gcc did you compile w/? |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 3:40 am Post subject: |
|
|
Thank you for your replies. My entire system is compiled from the start with gcc 3.4. This is connected to an nvidia sata controller which also has my single drive root that works without problems. It was working before and was always autodetected. I used raidtools mkraid to create the raid a few days ago so I'm not sure if that's the old or new approach. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 3:53 am Post subject: |
|
|
mdadm is the newer raid app but mkraid w/ the superblock should have made a newer raid disk. You may need to build the kernel w/ gcc-3.3*. What is dmesg? |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 4:29 am Post subject: |
|
|
Why would I have to build the kernel with gcc 3.3 though? Dmesg is the output of the kernel; it's a command you can run and it shows you the most recent events. It's basically the same as the kernel logs you can find in /var/log/everything, etc. |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 5:42 am Post subject: |
|
|
Ok I went ahead and read up a little bit more about mdadm, and I now have the following in /etc/mdadm.conf
Code: | ARRAY /dev/md0 level=raid1 num-devices=2 devices=/dev/sdb1,/dev/sdc1 UUID=494993cb:057174cb:381b694e:5486ff19
DEVICE /dev/sd[bc]1 |
However, upon doing mdadm -Asf, I get:
Code: | mdadm: no devices found for /dev/md0 |
I know my raid used to be md1, but since everything is so messed up anyways I figured I might as well make it md0 this time (I used to have md0 but not any more).
From what I'm guessing so far I've configured all the settings for mdadm properly, but it's still relying on the old messed up superblocks to tell it which raid it's part of? How do I force the options from the config onto them?
Edit:
Ok now I just zeroed my superblock since that was messed up anyways. How do I go about rebuilding it with a new fresh one? |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 7:12 am Post subject: |
|
|
Great! Now it looks like I'm in even worse shape than when I started this thread. Looks like the superblock is unrecoverable? First I tried recover-sb, which went through, and then I tried the following without any results.
Code: | fsck.reiser4 --build-fs /dev/sdb1
*******************************************************************
This is an EXPERIMENTAL version of fsck.reiser4. Read README first.
*******************************************************************
Fscking the /dev/sdb1 block device.
Will check the consistency of the Reiser4 SuperBlock.
Will build the Reiser4 FileSystem.
Continue?
(Yes/No): yes
***** fsck.reiser4 started at Sat Dec 3 20:40:26 2005
Reiser4 fs was detected on /dev/sdb1.
Master super block (16):
magic: ReIsEr4
blksize: 4096
format: 0x0 (format40)
uuid: 49fb3835-7eb8-4419-bb30-71b1413c7e41
label: <none>
Format super block (17):
plugin: format40
description: Disk-format for reiser4.
magic: ReIsEr40FoRmAt
flushes: 18446744073709551615
mkfs id: 0x517aa8d8
blocks: 61046992
free blocks: 72032276070321805
root block: 18438022271198038272
tail policy: 0x2 (smart)
next oid: 0xdff3c2d7faf815df
file count: 10071385641674376066
tree height: 65535
key policy: LARGE
CHECKING STORAGE TREE
Warn : Reiser4 storage tree does not exist. Filter pass skipped.
Read nodes 0
Nodes left in the tree 0
Leaves of them 0, Twigs of them 0
Zeroed node pointers 1
Time interval: Sat Dec 3 20:40:41 2005 - Sat Dec 3 20:40:41 2005
CHECKING EXTENT REGIONS.
Read twigs 0
Time interval: Sat Dec 3 20:40:41 2005 - Sat Dec 3 20:40:41 2005
LOOKING FOR UNCONNECTED NODES
Read nodes 21809319
Good nodes 0
Leaves of them 0, Twigs of them 0
Time interval: Sat Dec 3 20:40:45 2005 - Sat Dec 3 21:07:09 2005
CHECKING EXTENT REGIONS.
Read twigs 0
Time interval: Sat Dec 3 21:07:09 2005 - Sat Dec 3 21:07:09 2005
INSERTING UNCONNECTED NODES
1. Twigs: done
2. Twigs by item: done
3. Leaves: done
4. Leaves by item: done
Twigs: read 0, inserted 0, by item 0, empty 0
Leaves: read 0, inserted 0, by item 0
Time interval: Sat Dec 3 21:07:13 2005 - Sat Dec 3 21:07:13 2005
Fatal: No reiser4 metadata were found. Semantic pass is skipped.
***** fsck.reiser4 finished at Sat Dec 3 21:07:13 2005
Closing fs...done
NO REISER4 METADATA WERE FOUND. FS RECOVERY IS NOT POSSIBLE. |
I still have one partition from the second mirror to try my last recovery attempt. Can anyone help? At this point I'd at least like to get my data out and move it temporarily so I can recreate this raid from scratch. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 9:04 am Post subject: |
|
|
I don't know where you got the f from but it wasn't this thread.
--force -f : Assemble the array even if some superblocks appear
: out-of-date. This involves modifying the superblocks.
The problems that you are seeing are kernel related. The kernel does all the work when starting the md and reading the superblock. Once the kernel is set straight it will work properly and the superblock will reappear. If you try and rebuild the superblock (as in modify it) it will not work. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 04, 2005 9:07 am Post subject: |
|
|
Code: | Dmesg is the output of the kernel; it's a command you can run and it shows you the most recent events. It's basically the same as the kernel logs you can find in /var/log/everything, etc. |
Well, what are the most recent events? |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 04, 2005 5:00 pm Post subject: |
|
|
It's not kernel related as I've had the same exact setup working before with the same kernel. But I'm not interested in making them work any more, how do I restore the super block and/or read data off of my one remaining drive? It won't mount the regular way. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Mon Dec 05, 2005 1:06 am Post subject: |
|
|
I always thought raid 1 would be just like a normal disk to mount. You might want to make a test raid1, write something on it, do to it what you would do to the other one, and see if it still works.
I take no responsibility for anything that comes from any of this:
http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-4.html (last updated March 2000; see man ckraid)
http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html (more recent, 2004) (says mdadm -A -f, which you've tried)
I would go w/ the mkraid raid1.conf -f --only-superblock before the ckraid --fix because it seems like the superblock is the issue.
(man raidreconf) You want superblock not the old style.
raidreconf -n newraidtab -m /dev/md? -e /dev/sd??
Code: |
mdadm [mode] <raiddevice> [options] <component-devices>
--help -h : General help message or, after above option, mode specific help message
--help-options : This help message
--version -V : Print version information for mdadm
--verbose -v : Be more verbose about what is happening
--brief -b : Be less verbose, more brief
--force -f : Override normal checks and be more forceful
--detail -D : Display details of an array
--examine -E : Examine superblock on an array component
--monitor -F : monitor (follow) some arrays
--query -Q : Display general information about how a device relates to the md driver
--assemble -A
--uuid= -u : uuid of array to assemble. Devices which don't have this uuid are excluded
--super-minor= -m : minor number to look for in super-block when choosing devices to use.
--config= -c : config file
--scan -s : scan config file for missing information
--run -R : Try to start the array even if not enough devices for a full array are present
--force -f : Assemble the array even if some superblocks appear out-of-date. This involves modifying the superblocks.
--update= -U : Update superblock: one of sparc2.2, super-minor or summaries
--create -C Create a new array
--chunk= -c : chunk size of kibibytes
--rounding= : rounding factor for linear array (==chunk size)
--level= -l : raid level: 0,1,4,5,6,linear,multipath and synonyms
--parity= -p : raid5/6 parity algorithm: {left,right}-{,a}symmetric
--layout= : same as --parity
--size= -z : Size (in K) of each drive in RAID1/4/5/6/10 - optional
--force -f : Honour devices as listed on command line. Don't insert a missing drive for RAID5.
--run -R : insist of running the array even if not all devices are present or some look odd.
--readonly -o : start the array readonly - not supported yet.
--raid-devices= -n : number of active devices in array
--spare-devices= -x : number of spares (eXtras) devices in initial array
--build -B Build a legacy array
--chunk= -c : chunk size of kibibytes
--rounding= : rounding factor for linear array (==chunk size)
--level= -l : 0, raid0, or linear
--raid-devices= -n : number of active devices in array
--misc
--query -Q : Display general information about how a device relates to the md driver
--detail -D : Display details of an array
--examine -E : Examine superblock on an array component
--zero-superblock : erase the MD superblock from a device.
--run -R : start a partially built array
--stop -S : deactivate array, releasing all resources
--readonly -o : mark array as readonly
--readwrite -w : mark array as readwrite
--test -t : exit status 0 if ok, 1 if degrade, 2 if dead, 4 if missing
--manage
--add -a : hotadd subsequent devices to the array
--remove -r : remove subsequent devices, which must not be active
--fail -f : mark subsequent devices a faulty
--set-faulty : same as --fail
--run -R : start a partially built array
--stop -S : deactivate array, releasing all resources
--readonly -o : mark array as readonly
--readwrite -w : mark array as readwrite
|
Make raids like this: # mdadm -C /dev/md2 -l 1 -n 2 -v /dev/sda2 /dev/sdb2
The superblock has the info about the disk. All mdadm.conf needs is the list of disks to be searched for superblocks. |
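As a rough illustration of that point, a config along these lines names the array by its UUID and leaves the level and device count to the superblocks. This is only a sketch: the helper name is made up, the UUID is the one from the --examine output earlier in the thread, and it prints to stdout rather than touching /etc/mdadm.conf.

```shell
# Hypothetical helper: print a minimal mdadm.conf that identifies an array
# by UUID. $1 = DEVICE pattern, $2 = md device, $3 = array UUID.
minimal_mdadm_conf() {
    printf 'DEVICE %s\n' "$1"
    printf 'ARRAY %s UUID=%s\n' "$2" "$3"
}

# The pattern is quoted so the shell does not expand it as a glob.
minimal_mdadm_conf '/dev/sd[bc]1' /dev/md1 494993cb:057174cb:381b694e:5486ff19
```

mdadm --examine --scan prints ARRAY lines in a similar form from whatever superblocks it finds, which avoids typing the UUID by hand.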
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 11, 2005 7:28 am Post subject: |
|
|
Thanks for the reply, I'm going to attempt to fix it again this time hopefully without further breaking it.
I'm using dd right now to restore the drive I messed up a bit more by using fsck. Once I have perfect duplicates I will attempt to recreate the superblocks. However, I've been reading that --only-superblock is deprecated and no longer an option. Do you have any other ideas I should try since that's no longer available? Those links are great and I will look into them more; however, one of the things they mention, ckraid, is very old and has been deprecated because its functions are now part of the kernel raid support. |
|
augury l33t
Joined: 22 May 2004 Posts: 722 Location: philadelphia
|
Posted: Sun Dec 11, 2005 7:58 am Post subject: |
|
|
I don't know what's making the disk appear as a raid0. I assume that at one point it was a raid1, and may still be. I have no idea whether this will corrupt data or not, but just remaking the disk the way it was made originally should correct the problem. If you didn't do anything to the disk or alter config files, I would say that you have a bad kernel driver and the only thing you can do is get a working kernel. Just for the heck of it, to verify the kernel, if you haven't already: make a raid1 disk, mount it, reboot and mount it again. |
|
1U Guru
Joined: 21 Jul 2005 Posts: 319
|
Posted: Sun Dec 11, 2005 4:12 pm Post subject: |
|
|
It's a bit too late for that though. As I've previously stated, I zeroed the superblock. How would I go about rebuilding it? |
|
|