ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 6:59 pm Post subject: |
|
|
Hi Neddy,
both lines 5 and 197 still show 0 for the raw value, so I'll assume there were no write errors (I added the suspect drive to the array last, so it was written). (Incidentally, resync sped up considerably when I stopped boinc).
Okay, so I'll try tinderboxing (?) portage, then python, and see what that does. So, just unpack them from root?
If that fails, is "reinstall" something less drastic than "chroot in from a liveCD and start over from scratch"?
Cheers,
EE
UPDATE: for some reason, trying to reply boots me to the main index, so I'll reply here. Uh, progress! On a lark, I guessed that 'install' - as part of coreutils - might be broken. I used the tinderbox version, and now I've emerged portage to its latest version. I'll try to sync and see what happens.
Last edited by ExecutorElassus on Fri Apr 13, 2012 7:23 pm; edited 1 time in total |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 7:03 pm Post subject: |
|
|
ExecutorElassus,
Less haste. Get the right portage for you and unpack it to the root of your filesystem.
I posted the details earlier. You must use the p option to tar or it still won't work, as it will be unpacked with -x in the permissions. That means it won't eXecute, not even for root.
Then test - see what has changed if anything. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
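A minimal sketch of the point about the p option, run in a throwaway directory rather than on the root filesystem. The names `tool` and `pkg.tar` are made up for illustration, an uncompressed archive stands in for the .tbz2, and `stat -c` assumes GNU coreutils. With a restrictive umask an ordinary user can lose permission bits on extraction; with -p tar restores the modes stored in the archive.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir "$work/src" "$work/out"

# A stand-in "binary": a script that must stay executable for everyone.
printf '#!/bin/sh\necho ok\n' > "$work/src/tool"
chmod 0755 "$work/src/tool"
tar -cf "$work/pkg.tar" -C "$work/src" tool

# A restrictive umask makes the effect of -p visible.
umask 077
tar -xpf "$work/pkg.tar" -C "$work/out"

# Show the mode the extracted file ended up with.
mode=$(stat -c '%a' "$work/out/tool")
echo "mode after tar -xpf: $mode"
```

Dropping the p from the extract line and running as a normal user should show the umask being applied instead.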
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 7:37 pm Post subject: |
|
|
okay. I managed to emerge portage successfully, and then tried to re-emerge coreutils. That failed on a broken /usr/include/mntent.h, so now I'm emerging glibc. After that I'll try coreutils again.
It seems the re-syncing (or rather, several iterations of it, along with bad journal/fsck management on my part - "sure, just auto-fix everything!") has left some files corrupted. But if I can run emerge, and then rebuild the toolchain, I can start getting things put back together.
I'll keep you posted.
Thanks again,
EE
UPDATE: okay, glibc won't install due to a broken /usr/include/mntent.h, which belongs to linux-headers. I can't emerge linux-headers due to a broken file belonging to glibc. Using tinderbox files for both of those results in the following error:
Code: | # emerge -p glibc
/usr/bin/python2.7: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/lib64/libpython2.7.so.1.0)
domo-kun / # emerge -p linux-hearders
/usr/bin/python2.7: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/lib64/libpython2.7.so.1.0)
|
So, what's my next step? I'm assuming reinstall. Can I do that without wiping everything? I have a system rescue CD I could use to re-install stuff. |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 8:25 pm Post subject: |
|
|
ExecutorElassus
Can you do
Code: | cd /usr/portage
scripts/bootstrap.sh |
That's a stage1 from a stage1 install. It builds your toolchain. Do not interrupt it - it must be run at one sitting as it's not resumable. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 8:28 pm Post subject: |
|
|
Hrm: Apparently not:
Code: | # scripts/bootstrap.sh
/bin/bash: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /bin/bash)
/bin/bash: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /lib64/libreadline.so.6)
/bin/bash: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /lib64/libncurses.so.5)
| unless there's a tinderbox version of glibc-2.14 somewhere?
Cheers,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 8:29 pm Post subject: |
|
|
ExecutorElassus,
What version of glibc did you have?
and what version do you have now?
glibc must not be downgraded. I guess you used to have glibc-2.14 and have a lower version now? _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
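The "GLIBC_2.14 not found" errors above are what a downgrade looks like from the outside: binaries linked against 2.14 ask for symbol versions that a 2.13 libc does not provide. A rough sketch of checking this before unpacking a glibc package follows. `version_ge` is a helper name invented here, `sort -V` is a GNU coreutils extension, and `getconf GNU_LIBC_VERSION` only answers on glibc systems.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is the same as or newer than B.
# Relies on `sort -V` (version sort), a GNU coreutils extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

needed=2.14    # taken from the "GLIBC_2.14 not found" messages
# getconf reports e.g. "glibc 2.13" on glibc systems; empty elsewhere.
have=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')

if [ -n "$have" ] && version_ge "$have" "$needed"; then
    echo "glibc $have satisfies GLIBC_$needed"
else
    echo "glibc ${have:-unknown} is too old for binaries needing GLIBC_$needed"
fi
```

The same helper can compare the version in a package file name against the installed one, so a 2.13 tarball is never unpacked over a 2.14 system.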
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 8:34 pm Post subject: |
|
|
I had glibc-2.14.1-r2, but the tinderbox version was glibc-2.13-r4 (and is thus my current version).
Sigh. So, if glibc can't be downgraded, what's my next step?
Cheers,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 8:36 pm Post subject: |
|
|
ExecutorElassus,
I have sys-libs/glibc-2.14.1-r2 so I can post a tarball. It will be similar to what you would get from the tinderbox except it's optimised for an AMD Phenom II 1090.
I will have it for other 64 bit AMD arches too, like an E350 and whatever is in the HP Microserver.
If you prefer, I can tell you how to make your own packages. You will need 10G or so of space for this. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 8:42 pm Post subject: |
|
|
Hi Neddy,
I'm going to go with the tarball option, because 1) I'm not sure I have 10GB free space to build without moving things around, and 2) I'm not certain of my toolchain's integrity.
would you mind posting your version? I might very well have a similar CPU, but right now, I'm mainly just worried about getting to a point where I can build my own toolchain (which apparently will need working python, linux-headers, coreutils, glibc, gcc, and probably rebuilding the kernel for good measure).
Thanks for the help.
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 8:54 pm Post subject: |
|
|
ExecutorElassus,
Here's my glibc-2.14.1-r2.
You don't use your toolchain to make your own packages.
Long story short ... make an ext2 fs in a file ... about 10G
Loopback mount the file on /mnt/gentoo. Put a stage3 and portage snapshot in there
chroot into the new install in a file. Set FEATURES to include buildpkg, emerge --sync, emerge whatever you need.
From outside the chroot in a file copy the packages you want out of /mnt/gentoo/usr/portage/packages/...
Install them as anything else you fetch from the tinderbox.
rm the install in a file when you are done. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
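The "install in a file" steps above can be sketched as a dry run. Nothing is executed unless DRY_RUN=0 is set, and then it needs root. The image path, the stage3 and snapshot file names, and the `run`/`plan` helpers are all assumptions for illustration, not something from the thread; adjust them to whatever you actually downloaded.

```shell
#!/bin/sh
# Dry run by default: commands are only printed.
# Set DRY_RUN=0 to really execute them (as root).
set -eu
DRY_RUN=${DRY_RUN:-1}
IMG=${IMG:-/var/tmp/gentoo-build.img}        # hypothetical image path
MNT=${MNT:-/mnt/gentoo}
STAGE3=${STAGE3:-stage3.tar.bz2}             # hypothetical file names
SNAPSHOT=${SNAPSHOT:-portage-latest.tar.bz2}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

plan() {
    run dd if=/dev/zero of="$IMG" bs=1M count=10240   # about 10G
    run mkfs.ext2 -F "$IMG"                           # -F: target is a plain file
    run mkdir -p "$MNT"
    run mount -o loop "$IMG" "$MNT"
    run tar -xpf "$STAGE3" -C "$MNT"
    run tar -xpf "$SNAPSHOT" -C "$MNT/usr"
    run cp -L /etc/resolv.conf "$MNT/etc/resolv.conf"
    run mount -t proc proc "$MNT/proc"
    run mount --rbind /dev "$MNT/dev"
    # Inside the chroot: add buildpkg to FEATURES in make.conf, then
    # emerge --sync and emerge what you need. From outside, the built
    # packages appear under $MNT/usr/portage/packages/.
    run chroot "$MNT" /bin/bash
}

plan_output=$(plan)
printf '%s\n' "$plan_output"
```

Reading the printed plan first is the point of the dry run: dd and mkfs on the wrong path are not forgiving.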
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 8:59 pm Post subject: |
|
|
So, now:
Code: | # tar xpf glibc-2.14.1-r2.tbz2
tar: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by tar)
| busybox also fails with the same error.
So, LiveCD?
* sadpanda * |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 9:14 pm Post subject: |
|
|
ExecutorElassus,
Yes. liveCD. This is why you build busybox with the static USE flag. So you still have something when glibc gets trashed.
You can no longer chroot into your install as bash won't run.
It's cd /mnt/gentoo, then tar ... _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 9:18 pm Post subject: |
|
|
Okay, just to be clear: since I have partitions for /usr, /var, etc, I should mount them before I start untarring things, yes? Am I going to have to recreate all the device nodes and VGs first? How close to "from scratch" do I have to get?
Cheers,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 9:27 pm Post subject: |
|
|
ExecutorElassus,
That's correct - you want all the component parts of the tarball to go into the right places, so your filesystem needs to be assembled.
You will be safer with the command Code: | tar -xpf /path/to/tarball -C /mnt/gentoo | too as it does not depend on using the Present Working Dir. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
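To illustrate why -C is the safer form, here is a small sketch in scratch directories. The directory names (`pkg`, `target`, `elsewhere`) are invented for the demo and an uncompressed archive is used instead of a .tbz2. tar is started from an unrelated directory and the files still land under the -C target.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir -p "$work/pkg/usr/bin" "$work/target" "$work/elsewhere"
echo demo > "$work/pkg/usr/bin/demo"
tar -cf "$work/pkg.tar" -C "$work/pkg" usr

# Run tar from an unrelated directory; -C decides where files land.
cd "$work/elsewhere"
tar -xpf "$work/pkg.tar" -C "$work/target"

ls "$work/target/usr/bin"
```

On the real system the target would be /mnt/gentoo with /usr, /var and friends already mounted beneath it.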
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 9:32 pm Post subject: |
|
|
Hi Neddy,
okay, so, I do this:
boot the LiveCD (in my case, SystemRescueCD 2.4.1), set up networking so I can ssh over from my laptop, and then … will the VGs already be mountable? Will I need to recreate all the device nodes and LVs? Or can I simply mount things that the CD will auto-detect?
ugh. I hate this. I'm sorry for all the trouble, and am really thankful you're walking me through this. I'll start the boot up with the liveCD, and get back to you.
Cheers,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 9:44 pm Post subject: |
|
|
ExecutorElassus,
SystemRescueCd will start your raid sets and activate your logical volumes. You should just need to do the mounts. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 10:02 pm Post subject: |
|
|
Okay, I got all the RAID sets mounted okay. Now, this:
Code: | % tar -xpf /mnt/gentoo/glibc-2.14.1-r2.tbz2 -C /mnt/gentoo
tar: This does not look like a tar archive
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
tar: Child returned status 2
tar: Error is not recoverable: exiting now
| Why does your tarball hate my happiness?
Is this something I haven't properly configured?
EDIT: whoops. I kinda copied the file over before it was completely finished downloading. Okay, I untarred glibc, and it didn't seem to puke (except for the "unexpected EOF" you said I can safely ignore).
So, do the same with the rest of the toolchain? I have the following from tinderbox:
Code: | % ls *.tbz2
coreutils-8.7.tbz2 glibc-2.14.1-r2.tbz2 portage-2.1.10.41.tbz2
glibc-2.13-r4.tbz2 linux-headers-2.6.39.tbz2 python-3.1.4-r3.tbz2 |
Missing are linux-headers-3.3, and a more recent portage. If glibc is functional, can I reboot and go back to emerging things? Or should I tinderbox more of the toolchain first?
Cheers,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Fri Apr 13, 2012 10:12 pm Post subject: |
|
|
ExecutorElassus,
Nope. You can either attempt the chroot, or reboot to test. Success with either means fixing glibc worked, since almost nothing works without glibc.
Then you test one file at a time and only replace what you need.
At this time of night, I would reboot or chroot, then run the bootstrap.sh script and see what happens.
Can you leave it building while you sleep?
"Unexpected EOF" ??? "Extra Garbage at End Ignored" is safe to ignore. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Fri Apr 13, 2012 10:16 pm Post subject: |
|
|
I can chroot into the system. I'll try running the bootstrap.sh, and report back tomorrow.
Thanks again for all the help. People like you are why gentoo is awesome.
Cheers,
EE
UPDATE: trying to run the bootstrap script from chroot results in
Code: | # scripts/bootstrap.sh
realpath: no command specified
Try `realpath --help' for more information.
* Error: '' does not exist. Exiting.
| So I tried rebooting. It's stuck sitting on usb and firewire discovery, so I'm not really sure what it's up to. If I can't get it to boot, I'll let you know. The last lines I see at startup are:
Code: | usb 5-1: new low-speed USB device number 2 using ohci_hcd
input: Logitech USB trackball as /devices/pci0000:00/0000:00:13.4/usb5/5-1/5-1:1.0/input/input4
generic-usb 0003:046D:C408.0003: input: USB HID v1.10 Mouse [Logitech USB Trackball] ib usb-0000:00:13.4-1/input0
firewire_core: giving up on config rom for node id ffc0 |
Any guess what's going on? Or should I just keep going with a reinstall? SystemRescueCD seems to get me into a state where I can run portage fairly quickly. Maybe I'll just do that. Or...?
UPDATE 2: Now I see what it was waiting on. Now I have a kernel panic: /dev/md3 is not recognized, and it's trying to find a boot sector on fd0, etc. Seems like the liveCD renamed my RAID arrays again.
From a booted system I get the same error about realpath as previously. So, should I just reinstall from the liveCD?
UPDATE 3: Well, although the bootstrap script won't work, I can emerge things in my toolchain. I'm working on coreutils now, after linux-headers and bash went in successfully. Should I just keep going with glibc, gentoolkit, gcc, etc, and manually emerge stuff until I have a working system?
AFTER-HOURS UPDATE: So, now I'm hanging on emerging xorg-server:
Code: | Making all in dix
make[1]: Entering directory `/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0_build/dix'
make all-am
make[2]: Entering directory `/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0_build/dix'
CC atom.lo
CC colormap.lo
CC cursor.lo
CC devices.lo
CC dispatch.lo
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/atom.c: In function 'MakeAtom':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/atom.c:134:7: warning: cast discards qualifiers from pointer target type
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/atom.c: In function 'FreeAtom':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/atom.c:186:2: warning: cast discards qualifiers from pointer target type
CC dixfonts.lo
In file included from /var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:55:0,
from /var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:66:
/usr/include/X11/extensions/XKBproto.h:491:1: error: expected identifier or '(' before '}' token
In file included from /var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:73:0:
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/dixevents.h:84:52: warning: redundant redeclaration of 'PostSyntheticMotion'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/input.h:525:13: note: previous declaration of 'PostSyntheticMotion' was here
In file included from /var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:83:0:
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/Xi/exglobals.h:62:12: warning: redundant redeclaration of 'DeviceKeyPress'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:309:51: note: previous declaration of 'DeviceKeyPress' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/Xi/exglobals.h:63:12: warning: redundant redeclaration of 'DeviceKeyRelease'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:309:66: note: previous declaration of 'DeviceKeyRelease' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/Xi/exglobals.h:64:12: warning: redundant redeclaration of 'DeviceButtonPress'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:310:51: note: previous declaration of 'DeviceButtonPress' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/Xi/exglobals.h:65:12: warning: redundant redeclaration of 'DeviceButtonRelease'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:310:69: note: previous declaration of 'DeviceButtonRelease' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/Xi/exglobals.h:66:12: warning: redundant redeclaration of 'DeviceMotionNotify'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/xkbsrv.h:309:83: note: previous declaration of 'DeviceMotionNotify' was here
In file included from /var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:87:0:
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/enterleave.h:42:13: warning: redundant redeclaration of 'DoFocusEvents'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/dix.h:451:13: note: previous declaration of 'DoFocusEvents' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/enterleave.h:87:13: warning: redundant redeclaration of 'DeviceFocusEvent'
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/include/exevents.h:184:1: note: previous declaration of 'DeviceFocusEvent' was here
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'SendDevicePresenceEvent':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:324:43: warning: declaration of 'type' shadows a global declaration
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'FreeDeviceClass':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:732:21: warning: declaration of 'type' shadows a global declaration
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'FreeFeedbackClass':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:798:23: warning: declaration of 'type' shadows a global declaration
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'BadDeviceMap':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:1637:30: warning: declaration of 'length' shadows a global declaration
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'GetMaster':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:2610:33: warning: declaration of 'which' shadows a global declaration
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c: In function 'AllocDevicePair':
/var/tmp/portage/x11-base/xorg-server-1.12.0-r1/work/xorg-server-1.12.0/dix/devices.c:2653:18: warning: declaration of 'pointer' shadows a global declaration
make[2]: *** [devices.lo] Error 1
make[2]: *** Waiting for unfinished jobs....
| etc etc. I'm assuming a header file or dependency is corrupt. Any guess which? |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Sat Apr 14, 2012 3:27 pm Post subject: |
|
|
Hi Neddy,
okay, I guess we're now past the "how do I get the drives working again?" stage, and on to the "how do I remerge everything?" stage. Today's error I can't figure out is from e2fsprogs:
Code: | make[2]: Entering directory `/var/tmp/portage/sys-fs/e2fsprogs-1.42.1/work/e2fsprogs-1.42.1/debugfs'
MK_CMDS debug_cmds.c
CC util.c
CC debugfs.c
CC ncheck.c
CC icheck.c
make[2]: execvp: mk_cmds: Permission denied
make[2]: *** [debug_cmds.c] Error 127
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory `/var/tmp/portage/sys-fs/e2fsprogs-1.42.1/work/e2fsprogs-1.42.1/debugfs'
make[1]: *** [all-progs-recursive] Error 1
make[1]: Leaving directory `/var/tmp/portage/sys-fs/e2fsprogs-1.42.1/work/e2fsprogs-1.42.1'
make: *** [all] Error 2
emake failed
* ERROR: sys-fs/e2fsprogs-1.42.1 failed (compile phase):
| so, execvp is trying to write something, and failing on permissions. But what?
I've managed to get perl, gcc, glibc, glib, cairo, python, portage, and a good chunk of @system to emerge okay. Now I'm stuck on this.
Any guess what it might be?
Cheers,
Andrew
UPDATE: okay, nevermind that. I figured out the offending package (e2fsprogs-libs), remerged it, and then resumed emerging @system. Assuming that works out, what next? I can't do a full @world emerge, because I have udev-182 blocked (and more things are starting to depend on it). So, kernel, then xorg, nvidia, and so forth? And what's going on with the qt packages? I got a whole pile of blocked packages when I tried to emerge qt. Is there a way to track down which program is forcing an old qt to be installed? |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Sat Apr 14, 2012 4:55 pm Post subject: |
|
|
Okay, maybe I still have drive problems.
I just rebooted, and - yet again - one of the members of md127 was put into its own array, and md127 itself was set inactive, with both of its member drives marked as spares. What's going on with that? I don't find any errors with dmesg, or with 'mdadm -E /dev/sdX4', so I'm not really sure why mdadm keeps dropping the drives out. Can you give any advice?
Thanks,
EE |
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Sat Apr 14, 2012 5:49 pm Post subject: |
|
|
ExecutorElassus,
Look at dmesg and the event count on each member of your raid set.
If your raid set assembled in degraded mode (only n-1 drives), it would not run unless you forced it to run.
You would remember doing Code: | mdadm --run /dev/md... |
If a drive dropped out later, it would be in dmesg.
On a healthy raid the event count is identical on all members. If you can find n-1 drives with an identical event count, it's probably OK to assemble the raid with only those drives, then run it manually.
If you still have hardware issues there is no point in doing any more rebuilding of software. It will just break again. You can take the suspect drive out of the array and run it in degraded mode for a while.
It's probably worth trying to read the drive to /dev/null and watching dmesg for errors
Code: | dd if=/dev/sdX of=/dev/null bs=4096000 | The large bs= (about 4MB) speeds up the process. It will take several hours.
Read the dd documentation to see how to get a progress report from dd
edit: Looking back, fixing a broken glibc is one of the hardest gentoo fixes. That's behind you now. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
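A small sketch of the read test, wrapped so that dd's exit status and statistics are kept. `read_test` is a helper name invented here, and the demonstration reads a scratch file rather than a real /dev/sdX so it is harmless to try. The progress report relies on GNU dd printing its statistics when sent SIGUSR1; other dd implementations may behave differently.

```shell
#!/bin/sh
# read_test DEV LOG: read DEV to /dev/null, keeping dd's statistics in LOG.
# Returns dd's exit status, which is non-zero on a read error.
read_test() {
    dd if="$1" of=/dev/null bs=4096000 2>"$2"
}

# Demonstration on a small scratch file instead of a real /dev/sdX.
scratch=$(mktemp)
log=$(mktemp)
dd if=/dev/zero of="$scratch" bs=1024 count=2048 2>/dev/null

if read_test "$scratch" "$log"; then
    echo "read completed without errors"
else
    echo "read failed, check dmesg"
fi
tail -n 1 "$log"

# For a long run on a real drive, GNU dd prints its current statistics
# when it receives SIGUSR1, e.g. from a second terminal:
#   kill -USR1 "$(pidof dd)"
```

On the real drive the call would be `read_test /dev/sdX /tmp/dd.log`, with dmesg open in another terminal.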
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Sat Apr 14, 2012 6:01 pm Post subject: |
|
|
Hi Neddy,
the two drives that were in an inactive array - and marked as spares - had the same event count. The one that got dropped out had six fewer.
So, I'll try running dd on the array, once it's built in about six hours. Is it possible that the wonky role numbers for the drives (sda4[0] sdc4[3] sdb4[2], whereas the other two arrays are respectively [0] [1] [2]) is causing mdadm to assume that a drive in between is missing, and that sda4 (which was not in the array at startup) did not belong (as sdc4 had a role number of [3], already beyond the drive count)? Is there any way to fix that on the fly?
Cheers,
EE |
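For comparing event counts across members without reading three full `mdadm -E` dumps by eye, something like the following might help. `events_of` and `compare_members` are names made up for this sketch, the awk pattern assumes the "Events : N" line that mdadm -E prints, and the sample value 4242 is invented purely to check the parsing.

```shell
#!/bin/sh
# events_of: print the Events counter found in `mdadm -E` output on stdin.
events_of() {
    awk -F: '/^ *Events/ { gsub(/ /, "", $2); print $2; exit }'
}

# compare_members DEV...: print "device events" for each member (needs root).
compare_members() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(mdadm -E "$dev" | events_of)"
    done
}

# Parsing check with a made-up line in the shape mdadm prints:
sample_events=$(printf '    Update Time : (elided)\n         Events : 4242\n' | events_of)
echo "$sample_events"
```

As root, `compare_members /dev/sda4 /dev/sdb4 /dev/sdc4` would then list the three counts side by side, so the member that is six behind stands out.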
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54242 Location: 56N 3W
|
Posted: Sat Apr 14, 2012 6:02 pm Post subject: |
|
|
ExecutorElassus,
Put Code: | >=sys-fs/udev-180
>=sys-fs/udev-init-scripts-10
>=sys-auth/consolekit-0.4.5-r3
>=sys-apps/openrc-0.9.9.3
=sys-apps/net-tools-1.60_p20120127084908
| into /etc/portage/package.mask to keep udev at bay meanwhile. You may find you need to add other things too. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
|
|
ExecutorElassus Veteran
Joined: 11 Mar 2004 Posts: 1435 Location: Berlin, Germany
|
Posted: Sat Apr 14, 2012 6:30 pm Post subject: |
|
|
Hi Neddy,
I'll add in those packages to package.mask once I can boot up with RAID working (right now, I don't even have access to nano, much less the files I need to edit). The only thing I can see at the end of dmesg is "mdadm: sending ioctl 1261 to a partition!" which another forum told me was a kernel error I can ignore.
If there's nothing in dmesg, can you think of any reason my RAID would be dropping drives out of the array? It's a different drive from the one that was suspect last time; I'd find it really hard to believe that two drives out of three failed.
Cheers,
EE |
|