Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Thu Mar 06, 2003 9:23 am Post subject: ext2, ext3, xfs corruption - hardware related??? |
|
|
I have a brand-new system here:
* MSI K7D Master-L (BIOS v1.5)
* 2x AMD Athlon MP 2000+
* 2x 512MB hynix PC2100 CL2,5 ECC
* 2x Promise Ultra 133 TX2 (PDC20269 with latest BIOS)
* 4x Maxtor 6E030L0 Ultra DMA 133 7200 RPM - 30GB
* Enermax PSU EG-651P-VE (550W)
* Nvidia Riva-TNT2 M64 32MB
I've had no problems installing Gentoo 1.2 and 1.4_rc3 on LVM on software RAID5 (3 RAID disks, 1 spare). But when I create an ext2 or ext3 filesystem larger than 30GB, e2fsck -f finds tens of thousands of errors even though the filesystem has never been mounted.
BTW, xfs behaves strangely too. Creating large files (>1GB) of random data and checking them with md5sum causes the system to fail. After that, many executables are no longer found, even if they are on other partitions. After a restart (by powering off, since the reboot command itself could no longer be found) and a RAID resync everything is fine again. The same happens with ext2 and ext3.
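The large-file test described above can be sketched roughly as follows. This is only an illustration, not the poster's exact commands: the function name check_random_file is made up, and it reads from /dev/urandom rather than /dev/random (an assumption, chosen so the write does not stall waiting for entropy).

```shell
# Write SIZE_MB of pseudo-random data to FILE, then hash the file twice.
# The file is not modified between the two reads, so differing hashes
# suggest corruption somewhere on the read path (cable, controller, RAM, ...).
# check_random_file is a hypothetical helper name, not an existing tool.
check_random_file() {
    file=$1
    size_mb=$2
    # /dev/urandom is an assumption; the thread mentions /dev/random.
    dd if=/dev/urandom of="$file" bs=1M count="$size_mb" 2>/dev/null || return 2
    sync
    first=$(md5sum < "$file" | cut -d' ' -f1)
    second=$(md5sum < "$file" | cut -d' ' -f1)
    # Return 0 when both hashes agree, non-zero otherwise.
    [ "$first" = "$second" ]
}
```

For example, `check_random_file /mnt/test/big.bin 2048 || echo MISMATCH` (the path is a placeholder). The second read is only meaningful if it really comes from disk, so the file should probably be larger than the installed RAM (1GB on this box); otherwise it may be served from the page cache.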
Another command that kills the system in the same way is 'tar -cWvplf' on large trees like portage or src.
What I did to find the error:
* HD check with Powermax from Maxtor
* changed UDMA 133 cables
* tried with only one DIMM
* changed CPUs
* tried with each CPU alone
* tried another UDMA controller (CMD649) and the onboard IDE controller instead of the Promise controllers
* tried with UDMA5 (-X69) instead of UDMA6
* tried kernels: gentoo, 2.4.21pre5-ac1, redhat, 2.4.19 ....
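A note on the "-X69" in the list above: hdparm's -X option takes a number, and UltraDMA mode n is selected with 64 + n, so 69 means UDMA5 and 70 would be UDMA6. A tiny sketch of that arithmetic (the helper name udma_to_xfermode is made up for illustration):

```shell
# hdparm -X encodes UltraDMA mode n as 64 + n.
# udma_to_xfermode is a hypothetical helper, not part of hdparm.
udma_to_xfermode() {
    echo $((64 + $1))
}
```

So forcing UDMA5 would look something like `hdparm -X$(udma_to_xfermode 5) /dev/hda`, where /dev/hda is only a placeholder for the actual drive.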
memtest86 has now been running the extended tests for 12 hours with no error messages.
I do not know what to try next!!!
Could anybody give me a hint, please? |
|
|
|
bLanark Apprentice
Joined: 27 Aug 2002 Posts: 181 Location: Royal Berkshire, UK
|
Posted: Thu Mar 06, 2003 10:19 am Post subject: Probably hardware |
|
|
I suspect hardware here.
Is there any chance that you can take the drives out and put them in a Windows machine for some non-destructive testing using the utilities on the IBM or Maxtor site? Or if your BIOS has a S.M.A.R.T. test option (possibly on the CD that came with it?) then run that.
I had a laptop HDD go and the symptoms were the same - repeated fscks for no reason. I had a server IDE drive go but the symptoms were entirely different - any process accessing the drive would just hang.
Of course, it might not be hardware. Can you remove LVM from the equation and create partitions on each drive in turn?
BTW, I use LVM with ReiserFS with a partition greater than 100G without problems. _________________ .sig: access denied |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Thu Mar 06, 2003 10:23 am Post subject: |
|
|
I ran all the Powermax tests from Maxtor's website (low-level format included) - no errors!! |
|
|
|
bLanark Apprentice
Joined: 27 Aug 2002 Posts: 181 Location: Royal Berkshire, UK
|
Posted: Thu Mar 06, 2003 10:57 am Post subject: Sorry |
|
|
Quote: | I ran all the Powermax tests from Maxtor's website (low-level format included) - no errors!! |
Hmm, yes, missed that at first.
OK, the drives are 30 GB and the errors only occur when the partitions are bigger than 30 GB - right?
Anything in the syslog? Can you turn up the logging level for the LVM stuff? I can't see how to, but it must be possible, certainly most of the utilities (e.g. vgdisplay) have a -v or -vv option.
You might want to try evms instead of lvm, if you're at a stage where you can afford to start again.
Ally _________________ .sig: access denied |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Thu Mar 06, 2003 11:27 am Post subject: |
|
|
My first attempt at installing Gentoo 1.2 was directly on RAID1 and RAID5 partitions, so I don't think LVM is the culprit.
On partitions up to 25GB (I don't know the exact limit) e2fsck -f works without problems after creating the filesystem. But the strange errors described above (tar, md5sum) occur on those partitions too.
On partitions of about 50GB (raid5) e2fsck -f finds 10000s of errors like:
"Inode XXX is in use, but has dtime set", "Inode XXX has compression flag set on filesystem without compression support", "Inode XXX has illegal block(s)", "Too many illegal blocks in inode XXX"
even though the filesystem was never mounted.
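With tens of thousands of messages like these, it can help to see which kinds dominate. A rough sketch, assuming the e2fsck output is piped in or was saved to a file; the function name summarize_fsck_log is made up:

```shell
# Replace every number (inode and block numbers) with "N" so that messages
# of the same kind become identical, then count each kind, most frequent first.
# summarize_fsck_log is a hypothetical helper name.
summarize_fsck_log() {
    sed -E 's/[0-9]+/N/g' | sort | uniq -c | sort -rn
}
```

Something like `e2fsck -fn DEVICE 2>&1 | summarize_fsck_log` would then give one line per message type (DEVICE is a placeholder; -n makes e2fsck open the filesystem read-only and answer "no" to every question).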
I can't try anything at the moment because memtest86 v3.0 is running (test 11) and I'm looking forward to the results --> probably no errors, I bet! |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Thu Mar 06, 2003 11:28 am Post subject: |
|
|
hardware.. or perhaps KERNEL _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Thu Mar 06, 2003 11:40 am Post subject: |
|
|
thx a lot taskara for your v e r y helpful posting!
What kind of hardware?
Do you have a kernel config for me - or what do you mean by KERNEL? |
|
|
|
bLanark Apprentice
Joined: 27 Aug 2002 Posts: 181 Location: Royal Berkshire, UK
|
Posted: Thu Mar 06, 2003 2:31 pm Post subject: |
|
|
OK, for what it's worth, I've got this version of LVM, and this kernel:
Code: |
altair root # vgcreate --version
vgcreate: Logical Volume Manager 1.0.5
Heinz Mauelshagen, Sistina Software 15/07/2002 (IOP 10)
|
Code: |
altair root # uname -a
Linux altair 2.4.19-gentoo-r9
|
I am NOT currently spanning more than one HDD: I have a single 120 GB drive, one volume group and one physical volume, if I remember the terminology correctly.
The other drive I was using is dodgy and is back with the vendor for "testing". I am using reiserFS, not xfs or ext2 or ext3.
I'm NOT using RAID either.
Sorry I can't be of more help.
Oh, this machine is running Gentoo 1.2, not 1.4, i.e.:
Code: |
gcc version 2.95.3 20010315 (release)
|
_________________ .sig: access denied |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Thu Mar 06, 2003 8:45 pm Post subject: |
|
|
Ace_of_Spades wrote: | thx a lot taskara for your v e r y helpful posting!
What kind of hardware?
Do you have a kernel config for me - or what do you mean by KERNEL? |
I'm just saying that maybe the kernel you are using is corrupting the filesystems.
try a vanilla kernel, or beta kernel..
is it happening when you INSTALL gentoo.. or AFTER you have made your system and put in your own kernel? _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Thu Mar 06, 2003 8:52 pm Post subject: |
|
|
thx, but as mentioned in my first post I tried nearly every 2.4 kernel on the market.
memtest86 v3.0 has finished all tests (extended included) without errors! |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Fri Mar 07, 2003 12:03 am Post subject: |
|
|
try a new hard drive
try a different ide controller
upgrade your power supply
get a shotgun _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Fri Mar 07, 2003 8:31 am Post subject: |
|
|
I did another install last night which seems to work without problems.
Two things were changed:
* partition tables were made with cfdisk instead of fdisk
* 2 60cm (24") UDMA133 cables were replaced by 2 45cm (18") shielded ones
The system now runs gentoo 1.4_rc3 with 2.4.20-gentoo-r1 on ext3 on lvm on raid5 without any errors.
Special thanks to my little pink dancing elephant who put me back on the right way!
Finally - can anybody explain to me which of the above changes did the trick? (If cfdisk is not just a frontend to fdisk, could it be that fdisk is buggy?) |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Fri Mar 07, 2003 12:00 pm Post subject: |
|
|
I would say it was the cables.. fdisk has nothing to do with your filesystem.. so it doesn't seem like the culprit.
long IDE cables are notorious for problems.. especially if they aren't shielded.
then again.. maybe there was a new package released since you did your first install.. who knows!? At least it works! _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
klimg n00b
Joined: 21 Sep 2002 Posts: 55
|
Posted: Fri Mar 07, 2003 12:09 pm Post subject: |
|
|
I wouldn't bet that you are out of the woods. I always get I/O errors from the HD after 3-4 months with ext2/3 filesystems on a Samsung drive, which does point to a defective drive.
The first time that happened, I ran an HD test from Cerberus, which is supposed to destroy defective hardware, for a week flat. No problems. With reiserfs everything works fine. |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
Posted: Fri Mar 07, 2003 12:18 pm Post subject: |
|
|
Quote: | I always get I/O errors from the HD after 3-4 months with ext2/3 filesystems |
You are right - e2fsck reported errors again, but the other tests (tar; generating 4 files of 4GB each from /dev/random simultaneously and checking them with md5sum) work fine at the moment.
Think I will switch to xfs. |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Fri Mar 07, 2003 1:01 pm Post subject: |
|
|
I went to reiserfs.. it's nice _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
klimg n00b
Joined: 21 Sep 2002 Posts: 55
|
Posted: Fri Mar 07, 2003 3:54 pm Post subject: |
|
|
I am pretty sure in my case it's some issue with my cheapass everything-onboard mobo (I've seen some stuff about I/O errors with SiS chipsets on the kernel list). But reiser never gave me any trouble. |
|
|
|
taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Fri Mar 07, 2003 10:14 pm Post subject: |
|
|
OOOOHHHH you didn't mention you have an SIS chipset!!! lol _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
dweigert Guru
Joined: 04 Oct 2002 Posts: 369 Location: Somerset, NJ USA
|
Posted: Sun Mar 09, 2003 7:24 pm Post subject: |
|
|
The MAXIMUM length for an ide cable is 18 inches. Otherwise you do get corruption.
Dan |
|
|
|
Ace_of_Spades n00b
Joined: 06 Feb 2003 Posts: 15 Location: Eichgraben, AUSTRIA
|
|
|
|
ben_h Tux's lil' helper
Joined: 26 Nov 2002 Posts: 118 Location: Australia
|
Posted: Sat Mar 15, 2003 2:45 am Post subject: |
|
|
Jeez that's interesting. Glad you've got it solved!
Although, I think the PS/2 issue was only half your problem -- the other half was probably the IDE cables being too long.
In any case, enjoy your (now stable) dual MP box |
|
|
|
|