PietdeBoer Apprentice
Joined: 20 Oct 2005 Posts: 244 Location: Eindhoven, the Netherlands
|
Posted: Thu Feb 14, 2008 3:17 pm Post subject: hdd errors, faulty cable? [SOLVED] |
|
|
Hey guys,
my system crashes about once a week, giving messages like "this is NOT a software error"...
While the system is running I get this in my dmesg:
Code: |
0x400
ata5: CPB 0: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 1: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 2: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 3: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 4: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 5: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 6: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 7: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 8: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 9: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 10: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 11: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 12: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 13: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 14: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 15: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 16: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 17: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 18: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 19: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 20: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 21: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 22: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 23: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 24: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 25: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 26: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 27: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 28: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 29: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 30: ctl_flags 0x1f, resp_flags 0x1
ata5: Resetting port
ata5.00: exception Emask 0x10 SAct 0x3 SErr 0x19d0000 action 0x2 frozen
ata5.00: cmd 60/80:00:3f:01:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 65536 in
res 40/00:08:3f:02:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
ata5.00: cmd 60/80:08:3f:02:00/00:00:00:00:00/40 tag 1 cdb 0x0 data 65536 in
res 40/00:08:3f:02:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
ata8: Hotplug event, freezing
ata8: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x500
ata8: CPB 0: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 1: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 2: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 3: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 4: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 5: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 6: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 7: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 8: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 9: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 10: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 11: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 12: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 13: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 14: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 15: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 16: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 17: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 18: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 19: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 20: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 21: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 22: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 23: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 24: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 25: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 26: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 27: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 28: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 29: ctl_flags 0x1f, resp_flags 0x1
ata8: CPB 30: ctl_flags 0x1f, resp_flags 0x1
ata8: Resetting port
ata8.00: exception Emask 0x10 SAct 0x1 SErr 0x19d0000 action 0x2 frozen
ata8.00: cmd 60/00:00:3f:01:00/01:00:00:00:00/40 tag 0 cdb 0x0 data 131072 in
res 40/00:00:3f:01:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
ata5: soft resetting port
ata8: soft resetting port
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata5.00: configured for UDMA/133
ata5: EH complete
ata8.00: configured for UDMA/133
ata8: EH complete
SCSI device sdf: 976773168 512-byte hdwr sectors (500108 MB)
sdf: Write Protect is off
sdf: Mode Sense: 00 3a 00 00
SCSI device sdf: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sdi: 976773168 512-byte hdwr sectors (500108 MB)
sdi: Write Protect is off
sdi: Mode Sense: 00 3a 00 00
SCSI device sdi: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x19d0000 action 0x2 frozen
ata5.00: cmd 40/00:01:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 0
res 50/00:00:00:00:00/00:01:01:00:00/e0 Emask 0x10 (ATA bus error)
ata5: soft resetting port
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x80 on port 0xFFFFC2000003C49C
ATA: abnormal status 0x80 on port 0xFFFFC2000003C49C
ATA: abnormal status 0x80 on port 0xFFFFC2000003C49C
ATA: abnormal status 0x80 on port 0xFFFFC2000003C49C
ATA: abnormal status 0x80 on port 0xFFFFC2000003C49C
ata5.00: revalidation failed (errno=-2)
ata5: failed to recover some devices, retrying in 5 secs
ata5: hard resetting port
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata5.00: configured for UDMA/133
ata5: EH complete
SCSI device sdf: 976773168 512-byte hdwr sectors (500108 MB)
sdf: Write Protect is off
sdf: Mode Sense: 00 3a 00 00
SCSI device sdf: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x19d0000 action 0x2 frozen
ata8.00: cmd 40/00:01:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 0
res 50/00:00:00:00:00/00:01:01:00:00/e0 Emask 0x10 (ATA bus error)
ata8: soft resetting port
ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x80 on port 0xFFFFC2000003E59C
ATA: abnormal status 0x80 on port 0xFFFFC2000003E59C
ATA: abnormal status 0x80 on port 0xFFFFC2000003E59C
ATA: abnormal status 0x80 on port 0xFFFFC2000003E59C
ATA: abnormal status 0x80 on port 0xFFFFC2000003E59C
ata8.00: revalidation failed (errno=-2)
ata8: failed to recover some devices, retrying in 5 secs
ata8: hard resetting port
ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata8.00: configured for UDMA/133
ata8: EH complete
SCSI device sdi: 976773168 512-byte hdwr sectors (500108 MB)
sdi: Write Protect is off
sdi: Mode Sense: 00 3a 00 00
SCSI device sdi: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
|
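The `SErr 0x19d0000` value repeated in the log above is a bitmask, and decoding it shows which link-level conditions the controller flagged. The following is a minimal sketch, assuming the DIAG field layout of the SATA SError register (bits 16 to 26); the short labels and the `decode_serr` helper are illustrative names, not taken from this thread.

```python
# Sketch: decode the DIAG field (bits 16-26) of a SATA SError value.
# Labels are illustrative short names for each bit, assumed here.
SERR_DIAG_BITS = {
    16: "PHYRdyChg",   # PhyRdy signal changed state
    17: "PHYInt",      # PHY internal error
    18: "CommWake",    # COMWAKE detected
    19: "10B8B",       # 10b-to-8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognized FIS type
    26: "DevExch",     # device exchanged
}


def decode_serr(value):
    """Return the labels of all DIAG bits set in an SError value."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if value & (1 << bit)]


print(decode_serr(0x19d0000))
```

For the value in this log the set bits are 16, 18, 19, 20, 23 and 24, i.e. the link dropping and re-establishing plus decode and disparity errors. That points at the physical link rather than at the data on the disks, but it does not by itself tell a bad cable from a power problem.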
The crashes started occurring when I added 4 new SATA disks.
Output from SeaTools (all the ATA disks are Seagate):
Code: |
fileserver ~ # ./st -l
Drive information:
/dev/sg0 FUJITSU MAX3036NP HPF1 71132959 blocks
/dev/sg1 ATA ST3400620NS 3.AE 781422767 blocks
/dev/sg2 ATA ST3400620NS 3.AE 781422767 blocks
/dev/sg3 ATA ST3400620NS 3.AE 781422767 blocks
/dev/sg4 ATA ST3400620NS 3.AE 781422767 blocks
/dev/sg5 ATA ST3500630AS 3.AA 976773167 blocks
/dev/sg6 ATA ST3500630AS 3.AA 976773167 blocks
/dev/sg7 ATA ST3500630AS 3.AA 976773167 blocks
/dev/sg8 ATA ST3500630AS 3.AA 976773167 blocks
|
where ata5, 6, 7 and 8 are the new disks.
Since it only gives errors on ata5 and ata8, I suspect there's something wrong with the connectors on the motherboard or the SATA cables?
Does anyone have a good idea where this is coming from?
Cheers! _________________ _ Got Root? _
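To check which ports are actually affected, the exception lines can be tallied per port. This is a rough sketch; the `count_exceptions` helper and the shortened sample are assumptions for illustration, and the pattern only matches the `exception Emask` line format seen in the log above.

```python
import re
from collections import Counter

# Matches e.g. "ata5.00: exception Emask 0x10 ..." and captures "ata5".
# search() is used so a leading timestamp on the line does not matter.
PORT_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions(lines):
    """Count libata exception lines per ATA port."""
    counts = Counter()
    for line in lines:
        match = PORT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened sample taken from the dmesg output in this post.
sample = """\
ata5.00: exception Emask 0x10 SAct 0x3 SErr 0x19d0000 action 0x2 frozen
ata8.00: exception Emask 0x10 SAct 0x1 SErr 0x19d0000 action 0x2 frozen
ata5: EH complete
ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x19d0000 action 0x2 frozen
ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x19d0000 action 0x2 frozen
""".splitlines()

for port, hits in sorted(count_exceptions(sample).items()):
    print(port, hits)
```

On a live system the lines would come from saved `dmesg` output instead of the sample. Ports that never show up in the tally (ata6 and ata7 here) are a hint about which cables or connectors to compare against.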
Last edited by PietdeBoer on Fri Feb 15, 2008 7:27 pm; edited 1 time in total |
alex.blackbit Advocate
Joined: 26 Jul 2005 Posts: 2397
|
Posted: Thu Feb 14, 2008 3:59 pm Post subject: |
|
|
Since you get errors on more than one drive at a time, I guess it is not a cable.
Could it be a power problem? It seems you have a lot of devices in your system. Maybe the PSU is at its limit? |
sageman Guru
Joined: 04 May 2005 Posts: 363 Location: New Hampshire
|
Posted: Thu Feb 14, 2008 9:11 pm Post subject: |
|
|
alex.blackbit wrote: | Since you get errors on more than one drive at a time, I guess it is not a cable.
Could it be a power problem? It seems you have a lot of devices in your system. Maybe the PSU is at its limit? |
That's the first thing that jumped to my mind. What sort of wattage is your PSU? _________________ Carlton Stedman
Gentoo Metalheads on Last.fm: http://www.last.fm/group/Gentoo+Metalheads |
Cyker Veteran
Joined: 15 Jun 2006 Posts: 1746
|
Posted: Thu Feb 14, 2008 11:10 pm Post subject: |
|
|
It might also be them going into some sort of re-cal mode, or forcing a quick SMART check.
It turns out this is one of the main differences between consumer drives and 'enterprise' grade drives (Where they charge 50% more for what is really the same drive...).
I get these every now and then on my RAID array, as the disks are always active unlike the non-RAID'd drives on my system, but it only happens after a month or so of continuous uptime.
So far, the kernel's just reset the drive and then everything's carried on as normal. I've not even had to re-add it to the RAID array! |
PietdeBoer Apprentice
Joined: 20 Oct 2005 Posts: 244 Location: Eindhoven, the Netherlands
|
Posted: Fri Feb 15, 2008 11:23 am Post subject: |
|
|
It runs on a 450 W Zalman PSU.
Further specs of the system:
AMD 3200+, 1 GB memory and something like 7*9mm fans.
Could the PSU be at its limit with this number of drives? _________________ _ Got Root? _ |
fangorn Veteran
Joined: 31 Jul 2004 Posts: 1886
|
Posted: Fri Feb 15, 2008 1:00 pm Post subject: |
|
|
Going by the total wattage used, I'd say no. But as all the drives share the same voltage rail, I'd say it's quite possible. If you have another PSU at hand, try opening the case and powering some of the drives from the second PSU. _________________ Video Encoding scripts collection | Project page |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Fri Feb 15, 2008 1:10 pm Post subject: |
|
|
PietdeBoer,
Knowing only that it is a "450W Zalman" is not very useful. You will get power problems whenever a single output voltage gets near its limit.
There are also power combination rules that may mean you are getting power problems before you get anywhere near the 450 W maximum PSU output.
E.g. your drives will each want about 8 W to spin the motors (from the +12 V) and 4 W (from the +5 V) to run the rest of the electronics.
Head movements will want pulses from the +12 V.
Your CPU is probably operated from the +12 V too. Investigate the load on the +12 V and the PSU's capability to supply it. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
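As a rough worked example of the per-rail arithmetic described in this post, the sketch below uses the 8 W (+12 V) and 4 W (+5 V) per-drive estimates given above and the nine drives from the `st -l` listing. The `rail_load` helper is a hypothetical name, and the result covers the drives only: the CPU, fans and spin-up or head-seek pulses come on top.

```python
# Estimates quoted in the post above; drive count from the `st -l` listing.
DRIVES = 9
W_12V_PER_DRIVE = 8.0   # spindle motor, drawn from +12 V
W_5V_PER_DRIVE = 4.0    # drive electronics, drawn from +5 V


def rail_load(drives, watts_per_drive, volts):
    """Return (total watts, total amps) drawn by the drives on one rail."""
    watts = drives * watts_per_drive
    return watts, watts / volts


w12, a12 = rail_load(DRIVES, W_12V_PER_DRIVE, 12.0)
w5, a5 = rail_load(DRIVES, W_5V_PER_DRIVE, 5.0)

print(f"+12 V: {w12:.0f} W = {a12:.1f} A (drives only, steady state)")
print(f"+5 V:  {w5:.0f} W = {a5:.1f} A (drives only, steady state)")
```

These currents then need to be compared with the per-rail ratings printed on the PSU label, which are not given in this thread, rather than with the 450 W total.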
PietdeBoer Apprentice
Joined: 20 Oct 2005 Posts: 244 Location: Eindhoven, the Netherlands
|
Posted: Fri Feb 15, 2008 7:27 pm Post subject: |
|
|
I rearranged the power cables a bit, making sure there aren't too many HDDs on one cable.
The server has now run for 1.5 hours. I did some heavy copy work to test whether it remains stable, and it does. The error messages have disappeared from dmesg.
Thanks for your help guys, the solution was an overloaded power cable. _________________ _ Got Root? _ |
alex.blackbit Advocate
Joined: 26 Jul 2005 Posts: 2397
|
Posted: Sat Feb 16, 2008 12:37 am Post subject: |
|
|
Good to hear.
It does not happen too often that 4 Gentooers have the same opinion AND that it's right.
Have a nice day. |