El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Tue Nov 23, 2010 5:08 am Post subject: SATA link timeout after boot |
|
|
Code: |
[ 31.712373] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 31.712394] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[ 31.712397] res 40/00:0c:a8:a1:88/00:00:07:00:00/40 Emask 0x4 (timeout)
[ 31.712410] ata1: hard resetting link
[ 32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 32.019929] ata1.00: configured for UDMA/133
[ 32.019962] ata1: EH complete
|
I'm getting this towards the end of my boot cycle, after the boot runlevel and during the 'default' runlevel. According to bootchart, the system goes through a period of heavy disk utilization at that point, but nothing seems to be loading.
Code: |
[ 0.519213] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 0.519876] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133
[ 0.519879] ata1.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 31/32)
|
ata1 is my Intel X25-M SSD.
Any ideas what's causing this timeout? |
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Tue Nov 23, 2010 10:05 am Post subject: |
|
|
El_Presidente_Pufferfish,
whoa, Mr pufferfish??,
anyway, you're a bit light on info. Post (i.e. wgetpaste) your config and I'll try it out. _________________ idella4@aus |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Tue Nov 23, 2010 4:36 pm Post subject: |
|
|
well Mr. pufferfish, it's early yet, I've just started compiling it now. I was taken aback at my first port of call. Your only entries for your drive are
Quote: |
│ │ [ ] Verbose ATA error reporting │ │
│ │ [*] ATA ACPI Support │ │
│ │ [ ] SATA Port Multiplier support │ │
│ │ *** Controllers with non-SFF native interface *** │ │
│ │ <*> AHCI SATA support │ │
|
First time I've seen a config not incorporating
│ │ [ ] ATA SFF support │ │
However, I'm not the gentoo admin guru, he's not here atm.
I suspect your config is underdone. However, I shall try it, though I guess I will have to add my hard drive additions to get it to boot.
Does it boot to a console or level 5, or does it just hang? You haven't made it clear. Is the dmesg output in your post captured from an external system?
I ran it with only my reiserfs support added. It booted to a read-only state and, predictably, could read only my SATA drive, which houses this system, and a USB stick. Being read-only, I didn't save the dmesg, but if it were really called for I could redo it, boot into another system, and retrieve it to post. I'm not the hardware guru, but I'd suggest adding a selection or two under ATA SFF support, such as _________________ idella4@aus |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Tue Nov 23, 2010 7:03 pm Post subject: |
|
|
idella4,
Code: | │ [ ] ATA SFF support │ | only disables everything in that submenu. If you don't need anything there, e.g. because your hard drive chipset is AHCI and you have no PATA, then it's fine. It wouldn't work for me.
El_Presidente_Pufferfish
Code: | [ 32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) | shows you have a SATA2 controller on the motherboard.
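The two hex values in that line can be decoded field by field. A minimal sketch in Python, assuming the standard SATA SStatus/SControl layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11); the function and table names are made up for illustration:

```python
# Field positions follow the SATA SStatus/SControl register layout.
# decode_sata_register and SPD_NAMES are illustrative names only.
SPD_NAMES = {0: "no speed negotiated / no limit", 1: "Gen1 (1.5 Gbps)",
             2: "Gen2 (3.0 Gbps)", 3: "Gen3 (6.0 Gbps)"}


def decode_sata_register(hex_text):
    """Split a value such as '123' or '300' into its DET, SPD and IPM fields."""
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,          # device detection
        "spd": (value >> 4) & 0xF,   # negotiated (SStatus) or allowed (SControl) speed
        "ipm": (value >> 8) & 0xF,   # interface power management
    }


sstatus = decode_sata_register("123")
scontrol = decode_sata_register("300")
print("SStatus :", sstatus, "->", SPD_NAMES[sstatus["spd"]])
print("SControl:", scontrol, "->", SPD_NAMES[scontrol["spd"]])
```

For SStatus 123 that gives DET=3 (device present, link established), SPD=2 (Gen2) and IPM=1 (interface active). SControl 300 has SPD=0, meaning the host imposes no speed limit, and IPM=3, which disables the partial and slumber power states.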
What drive is on the end of the data cable? If it's only SATA1, you may be having SATA fallback issues. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Wed Nov 24, 2010 3:50 pm Post subject: |
|
|
@NeddySeagoon
Code: | [ 0.519876] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133 |
The drive is an Intel X25-M 80GB.
@idella4
It continues to boot after pausing for approximately 20 seconds.
This drive holds my /boot and / partitions, so I find it very odd that it only hangs after it clears the boot runlevel. |
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Wed Nov 24, 2010 4:10 pm Post subject: |
|
|
I didn't finish my last entry, got distracted.
Quote: |
│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │
│ │ < > ARTOP/Acard ATP867X PATA support │ │
│ │ < > ATI PATA support │ │
│ │ < > CMD640 PCI PATA support (Experimental) │ │
│ │ < > Intel PATA MPIIX support │ │
│ │ < > Intel PATA old PIIX support │ │
│ │ < > Winbond SL82C105 PATA support │ │
│ │ < > Intel SCH PATA support │ │
|
is what I intended to add. Neddy is the master; he'll be back later on. So you can boot OK, but it baulks along the way.
Neddy will be able to say straight away what the merits of these choices are. If it were me, I'd try them in rotation and experiment until something fixes it. _________________ idella4@aus |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Wed Nov 24, 2010 5:31 pm Post subject: |
|
|
El_Presidente_Pufferfish,
It's SATA2 at both ends, which is good. It crossed my mind that you were having problems because SATA speed autonegotiation is often broken when the controller is faster than the drive. It's good to have ruled that out.
I'm out of ideas for now. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 5:04 am Post subject: |
|
|
I did a Google search on parts of the error message taken from your opening post, and there are a few references to your drive model; this one, for example.
This bug submission supplies a handy hint that suits both you and me. I also have a drive, an old IDE one, playing up something awful, and I'm just getting around to figuring out how to test and diagnose it.
Set smartctl onto it. The tip that is good for us both is smartd. Run smartd -d, and you too may get useful output implicating a hardware fault in the drive. I think it's your drive. _________________ idella4@aus |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Thu Nov 25, 2010 5:12 am Post subject: |
|
|
smartd -d doesn't show anything exciting
Code: | # smartd -d
smartd 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
Opened configuration file /etc/smartd.conf
Drive: DEVICESCAN, implied '-a' Directive on line 23 of file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
glob(3) found no matches for pattern /dev/hd[a-t]
glob(3) found no matches for pattern /dev/sd[a-c][a-z]
Device: /dev/sda, type changed from 'scsi' to 'sat'
Device: /dev/sda [SAT], opened
Device: /dev/sda [SAT], found in smartd database.
Device: /dev/sda [SAT], can't monitor Current Pending Sector count - no Attribute 197
Device: /dev/sda [SAT], can't monitor Offline Uncorrectable Sector count - no Attribute 198
Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdb, type changed from 'scsi' to 'sat'
Device: /dev/sdb [SAT], opened
Device: /dev/sdb [SAT], found in smartd database.
Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdc, type changed from 'scsi' to 'sat'
Device: /dev/sdc [SAT], opened
Device: /dev/sdc [SAT], found in smartd database.
Device: /dev/sdc [SAT], is SMART capable. Adding to "monitor" list.
Monitoring 3 ATA and 0 SCSI devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdc [SAT], opened ATA device
|
smartctl -a /dev/sda also doesn't seem out of the ordinary
http://paste.pocoo.org/show/295775/ |
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 5:20 am Post subject: |
|
|
El_Presidente_Pufferfish,
right, that rules that out. Change of approach: it should be in your kernel settings.
How about this: save your config (oops, it's already in boot), go to the kernel, run make defconfig, and try the result. If that's no better, I'll post you my healthy 2.6.36 config, which you can adapt to your PC and test. If I were Neddy, I could single out a single config entry; since I'm not, I take this shotgun approach. Oops, Neddy said he was stuck!!! _________________ idella4@aus |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Thu Nov 25, 2010 5:28 am Post subject: |
|
|
I added verbose ATA errors to my kernel config, and now I see:
Code: |
[ 26.720356] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 26.720368] ata1.00: failed command: SMART
[ 26.720383] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[ 26.720386] res 40/00:0c:88:76:46/00:00:03:00:00/40 Emask 0x4 (timeout)
[ 26.720393] ata1.00: status: { DRDY }
[ 26.720404] ata1: hard resetting link
[ 27.027039] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 27.029142] ata1.00: configured for UDMA/133
[ 27.029179] ata1: EH complete
|
from dmesg
Full dmesg: http://paste.pocoo.org/show/295781/ |
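For what it's worth, the cmd line can be decoded by hand. libata prints it as command/feature:nsect:lbal:lbam:lbah, then five HOB bytes, then the device byte. A rough sketch in Python, assuming that layout and the ATA opcode values in the comments; the function name and the small lookup tables are my own and far from complete:

```python
import re

# Small, incomplete lookup tables based on the ATA command set.
ATA_COMMANDS = {0xB0: "SMART", 0xEC: "IDENTIFY DEVICE"}
SMART_FEATURES = {0xD0: "READ DATA", 0xD5: "READ LOG",
                  0xD8: "ENABLE OPERATIONS", 0xDA: "RETURN STATUS"}

# Matches "cmd b0/d5:01:06:4f:c2/" and captures the first six registers.
CMD_RE = re.compile(r"cmd ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/")


def decode_cmd(line):
    """Return the decoded taskfile from a libata 'cmd' line, or None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    command = int(match.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (
        int(byte, 16) for byte in match.group(2).split(":"))
    decoded = {
        "command": ATA_COMMANDS.get(command, hex(command)),
        "feature": hex(feature),
        "nsect": nsect, "lbal": lbal, "lbam": lbam, "lbah": lbah,
    }
    if command == 0xB0:  # SMART: the feature register selects the subcommand
        decoded["feature"] = SMART_FEATURES.get(feature, hex(feature))
    return decoded


line = "[ 26.720383] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in"
print(decode_cmd(line))
```

If I've read it right, that is SMART READ LOG for one sector with 0x06 in LBA low, which for READ LOG is the log address (0x06 being the SMART self-test log). So the command that times out is a SMART log read rather than ordinary file I/O.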
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 5:39 am Post subject: |
|
|
El_Presidente_Pufferfish
well, OK, that adds to the error log. I suggest adding the previously cited Intel PIIX entry to boost your Intel kernel drivers. I'm fairly sure your kernel config is just lacking some key entries.
Quote: |
│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │
|
_________________ idella4@aus |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Thu Nov 25, 2010 5:41 am Post subject: |
|
|
It's an AMD system with an nvidia chipset. I don't understand why I would need Intel chipset drivers.
The HDD is Intel, not the chipset |
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 5:53 am Post subject: |
|
|
ooops, try again.
Mine is nvidia with an intel cpu.
Quote: |
.config - Linux Kernel v2.6.34-zen1 "Back in the Saddle" Configuration
────────────────────────────────────────────────────────────────────────────────────────────────
┌─────────────────────────── Serial ATA and Parallel ATA drivers ───────────────────────────┐
│ Arrow keys navigate the menu. <Enter> selects submenus --->. Highlighted letters are │
│ hotkeys. Pressing <Y> includes, <N> excludes, <M> modularizes features. Press │
│ <Esc><Esc> to exit, <?> for Help, </> for Search. Legend: [*] built-in [ ] excluded │
│ <M> module < > module capable │
│ ┌───────────────────────────────────────────────────────────────────────────────────────┐ │
│ │ --- Serial ATA and Parallel ATA drivers │ │
│ │ [*] Verbose ATA error reporting │ │
│ │ [*] ATA ACPI Support │ │
│ │ [*] SATA Port Multiplier support │ │
│ │ <*> AHCI SATA support │ │
│ │ < > Silicon Image 3124/3132 SATA support │ │
│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │
│ │ < > Marvell SATA support │ │
│ │ <*> NVIDIA SATA support │ │
│ │ < > Pacific Digital ADMA support │ │
│ │ <*> AMD/NVidia PATA support │ │
│ │ < > ARTOP 6210/6260 PATA support │ │
│ │ < > Nat Semi NS87415 PATA support │ │
|
try that. I must admit I should experiment with this more to pin it down.
Quote: |
│ │ < > Intel PATA MPIIX support │ │
|
I'm fairly sure this addresses the hard drive, not the chipset.
│ │ <*> NVIDIA SATA support │ │
addresses the chipset. From other posts, I gather MPIIX is a bread & butter intel drive driver. Try it out, re-post.
That aside, have you tried make defconfig and observed its selection? _________________ idella4@aus |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
Posted: Thu Nov 25, 2010 4:32 pm Post subject: |
|
|
If I enable both AHCI SATA Support and NVIDIA SATA Support there is no change in behavior. As far as I can tell the NVIDIA support never loads.
The system doesn't boot if I choose NVIDIA SATA Support and deselect AHCI SATA Support. |
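To double-check which libata options a given build really has, the kernel .config can be read directly. A small sketch in Python, assuming the usual Kconfig symbol names for the menu entries discussed in this thread (CONFIG_ATA_VERBOSE_ERROR, CONFIG_SATA_AHCI, CONFIG_ATA_SFF, CONFIG_SATA_NV); the helper name and the sample text are made up:

```python
SYMBOLS = ("CONFIG_ATA_VERBOSE_ERROR", "CONFIG_SATA_AHCI",
           "CONFIG_ATA_SFF", "CONFIG_SATA_NV")


def libata_options(config_text, symbols=SYMBOLS):
    """Map each symbol to 'y', 'm' or 'n' based on .config contents."""
    state = {name: "n" for name in symbols}
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # '# CONFIG_FOO is not set' and blank lines stay 'n'
        name, value = line.split("=", 1)
        if name in state:
            state[name] = value
    return state


# Hypothetical excerpt, not the real config from this machine.
sample_config = """\
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_SATA_AHCI=y
# CONFIG_ATA_SFF is not set
"""
print(libata_options(sample_config))
```

Pointing it at a real file would look like libata_options(open("/usr/src/linux/.config").read()), assuming the kernel tree lives at the usual Gentoo location.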
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 5:39 pm Post subject: |
|
|
El_Presidente_Pufferfish
right. On the chance that your kernel drive config is not optimal, could you run make defconfig (if you haven't already) and describe the settings it picks for your drive?
Your citing of
Code: |
[ 31.712373] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 31.712394] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[ 31.712397] res 40/00:0c:a8:a1:88/00:00:07:00:00/40 Emask 0x4 (timeout)
[ 31.712410] ata1: hard resetting link
[ 32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 32.019929] ata1.00: configured for UDMA/133 |
should represent the heart of the flaw. My suggestions may not be on target; however, those like MPIIX are worth a try. They won't hurt and can only rule themselves out by making no difference, like the NVIDIA SATA Support you already tried.
Going to run your kernel config again, want to check something. _________________ idella4@aus |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Thu Nov 25, 2010 6:37 pm Post subject: |
|
|
El_Presidente_Pufferfish,
You have a lot of I/O wait there. Please post your entire dmesg, from time zero.
Also you might like to try kernel gentoo-sources-2.6.36-r2 (or later) as it has some patches that are supposed to address that sort of thing. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
|
El_Presidente_Pufferfish Veteran
Joined: 11 Jul 2002 Posts: 1179 Location: Seattle
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Thu Nov 25, 2010 7:12 pm Post subject: |
|
|
El_Presidente_Pufferfish,
Sorry about the kernel wild goose chase. The patches are only for AMD64 and you are running x86.
Code: | [ 6.167732] forcedeth 0000:00:0a.0: irq 41 for MSI/MSI-X
[ 26.720360] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen |
This 20 second gap is interesting.
It's probably wired networking trying to use DHCP and failing; that would be about the timeout interval for DHCP.
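A gap like that is easy to find mechanically in a long log. A minimal sketch in Python, assuming dmesg lines that begin with the usual bracketed seconds timestamp; largest_gap is a hypothetical helper, not part of any tool:

```python
import re

# Leading "[   26.720360]" style timestamp, with optional padding spaces.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest pause."""
    best = (0.0, None, None)
    previous_time = previous_line = None
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match is None:
            continue  # continuation lines carry no timestamp
        now = float(match.group(1))
        if previous_time is not None and now - previous_time > best[0]:
            best = (now - previous_time, previous_line, line)
        previous_time, previous_line = now, line
    return best


sample = [
    "[    6.167732] forcedeth 0000:00:0a.0: irq 41 for MSI/MSI-X",
    "[   26.720360] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
    "[   26.720372] ata1.00: failed command: SMART",
]
gap, before, after = largest_gap(sample)
print(f"{gap:.3f} s between:\n  {before}\n  {after}")
```

The same function accepts an open file, so a saved dmesg can be passed in directly. It only reports where the log went quiet, not what was running during the pause.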
Code: | [ 26.720372] ata1.00: failed command: SMART | Does your SSD support SMART?
It's probably not useful on an SSD, as most of the data it returns relates to mechanical problems with the drive, and SSDs don't have mechanical problems. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
idella4 Retired Dev
Joined: 09 Jun 2006 Posts: 1600 Location: Australia, Perth
|
Posted: Thu Nov 25, 2010 7:35 pm Post subject: |
|
|
welcome Neddy.
OK, I have reloaded your config into my 2.6.36-gentoo. First I added nvidiafb, which didn't want to go in, so I emerged nvidia-drivers to get into a WM. I have added only a few graphics-related settings to get this far.
I have access only to my main drive and a USB stick, so it's as close as can be to your initial config, which falters.
We both have CONFIG_X86_32, but different CPUs & NVIDIA chipset.
El_Presidente_Pufferfish, where and how did you notice the extra I/O wait? Just by looking at a bootchart, or by the boot noticeably baulking while you sat in front of the screen?
Code: |
(none) bin # uname -a
Linux (none) 2.6.36-gentoo-r1 #10 SMP Fri Nov 26 03:02:22 WST 2010 i686 Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz GenuineIntel GNU/Linux
(none) bin # ls /dev/sd*
/dev/sda /dev/sda10 /dev/sda2 /dev/sda4 /dev/sda6 /dev/sda8 /dev/sdb
/dev/sda1 /dev/sda11 /dev/sda3 /dev/sda5 /dev/sda7 /dev/sda9 /dev/sdb1
idella@(none) ~/bin $ sudo -s grep frozen /var/log/messages
Nov 22 18:40:06 genny kernel: [ 101.704082] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:42:51 genny kernel: [ 266.720081] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:56:33 genny kernel: [ 243.808026] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:57:38 genny kernel: [ 308.320031] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:58:08 genny kernel: [ 339.040029] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:58:39 genny kernel: [ 370.016146] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
|
Code: |
(none) bin # grep timeout /var/log/messages
................
Nov 24 00:09:48 genny kernel: [ 3.710237] Testing event scsi_dispatch_cmd_timeout: OK
Nov 24 00:09:48 genny kernel: [ 4.166824] Testing event scsi_dispatch_cmd_timeout: OK
(none) bin # grep timeout /var/log/dmesg
|
Code: |
/var/log/messages:Nov 25 12:49:20 genny kernel: [ 8731.539684] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
/var/log/messages:Nov 25 12:51:47 genny kernel: [ 8878.947132] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x
(none) bin # grep exception Emask 0x0 SAct /var/log/dmesg
grep: Emask: No such file or directory
grep: 0x0: No such file or directory
grep: SAct: No such file or directory
/var/log/dmesg:[ 0.000000] Enabling unmasked SIMD FPU exception support... done.
(none) bin # grep exception /var/log/dmesg
[ 0.000000] Enabling unmasked SIMD FPU exception support... done.
(none) bin # grep Emask /var/log/dmesg
(none) bin # grep SAct /var/log/dmesg
|
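The "No such file or directory" errors above come from the shell splitting the unquoted pattern, so grep took Emask, 0x0 and SAct as file names; quoting the whole phrase avoids that. The same literal search, sketched in Python with a hypothetical helper name and two sample lines copied from this thread:

```python
def lines_containing(lines, phrase):
    """Return the lines that contain phrase as a literal substring."""
    return [line.rstrip("\n") for line in lines if phrase in line]


sample = [
    "[ 0.000000] Enabling unmasked SIMD FPU exception support... done.",
    "[ 26.720356] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
]
for hit in lines_containing(sample, "exception Emask 0x0 SAct"):
    print(hit)
```

Searching for the full phrase skips the unrelated "SIMD FPU exception" line that a search for the bare word exception picks up.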
No sign of the log entries that you are experiencing. The occurrences in messages date from before today (26 Nov).
dmesg has no mention of timeout, frozen, Emask or SAct.
Now let's let Neddy put all this together. I think it's, let's see:
NeddySeagoon wrote: |
It's probably wired networking trying to use DHCP and failing; that would be about the timeout interval for DHCP.
|
Neddy, hardwired and dhcp here. No delay that I could discern. Worth posting the dmesg from here? _________________ idella4@aus |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54244 Location: 56N 3W
|
Posted: Thu Nov 25, 2010 8:30 pm Post subject: |
|
|
idella4,
It's all in http://paste.pocoo.org/show/295985/ _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|