Gentoo Forums
SATA link timeout after boot
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Tue Nov 23, 2010 5:08 am    Post subject: SATA link timeout after boot

Code:

[   31.712373] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[   31.712394] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[   31.712397]          res 40/00:0c:a8:a1:88/00:00:07:00:00/40 Emask 0x4 (timeout)
[   31.712410] ata1: hard resetting link
[   32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.019929] ata1.00: configured for UDMA/133
[   32.019962] ata1: EH complete


I'm getting this towards the end of my boot cycle, after the boot runlevel and during the 'default' runlevel. According to bootchart, the system is going through a period of huge disk utilization, but nothing seems to be loading.

Code:

[    0.519213] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    0.519876] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133
[    0.519879] ata1.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 31/32)


ata1 is my Intel X25-M SSD.

Any ideas what's causing this timeout?
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Tue Nov 23, 2010 10:05 am

El_Presidente_Pufferfish,

whoa, Mr pufferfish??
Anyway, you're a bit light on info. Post (i.e. wgetpaste) your config & I'll try it out.
_________________
idella4@aus
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Tue Nov 23, 2010 3:50 pm

http://paste.pocoo.org/show/295037/
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Tue Nov 23, 2010 4:36 pm

well Mr. pufferfish, it's early yet, I've just started compiling it now. I was taken aback when I found my first port of call. Your only entries for your drive are
Quote:

│ │ [ ] Verbose ATA error reporting │ │
│ │ [*] ATA ACPI Support │ │
│ │ [ ] SATA Port Multiplier support │ │
│ │ *** Controllers with non-SFF native interface *** │ │
│ │ <*> AHCI SATA support │ │

First time I've seen a config not incorporating
│ │ [ ] ATA SFF support │ │

However, I'm not the gentoo admin guru, he's not here atm.
I suspect your config is underdone. However, I shall try it, though I guess I will have to add my hard drive additions to get it to boot.

Does it boot to a console or level 5, or just hang? You haven't made it clear. Is the dmesg output in your post acquired from an external system?

I ran it with the addition only of my reiserfs. It booted to a read only state and predictably read only my sata drive which houses this system and a usb stick. Being read only, I didn't save the dmesg, but if it were really called for I could re-do it and boot into another system then acquire it to post. I'm not the hardware guru, but I'd suggest adding a selection or two under ATA SFF support , such as
_________________
idella4@aus
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Tue Nov 23, 2010 7:03 pm

idella4,

Code:
│ [ ] ATA SFF support │
only disables everything in that sub menu. If you don't need anything there, e.g. because your hard drive chipset is AHCI and you have no PATA, then it's fine. It wouldn't work for me.

El_Presidente_Pufferfish
Code:
[   32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
shows you have a SATA2 controller on the motherboard.
What drive is on the end of the data cable? If it's only SATA1, you may be having SATA fallback issues.
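
For anyone reading along, the SStatus value in that line can be unpacked by hand to check what the link actually negotiated. This is only a rough sketch: it assumes the field layout from the SATA spec as I remember it (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8), and decode_sstatus is a made-up helper name, not an existing tool.

```shell
# decode_sstatus HEX: split a SATA SStatus value (printed by libata as hex
# without the 0x prefix) into its DET, SPD and IPM fields.
# Hypothetical helper; the field layout is an assumption taken from the SATA spec.
decode_sstatus() {
    val=$((0x$1))
    det=$((  val       & 0xf ))   # bits 3:0   device detection state
    spd=$(( (val >> 4) & 0xf ))   # bits 7:4   negotiated link speed
    ipm=$(( (val >> 8) & 0xf ))   # bits 11:8  interface power state
    case $spd in
        1) speed="1.5 Gbps" ;;
        2) speed="3.0 Gbps" ;;
        3) speed="6.0 Gbps" ;;
        *) speed="unknown" ;;
    esac
    echo "DET=$det SPD=$spd ($speed) IPM=$ipm"
}

decode_sstatus 123
```

A link that had fallen back to SATA1 speed would be expected to show SPD=1 here (for example SStatus 113) instead of SPD=2.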
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Wed Nov 24, 2010 3:50 pm

@NeddySeagoon
Code:
[    0.519876] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133

The drive is an Intel X25-M 80GB

@idella4
It continues to boot after pausing for approximately 20 seconds.


This drive holds my /boot and / partitions, so I find it very odd that it only hangs after it clears the boot runlevel.
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Wed Nov 24, 2010 4:10 pm

I didn't finish my last entry, got distracted.

Quote:

│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │

│ │ < > ARTOP/Acard ATP867X PATA support │ │
│ │ < > ATI PATA support │ │
│ │ < > CMD640 PCI PATA support (Experimental) │ │

│ │ < > Intel PATA MPIIX support │ │
│ │ < > Intel PATA old PIIX support │ │

│ │ < > Winbond SL82C105 PATA support │ │
│ │ < > Intel SCH PATA support │ │


is what I intended to add. Neddy is the master, he'll be back later on. So you can boot ok, but it baulks.
Neddy will point straight away at the merits of these choices. If it were me, I'd be rotating through them and experimenting, seeking a fix.
_________________
idella4@aus
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Wed Nov 24, 2010 5:31 pm

El_Presidente_Pufferfish,

It's SATA2 at both ends, which is good. It crossed my mind that you were having problems because SATA speed autonegotiation is often broken when the controller is faster than the drive. It's good to have ruled that out.

I'm out of ideas meanwhile.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 2:01 am

Here's my whole dmesg if it helps:
http://paste.pocoo.org/show/295747/
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 5:04 am

I did a Google search on parts of the error message taken from your opening post, and there are a few references to your drive model; this one, for example. This bug submission supplies a handy hint that suits you & me. I also have a drive, an old IDE one, playing up something awful, and I'm just getting to figure out how to test & diagnose it.
Set smartctl onto it. The tip that is good for us both is smartd. Run smartd -d, & you too may get useful output implicating a drive hardware state. I think it's your drive.
_________________
idella4@aus
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 5:12 am

smartd -d doesn't show anything exciting
Code:
# smartd -d
smartd 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Opened configuration file /etc/smartd.conf
Drive: DEVICESCAN, implied '-a' Directive on line 23 of file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
glob(3) found no matches for pattern /dev/hd[a-t]
glob(3) found no matches for pattern /dev/sd[a-c][a-z]
Device: /dev/sda, type changed from 'scsi' to 'sat'
Device: /dev/sda [SAT], opened
Device: /dev/sda [SAT], found in smartd database.
Device: /dev/sda [SAT], can't monitor Current Pending Sector count - no Attribute 197
Device: /dev/sda [SAT], can't monitor Offline Uncorrectable Sector count - no Attribute 198
Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdb, type changed from 'scsi' to 'sat'
Device: /dev/sdb [SAT], opened
Device: /dev/sdb [SAT], found in smartd database.
Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdc, type changed from 'scsi' to 'sat'
Device: /dev/sdc [SAT], opened
Device: /dev/sdc [SAT], found in smartd database.
Device: /dev/sdc [SAT], is SMART capable. Adding to "monitor" list.
Monitoring 3 ATA and 0 SCSI devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdc [SAT], opened ATA device


smartctl -a /dev/sda also doesn't seem out of the ordinary
http://paste.pocoo.org/show/295775/
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 5:20 am

El_Presidente_Pufferfish,

right, that rules that out. Change of approach: it should be in your kernel settings.
How about this: save your config (oops, already in boot), go to the kernel, invoke make defconfig, and try its result. If no better, I'll post you my healthy 2.6.36 config, which you can adapt to your PC and test. If I were Neddy, I could single out a single config entry, so I take this shotgun approach. Oops, Neddy said he was stuck!!!
_________________
idella4@aus
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 5:28 am

I added verbose ATA errors to my kernel config, and now I see:

Code:

[   26.720356] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[   26.720368] ata1.00: failed command: SMART
[   26.720383] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[   26.720386]          res 40/00:0c:88:76:46/00:00:03:00:00/40 Emask 0x4 (timeout)
[   26.720393] ata1.00: status: { DRDY }
[   26.720404] ata1: hard resetting link
[   27.027039] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   27.029142] ata1.00: configured for UDMA/133
[   27.029179] ata1: EH complete

from dmesg

Full dmesg: http://paste.pocoo.org/show/295781/
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 5:39 am

El_Presidente_Pufferfish

well, ok, that adds to the error log. I suggest adding the previously cited Intel PIIX entry to boost your Intel kernel drivers. I'm fairly sure your kernel config is just lacking some key entries.

Quote:


│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │


_________________
idella4@aus
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 5:41 am

It's an AMD system with an nvidia chipset. I don't understand why I would need Intel chipset drivers.
The HDD is Intel, not the chipset
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 5:53 am

ooops, try again.
Mine is nvidia with an intel cpu.

Quote:

.config - Linux Kernel v2.6.34-zen1 "Back in the Saddle" Configuration
────────────────────────────────────────────────────────────────────────────────────────────────
┌─────────────────────────── Serial ATA and Parallel ATA drivers ───────────────────────────┐
│ Arrow keys navigate the menu. <Enter> selects submenus --->. Highlighted letters are │
│ hotkeys. Pressing <Y> includes, <N> excludes, <M> modularizes features. Press │
│ <Esc><Esc> to exit, <?> for Help, </> for Search. Legend: [*] built-in [ ] excluded │
│ <M> module < > module capable │
│ ┌───────────────────────────────────────────────────────────────────────────────────────┐ │
│ │ --- Serial ATA and Parallel ATA drivers │ │
│ │ [*] Verbose ATA error reporting │ │
│ │ [*] ATA ACPI Support │ │
│ │ [*] SATA Port Multiplier support │ │
│ │ <*> AHCI SATA support │ │
│ │ < > Silicon Image 3124/3132 SATA support │ │
│ │ [*] ATA SFF support │ │
│ │ < > ServerWorks Frodo / Apple K2 SATA support │ │
│ │ < > Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support │ │
│ │ < > Marvell SATA support │ │
│ │ <*> NVIDIA SATA support │ │
│ │ < > Pacific Digital ADMA support │ │

│ │ <*> AMD/NVidia PATA support │ │
│ │ < > ARTOP 6210/6260 PATA support │ │
│ │ < > Nat Semi NS87415 PATA support │ │



try that. I must admit I should experiment with this more to pin it down.

Quote:

The HDD is Intel,


Quote:

│ │ < > Intel PATA MPIIX support │ │


I'm fairly sure this addresses the hard drive, not the chipset.
│ │ <*> NVIDIA SATA support │ │
addresses the chipset. From other posts, I gather MPIIX is a bread & butter intel drive driver. Try it out, re-post.
That aside, have you tried make defconfig and observed its selection?
_________________
idella4@aus
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 4:32 pm

If I enable both AHCI SATA Support and NVIDIA SATA Support there is no change in behavior. As far as I can tell the NVIDIA support never loads.

The system doesn't boot if I choose NVIDIA SATA Support and deselect AHCI SATA Support.
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 4:33 pm

Here is my bootchart
http://imgur.com/Ljxtb
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 5:39 pm

El_Presidente_Pufferfish

right. On the chance your kernel drive config is not optimal, could you run (or have you run) make defconfig, and describe the settings it gives for your drive?

Your citing of
Code:


[   31.712373] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[   31.712394] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[   31.712397]          res 40/00:0c:a8:a1:88/00:00:07:00:00/40 Emask 0x4 (timeout)
[   31.712410] ata1: hard resetting link
[   32.018049] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.019929] ata1.00: configured for UDMA/133


should represent the heart of the flaw. My suggestions may not be on target; however, those like MPIIX are worth a try. They won't hurt and can only rule themselves out by making no difference, like the NVIDIA SATA Support you tried.
Going to run your kernel config again, want to check something.
_________________
idella4@aus
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Thu Nov 25, 2010 6:37 pm

El_Presidente_Pufferfish,

You have a lot of I/O wait there. Please post your entire dmesg, from time zero.

Also you might like to try kernel gentoo-sources-2.6.36-r2 (or later) as it has some patches that are supposed to address that sort of thing.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 6:38 pm

NeddySeagoon: here's my most recent dmesg:
http://paste.pocoo.org/show/295985/

I'll try upgrading my kernel
El_Presidente_Pufferfish
Veteran

Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Thu Nov 25, 2010 6:55 pm

emerged 2.6.36-r3 since -r2 was hard masked. No change.
dmesg: http://paste.pocoo.org/show/295996/
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Thu Nov 25, 2010 7:12 pm

El_Presidente_Pufferfish,

Sorry about the kernel wild goose chase. The patches are only for AMD64 and you are running x86.

Code:
[    6.167732] forcedeth 0000:00:0a.0: irq 41 for MSI/MSI-X
[   26.720360] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

This 20 second gap is interesting.

It's probably wired networking trying to use DHCP and failing. That would be about the timeout interval for DHCP.
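
A gap like that is easy to miss when scrolling through a full dmesg, so here is a small sketch that flags them. find_gaps is a hypothetical helper name, and it assumes every line of interest starts with the usual "[ seconds.micros]" kernel timestamp; lines without one are skipped.

```shell
# find_gaps THRESHOLD: read dmesg-style lines on stdin and report every
# timestamped line that arrives more than THRESHOLD seconds after the
# previous timestamped line. Hypothetical helper, plain POSIX awk.
find_gaps() {
    awk -v limit="$1" '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # drop the brackets; awk ignores the leading padding when converting
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (seen && ts - prev > limit)
                printf "%.1f s gap before: %s\n", ts - prev, $0
            prev = ts
            seen = 1
        }'
}

# Example with the two lines quoted above:
find_gaps 5 <<'EOF'
[    6.167732] forcedeth 0000:00:0a.0: irq 41 for MSI/MSI-X
[   26.720360] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
EOF
```

On a live system the same function can be fed from the kernel log, e.g. dmesg | find_gaps 5.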

Code:
[   26.720372] ata1.00: failed command: SMART
Does your SSD support SMART?
It's probably not useful on an SSD, as most of the data it returns relates to mechanical problems with the drive and SSDs don't have mechanical problems.
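
The "cmd" line in the error says a bit more than just SMART. As a sketch, assuming libata prints the taskfile as command/feature:nsect:lbal:lbam:lbah and using the SMART subcommand codes as I recall them from the ATA command set, the feature byte can be looked up like this. decode_ata_cmd is a made-up name and the table is deliberately incomplete.

```shell
# decode_ata_cmd TASKFILE: name the SMART subcommand in a libata "cmd" string.
# Hypothetical helper; only opcode b0 (SMART) and a few feature codes are covered.
decode_ata_cmd() {
    cmd=${1%%/*}            # opcode before the first slash, e.g. b0
    rest=${1#*/}
    group=${rest%%/*}       # feature:nsect:lbal:lbam:lbah
    old_ifs=$IFS
    IFS=:
    set -- $group           # split the group on colons
    IFS=$old_ifs
    feature=$1; lbal=$3; lbam=$4; lbah=$5

    if [ "$cmd" != "b0" ]; then
        echo "opcode 0x$cmd is not SMART"
        return 0
    fi
    case $feature in
        d0) name="READ DATA" ;;
        d4) name="EXECUTE OFF-LINE IMMEDIATE" ;;
        d5) name="READ LOG (log address 0x$lbal)" ;;
        d6) name="WRITE LOG (log address 0x$lbal)" ;;
        d8) name="ENABLE OPERATIONS" ;;
        d9) name="DISABLE OPERATIONS" ;;
        da) name="RETURN STATUS" ;;
        *)  name="subcommand 0x$feature" ;;
    esac
    # SMART commands carry the fixed signature 4f/c2 in LBA mid/high.
    if [ "$lbam:$lbah" = "4f:c2" ]; then
        sig="signature ok"
    else
        sig="signature missing"
    fi
    echo "SMART $name, $sig"
}

decode_ata_cmd "b0/d5:01:06:4f:c2/00:00:00:00:00/00"
```

For the line in this thread that comes out as a SMART READ LOG of log address 0x06, which should be the SMART self-test log if I have the log address table right. So the drive is answering a log read too slowly rather than refusing SMART altogether.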
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
idella4
Retired Dev

Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Thu Nov 25, 2010 7:35 pm

welcome Neddy.
ok, I have reloaded your config into my 2.6.36-gentoo. I have added firstly nvidiafb which didn't want to go in, so I emerged nvidia-drivers to get into WM. I have only added a few graphical type settings to get this far.
I have access only to my main drive and a usb stick. So it's as close as can be to your initial config which falters.
We both have CONFIG_X86_32 but different cpu processors & nvidia chipset.

El_Presidente_Pufferfish, where & how did you notice the extra I/O wait? By just observing a boot chart, or by the boot noticeably baulking while you sat in front of the screen?

Code:

(none) bin # uname -a
Linux (none) 2.6.36-gentoo-r1 #10 SMP Fri Nov 26 03:02:22 WST 2010 i686 Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz GenuineIntel GNU/Linux
(none) bin # ls /dev/sd*
/dev/sda   /dev/sda10  /dev/sda2  /dev/sda4  /dev/sda6  /dev/sda8  /dev/sdb
/dev/sda1  /dev/sda11  /dev/sda3  /dev/sda5  /dev/sda7  /dev/sda9  /dev/sdb1

idella@(none) ~/bin $ sudo -s grep frozen /var/log/messages
Nov 22 18:40:06 genny kernel: [  101.704082] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:42:51 genny kernel: [  266.720081] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:56:33 genny kernel: [  243.808026] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:57:38 genny kernel: [  308.320031] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:58:08 genny kernel: [  339.040029] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Nov 22 18:58:39 genny kernel: [  370.016146] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

Code:

(none) bin # grep timeout /var/log/messages
................
Nov 24 00:09:48 genny kernel: [    3.710237] Testing event scsi_dispatch_cmd_timeout: OK
Nov 24 00:09:48 genny kernel: [    4.166824] Testing event scsi_dispatch_cmd_timeout: OK

(none) bin # grep timeout /var/log/dmesg

Code:

/var/log/messages:Nov 25 12:49:20 genny kernel: [ 8731.539684] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
/var/log/messages:Nov 25 12:51:47 genny kernel: [ 8878.947132] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x
(none) bin # grep exception Emask 0x0 SAct  /var/log/dmesg   
grep: Emask: No such file or directory
grep: 0x0: No such file or directory
grep: SAct: No such file or directory
/var/log/dmesg:[    0.000000] Enabling unmasked SIMD FPU exception support... done.
(none) bin # grep exception  /var/log/dmesg
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
(none) bin # grep Emask  /var/log/dmesg
(none) bin # grep SAct /var/log/dmesg


No sign of the log entries that you are experiencing. The occurrences in messages date back before today (26 Nov)
dmesg has no mention of timeout, frozen, Emask or SAct.
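
One note on the greps above: the "No such file or directory" errors come from the unquoted pattern, since grep takes only the first word as the pattern and treats the rest as file names. A minimal sketch of a quoted version follows; ata_errors is a hypothetical helper name and the three strings are just the ones seen in this thread, not a complete list of libata messages.

```shell
# ata_errors FILE: print lines in FILE that look like libata error-handler
# output. -F matches the strings literally; each -e adds one alternative.
# Hypothetical helper; the pattern list is taken from the logs in this thread.
ata_errors() {
    grep -F -e 'exception Emask 0x0 SAct' \
            -e '(timeout)' \
            -e 'hard resetting link' -- "$1"
}

# Example against a small sample file:
sample=$(mktemp)
cat > "$sample" <<'EOF'
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
[   26.720356] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[   26.720404] ata1: hard resetting link
EOF
ata_errors "$sample"
rm -f "$sample"
```

Used as ata_errors /var/log/messages or ata_errors /var/log/dmesg, it skips the unrelated "SIMD FPU exception" line that a bare grep exception picks up.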

Now let's let Neddy put all this together. I think it's, let's see

NeddySeagoon wrote:

It's probably wired networking trying to use DHCP and failing - that would be about the timeout interval for DHCP.

Neddy, hardwired and dhcp here. No delay that I could discern. Worth posting the dmesg from here?
_________________
idella4@aus
NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Thu Nov 25, 2010 8:30 pm

idella4,

It's all in http://paste.pocoo.org/show/295985/
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
All times are GMT
Page 1 of 2