Gentoo Forums
SATA link timeout after boot
viralex
Apprentice


Joined: 24 Apr 2008
Posts: 237
Location: Viareggio (Lu,Italy)

Posted: Thu Nov 25, 2010 10:35 pm

I had these problems today: hard resets of the SATA link, an extremely slow system, and some IRQ 22/SATA errors.

Code:

Nov 25 19:03:29 cylon kernel: [   63.593499] irq 22: nobody cared (try booting with the "irqpoll" option)
Nov 25 19:03:29 cylon kernel: [   63.593503] Pid: 0, comm: swapper Not tainted 2.6.36-gentoo-r3 #1
Nov 25 19:03:29 cylon kernel: [   63.593504] Call Trace:
Nov 25 19:03:29 cylon kernel: [   63.593506]  <IRQ>  [<ffffffff810774ae>] ? __report_bad_irq+0x1e/0x90
Nov 25 19:03:29 cylon kernel: [   63.593515]  [<ffffffff810776ab>] ? note_interrupt+0x18b/0x1d0
Nov 25 19:03:29 cylon kernel: [   63.593518]  [<ffffffff81077e54>] ? handle_fasteoi_irq+0xb4/0xe0
Nov 25 19:03:29 cylon kernel: [   63.593522]  [<ffffffff810052f5>] ? handle_irq+0x15/0x20
Nov 25 19:03:29 cylon kernel: [   63.593524]  [<ffffffff81004842>] ? do_IRQ+0x62/0xe0
Nov 25 19:03:29 cylon kernel: [   63.593527]  [<ffffffff81430a93>] ? ret_from_intr+0x0/0xa
Nov 25 19:03:29 cylon kernel: [   63.593529]  <EOI>  [<ffffffff8100a54e>] ? mwait_idle+0x6e/0x80
Nov 25 19:03:29 cylon kernel: [   63.593533]  [<ffffffff810014d8>] ? cpu_idle+0xa8/0x100
Nov 25 19:03:29 cylon kernel: [   63.593536]  [<ffffffff81626c42>] ? start_kernel+0x30e/0x319
Nov 25 19:03:29 cylon kernel: [   63.593539]  [<ffffffff816263c7>] ? x86_64_start_kernel+0xe8/0xec
Nov 25 19:03:29 cylon kernel: [   63.593540] handlers:
Nov 25 19:03:29 cylon kernel: [   63.593541] [<ffffffff812dc050>] (ata_bmdma_interrupt+0x0/0x210)
Nov 25 19:03:29 cylon kernel: [   63.593546] [<ffffffff812dc050>] (ata_bmdma_interrupt+0x0/0x210)
Nov 25 19:03:29 cylon kernel: [   63.593548] [<ffffffff8137f4a0>] (azx_interrupt+0x0/0x180)


and

Code:

Nov 25 19:03:29 cylon kernel: [   63.593552] Disabling IRQ #22
Nov 25 19:03:59 cylon kernel: [   92.768016] ata4: lost interrupt (Status 0x51)
Nov 25 19:03:59 cylon kernel: [   92.768031] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Nov 25 19:03:59 cylon kernel: [   92.768032] ata4.00: BMDMA stat 0x26, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0
Nov 25 19:03:59 cylon kernel: [   92.768037] ata4.00: cmd 25/00:00:47:38:5b/00:01:24:00:00/e0 tag 0 dma 131072 in
Nov 25 19:03:59 cylon kernel: [   92.768038]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x34 (host bus error)
Nov 25 19:03:59 cylon kernel: [   92.768044] ata4: hard resetting link
Nov 25 19:03:59 cylon kernel: [   93.224047] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov 25 19:03:59 cylon kernel: [   93.393038] ata4.00: configured for UDMA/133
Nov 25 19:03:59 cylon kernel: [   93.393087] ata4.00: device reported invalid CHS sector 0
Nov 25 19:03:59 cylon kernel: [   93.393098] ata4: EH complete
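
To see how often a port goes through this reset cycle, the kernel log can be counted per ATA port. This is only a minimal sketch: the function name `count_link_resets` and the default log path are assumptions, and the pattern matches just the "hard resetting link" message shown above.

```shell
# Count "hard resetting link" events per ATA port in a saved kernel log.
# $1: path to the log file (the default path is a placeholder, adjust it
#     to wherever your syslog daemon writes kernel messages).
count_link_resets() {
    grep -o 'ata[0-9]*: hard resetting link' "${1:-/var/log/messages}" |
        cut -d: -f1 |   # keep only the port name, e.g. "ata4"
        sort |
        uniq -c         # prints "<count> <port>" for each port
}

# Usage: count_link_resets /path/to/kernel.log
```

A port that shows up repeatedly here is the one worth checking first (cable, socket, or controller mode).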


I think IRQ 22 belongs to the Intel HDA audio device (is it a conflict?):
Code:

Nov 25 22:49:35 cylon kernel: [    1.149362]   #0: HDA Intel at 0xf9ff8000 irq 22
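
The handler list in the trace shows two ata_bmdma_interrupt handlers and azx_interrupt (the HDA audio driver) registered on IRQ 22, so the line is shared; "nobody cared" means none of them claimed the interrupt. A small sketch for checking which drivers share a given IRQ on a running system follows. The function name `irq_line` is an assumption made for illustration; it only reads /proc/interrupts.

```shell
# Print the /proc/interrupts row for one IRQ number.
# $1: IRQ number, e.g. 22
# $2: file to read (defaults to /proc/interrupts)
irq_line() {
    # The first field of each row is "<irq>:", so compare against that.
    awk -v irq="$1:" '$1 == irq' "${2:-/proc/interrupts}"
}

# Usage: irq_line 22
```

The last columns of the printed row name every driver attached to that IRQ.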


Then my BIOS didn't recognize my disk.

My mobo is an intel p5ke, and there are 2 SATA chipsets: yukon and intel ICH...

I decided to reboot the system, and after that I had 2 kernel panics, so I switched to the yukon SATA sockets on the motherboard.
At least I think they belong to the yukon chipset. I have 5 red SATA sockets and two black ones (are the black ones for RAID?). I switched from the red ones to the black ones...
I'm not sure whether all the sockets belong to the intel ICH or whether the yukon only drives the CD/DVD IDE interface...
The BIOS was configured in IDE enhanced mode; it is now set to AHCI.

These errors are very strange... I've also switched back from 2.6.36-gentoo-r3 to 2.6.36-gentoo-r1.

I can't work out what the problem is,
but it seems to be a common one, because I found this thread.
El_Presidente_Pufferfish
Veteran


Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Sat Nov 27, 2010 2:09 am

@idella4:
I noticed the wait by just watching the boot process. Booting used to take ~5s total.

I don't think DHCP is the culprit. It doesn't explain the I/O wait, does it?
El_Presidente_Pufferfish
Veteran


Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Sat Nov 27, 2010 3:43 am

If I remove the line

from /etc/smartd.conf, boot proceeds quickly.

Nothing seems odd if I add the line back after boot, and run smartd -d, however
Code:

# smartd -d
smartd 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Opened configuration file /etc/smartd.conf
Configuration file /etc/smartd.conf parsed.
Device: /dev/sda, type changed from 'scsi' to 'sat'
Device: /dev/sda [SAT], opened
Device: /dev/sda [SAT], found in smartd database.
Device: /dev/sda [SAT], enabled SMART Attribute Autosave.
Device: /dev/sda [SAT], can't monitor Current Pending Sector count - no Attribute 197
Device: /dev/sda [SAT], can't monitor Offline Uncorrectable Sector count - no Attribute 198
Device: /dev/sda [SAT], SMART Automatic Offline Testing unsupported...
Device: /dev/sda [SAT], enabled SMART Automatic Offline Testing.
Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdb, type changed from 'scsi' to 'sat'
Device: /dev/sdb [SAT], opened
Device: /dev/sdb [SAT], found in smartd database.
Device: /dev/sdb [SAT], enabled SMART Attribute Autosave.
Device: /dev/sdb [SAT], enabled SMART Automatic Offline Testing.
Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdc, type changed from 'scsi' to 'sat'
Device: /dev/sdc [SAT], opened
Device: /dev/sdc [SAT], found in smartd database.
Device: /dev/sdc [SAT], enabled SMART Attribute Autosave.
Device: /dev/sdc [SAT], enabled SMART Automatic Offline Testing.
Device: /dev/sdc [SAT], is SMART capable. Adding to "monitor" list.
Monitoring 3 ATA and 0 SCSI devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdc [SAT], opened ATA device


If I add it after boot, and restart smartd, there is no error message in dmesg.
idella4
Retired Dev


Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Sat Nov 27, 2010 8:03 am

El_Presidente_Pufferfish wrote:
If I remove the line

from /etc/smartd.conf, boot proceeds quickly.

Nothing seems odd if I add the line back after boot, and run smartd -d, however


I think you forgot to add the line you were intending to cite. I have a smartd.conf too, and the only line not commented out is "DEVICESCAN".
So assuming you have used this line, it appears you have pinned it down.
To support this notion:

Code:

genny bin # smartd -d

....................................

Monitoring 5 ATA and 0 SCSI devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdc [SAT], opened ATA device
Device: /dev/sdc [SAT], 92 Currently unreadable (pending) sectors
Device: /dev/sdc [SAT], 1 Offline uncorrectable sectors
Device: /dev/sdc [SAT], previous self-test completed with error (read test element)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       40%      6319         7346379
# 2  Extended offline    Completed: read failure       40%      6317         7346379
# 3  Extended offline    Completed: read failure       40%      6316         7346379
# 4  Short offline       Completed: read failure       60%      6316         7346379
# 5  Short offline       Completed: read failure       60%      6315         7346379



If I had smartd in my rc-update boot list, I should get a similar result. I have a dud /dev/sdc with read errors on some sectors, but I have never opted to add smartd to rc.
I ran your config twice on my pv and it was ok. It looks like you've found it, or are you still looking for a more comprehensive outcome?
_________________
idella4@aus
El_Presidente_Pufferfish
Veteran


Joined: 11 Jul 2002
Posts: 1179
Location: Seattle

Posted: Sat Nov 27, 2010 8:25 am

Whoops.

I went from
Code:

/dev/sda -S on -o on -a -s (S/../.././02|L/../../4/03) -m root
/dev/sdb -S on -o on -a -s (S/../.././02|L/../../4/03) -m root
/dev/sdc -S on -o on -a -s (S/../.././02|L/../../4/03) -m root


to
Code:

/dev/sdb -S on -o on -a -s (S/../.././02|L/../../4/03) -m root
/dev/sdc -S on -o on -a -s (S/../.././02|L/../../4/03) -m root

and the boot delay was gone.

While that answers why my boot was delayed, it does not answer why smartd causes the delay at boot.

Furthermore, if I remove smartd from the default runlevel and start it manually, there is no similar delay.

Since I want SMART monitoring for that drive, I have to leave it enabled and endure the boot delay.
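
One way to narrow down which SMART operation stalls on /dev/sda is to time the corresponding smartctl commands one by one, right after boot and again later. The sketch below is only an illustration: `time_step` is a hypothetical helper, and the commented smartctl calls are rough equivalents of the `-S on` and `-o on` directives in the config above, not what smartd runs internally.

```shell
# Run a command and report how many whole seconds it took.
# Prints "<seconds>s: <command>"; the command's own output is discarded.
time_step() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo "$((end - start))s: $*"
}

# Example (as root), one setting at a time:
# time_step smartctl -s on /dev/sda   # enable SMART on the drive
# time_step smartctl -S on /dev/sda   # attribute autosave, as in "-S on"
# time_step smartctl -o on /dev/sda   # automatic offline testing, as in "-o on"
# time_step smartctl -a /dev/sda      # print all SMART info (smartctl's -a, not identical to smartd's -a)
```

If one of these takes noticeably longer at boot than later in the session, that detail would be useful in a bug report.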
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Sat Nov 27, 2010 10:03 am

El_Presidente_Pufferfish,

The dirty hack is to start smartd in /etc/conf.d/local so you get the best of both worlds.
Meanwhile file a bug so the devs look at it.
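
A minimal sketch of what that hack could look like, assuming an OpenRC release that still reads a `local_start()` function from /etc/conf.d/local (newer releases use /etc/local.d/ instead). `SMARTD_INIT` is a hypothetical variable introduced here only so the path is easy to change.

```shell
# Fragment for /etc/conf.d/local (assumption: this OpenRC version calls
# local_start() from the "local" service, normally ordered after the
# other services in the runlevel).
# Beforehand, as root: rc-update del smartd default

# Hypothetical variable: path to the smartd init script.
SMARTD_INIT="${SMARTD_INIT:-/etc/init.d/smartd}"

local_start() {
    # Start smartd late, once the rest of the runlevel is up.
    "$SMARTD_INIT" start
    # Always report success so a smartd failure does not fail "local".
    return 0
}
```

This keeps SMART monitoring for every drive while moving the slow start out of the main boot sequence.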
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
idella4
Retired Dev


Joined: 09 Jun 2006
Posts: 1600
Location: Australia, Perth

Posted: Sat Nov 27, 2010 10:35 am

Mr puffer-fish

Neddy's tips are always reliable. If you'd like another dirty hack, incorporate hibernation (strictly speaking, s2ram suspends to RAM rather than to disk). I've just recently tried it out for the first time, never having known what it's about. Boot up, start smartd, and when you finish your session, suspend the computer, effectively avoiding reboots.

s2ram -f
Switch it back on when you're ready and pick up where you left off. I wonder if this would make Neddy shudder!!!
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54244
Location: 56N 3W

Posted: Sat Nov 27, 2010 10:44 am

idella4,

Hibernation is on my list of things to play with for my netbook.