mbar Veteran
Joined: 19 Jan 2005 Posts: 1990 Location: Poland
|
Posted: Mon Nov 14, 2016 7:55 am Post subject: Intel C602 SAS/SATA controller causes kernel panic |
|
|
Last week I upgraded my home server to a Supermicro X9DRL-iF board with a Xeon E5-2670 processor.
My problems started when I tried to use the additional SATA controller (ports 7 to 10 on the board) provided by the C602 chipset:
C602 chipset 4-Port SATA Storage Control Unit
https://cateee.net/lkddb/web-lkddb/SCSI_ISCI.html
SATA ports 1 to 6 are "internal" and are working OK.
My problems are similar to the ones in this topic: https://forums.gentoo.org/viewtopic-t-958726-start-0.html
At first the drives weren't detected at all. I found that I have to enable the CONFIG_SCSI_SAS_ATA config option, and when I did that, boom: kernel panic, NULL pointer dereference (gentoo hardened 4.8.7, same on 4.7.x):
https://goo.gl/photos/SgN9reyPw6jjm8LS8
The attached drives are Samsung SATA drives, 3 Gbps, 1.5 TB. The C602 should support them.
Does anybody have any experience with this chipset? |
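For anyone comparing their own kernel against this setup, here is a minimal sketch of checking a kernel .config for the two options involved. The option names come from the post and the lkddb link above; the helper name config_state and the sample text are only illustrative, not taken from the poster's system.

```python
import re

def config_state(config_text, option):
    """Return the value of a CONFIG_ option, or 'n' if it is unset or absent."""
    # Set options look like "CONFIG_FOO=y"; unset ones appear as the
    # comment "# CONFIG_FOO is not set", which this pattern does not match.
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    if match:
        return match.group(1)
    return "n"

# Hypothetical .config fragment, used only to exercise the helper.
sample = """\
CONFIG_SCSI_ISCI=m
# CONFIG_SCSI_SAS_ATA is not set
"""

for opt in ("CONFIG_SCSI_ISCI", "CONFIG_SCSI_SAS_ATA"):
    print(opt, config_state(sample, opt))
```

On a real system the text would come from the .config in the kernel source tree, or from /proc/config.gz if the running kernel exposes it.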
|
|
|
Roman_Gruber Advocate
Joined: 03 Oct 2006 Posts: 3846 Location: Austro Bavaria
|
Posted: Mon Nov 14, 2016 12:16 pm Post subject: |
|
|
You may check bugs.kernel.org (or whatever it's called) and report it there, please.
My only other idea: check the wiring of the box, replug everything, and try a live medium to see whether it happens there too. |
|
|
|
mbar Veteran
Joined: 19 Jan 2005 Posts: 1990 Location: Poland
|
Posted: Mon Nov 14, 2016 6:40 pm Post subject: |
|
|
Thanks for the kernel bugzilla hint.
I found an old issue there for the C602 chipset: https://bugzilla.kernel.org/show_bug.cgi?id=60644
Quote: | I ran more detailed tests this weekend.
ASPM & MSI disabled = stable machine under zfs load
ASPM disabled / MSI enabled = stable machine under zfs load
ASPM enabled / MSI disabled = unstable, lost an HBA under zfs load
Hardware:
Supermicro X8DTH-iF, BIOS 2.1b (current)
2x Xeon X5670, 48GB DDR3 1333Mhz Reg/ECC
3x LSI 9207-8i, phase 18 firmware
36x Seagate ST32000444SS
It appears to be ASPM and vulnerability to issue may vary by chipset. |
I remember that I enabled quite aggressive power management in the BIOS and also many PM options in the kernel. Worth investigating. |
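If ASPM is the suspect, one way to see what the running kernel is doing is to read the pcie_aspm policy from sysfs, where the active policy is the entry in square brackets. A rough sketch, assuming the kernel was built with ASPM support so that the file exists; the function name is made up for illustration.

```python
from pathlib import Path

# sysfs file listing the ASPM policies; the active one is shown in [brackets].
ASPM_POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")

def active_aspm_policy(policy_text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    for word in policy_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None

try:
    print("active ASPM policy:", active_aspm_policy(ASPM_POLICY_PATH.read_text()))
except OSError:
    # File missing or unreadable, e.g. a kernel built without ASPM support.
    print("could not read", ASPM_POLICY_PATH)
```

To rule ASPM out completely for a test boot, the pcie_aspm=off kernel parameter can be added to the command line, in addition to relaxing the power management settings in the BIOS.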
|
|
|
mbar Veteran
Joined: 19 Jan 2005 Posts: 1990 Location: Poland
|
Posted: Tue Nov 29, 2016 6:53 am Post subject: |
|
|
I just found out that I enabled this option:
Code: | CONFIG_SCSI_MQ_DEFAULT:

This option enables the new blk-mq based I/O path for SCSI
devices by default. With the option the scsi_mod.use_blk_mq
module/boot option defaults to Y, without it to N, but it can
still be overridden either way.

If unsure say N.

Symbol: SCSI_MQ_DEFAULT [=y]
Type : boolean
Prompt: SCSI: use blk-mq I/O path by default
Location:
-> Device Drivers
-> SCSI device support
Defined at drivers/scsi/Kconfig:49
Depends on: SCSI [=y] |
And the kernel panic clearly references blk-mq code. I'll try to disable it and test again. Maybe the bug is somewhere there. |
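As the help text says, the scsi_mod.use_blk_mq boot option overrides the compiled-in default, so blk-mq can be switched off for a test without rebuilding the kernel. Below is a small sketch of how that override resolves for a given kernel command line. The function name and the example command lines are illustrative assumptions; only the option name comes from the Kconfig help above.

```python
def scsi_blk_mq_enabled(cmdline, kconfig_default):
    """Resolve scsi_mod.use_blk_mq from a kernel command line string.

    kconfig_default is True when CONFIG_SCSI_MQ_DEFAULT=y, False otherwise.
    """
    enabled = kconfig_default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "scsi_mod.use_blk_mq" and sep:
            # Kernel bool parameters accept y/Y/1 and n/N/0.
            if value[:1] in ("y", "Y", "1"):
                enabled = True
            elif value[:1] in ("n", "N", "0"):
                enabled = False
    return enabled

# Hypothetical command lines, with CONFIG_SCSI_MQ_DEFAULT=y as in the post.
print(scsi_blk_mq_enabled("root=/dev/sda1 ro", True))
print(scsi_blk_mq_enabled("root=/dev/sda1 ro scsi_mod.use_blk_mq=0", True))
```

The command line of the running kernel can be read from /proc/cmdline and passed to the same function.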
|
|
|
mbar Veteran
Joined: 19 Jan 2005 Posts: 1990 Location: Poland
|
Posted: Wed Nov 30, 2016 4:43 pm Post subject: |
|
|
It kinda works after disabling blk-mq code. Disks are accessible and working, but I have this in dmesg:
Code: | [ 52.904287] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 52.929201] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 52.929205] sas: ata13: end_device-0:2: cmd error handler
[ 52.929262] sas: ata11: end_device-0:0: dev error handler
[ 52.929275] sas: ata12: end_device-0:1: dev error handler
[ 52.929280] sas: ata13: end_device-0:2: dev error handler
[ 52.936164] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 172.948132] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 172.948150] sas: ata13: end_device-0:2: cmd error handler
[ 172.948212] sas: ata11: end_device-0:0: dev error handler
[ 172.948230] sas: ata12: end_device-0:1: dev error handler
[ 172.948236] sas: ata13: end_device-0:2: dev error handler
[ 172.955224] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 172.976115] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 172.976121] sas: ata13: end_device-0:2: cmd error handler
[ 172.976189] sas: ata11: end_device-0:0: dev error handler
[ 172.976203] sas: ata12: end_device-0:1: dev error handler
[ 172.976208] sas: ata13: end_device-0:2: dev error handler
[ 172.983135] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 173.004110] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 173.004115] sas: ata13: end_device-0:2: cmd error handler
[ 173.004172] sas: ata11: end_device-0:0: dev error handler
[ 173.004200] sas: ata12: end_device-0:1: dev error handler
[ 173.004205] sas: ata13: end_device-0:2: dev error handler
[ 173.011160] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 173.040125] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 173.040130] sas: ata13: end_device-0:2: cmd error handler
[ 173.040221] sas: ata11: end_device-0:0: dev error handler
[ 173.040235] sas: ata12: end_device-0:1: dev error handler
[ 173.040240] sas: ata13: end_device-0:2: dev error handler
[ 173.047200] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 173.068107] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 173.068112] sas: ata13: end_device-0:2: cmd error handler
[ 173.068168] sas: ata11: end_device-0:0: dev error handler
[ 173.068194] sas: ata12: end_device-0:1: dev error handler
[ 173.068199] sas: ata13: end_device-0:2: dev error handler
[ 173.075148] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[ 173.096140] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 173.096145] sas: ata13: end_device-0:2: cmd error handler
[ 173.096199] sas: ata11: end_device-0:0: dev error handler
[ 173.096215] sas: ata12: end_device-0:1: dev error handler
[ 173.096220] sas: ata13: end_device-0:2: dev error handler
[ 173.103144] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
|
I also have a PCIe SiI 3132 Serial ATA Raid II Controller that works reliably, so I may just as well use that. |
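The dmesg output above is repetitive, so a per-port summary may be easier to read. This is a rough sketch that counts "cmd error handler" and "dev error handler" lines per ata port. The regular expression is written against the log lines shown in this post only, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as: "sas: ata13: end_device-0:2: cmd error handler"
HANDLER_RE = re.compile(r"sas: (ata\d+): (end_device-\S+): (cmd|dev) error handler")

def count_error_handlers(log_text):
    """Count error handler invocations per (port, kind) pair."""
    counts = Counter()
    for match in HANDLER_RE.finditer(log_text):
        port, _device, kind = match.groups()
        counts[(port, kind)] += 1
    return counts

# One recovery cycle copied from the dmesg output in the post.
sample = """\
[ 52.929201] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[ 52.929205] sas: ata13: end_device-0:2: cmd error handler
[ 52.929262] sas: ata11: end_device-0:0: dev error handler
[ 52.929275] sas: ata12: end_device-0:1: dev error handler
[ 52.929280] sas: ata13: end_device-0:2: dev error handler
[ 52.936164] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
"""

for (port, kind), n in sorted(count_error_handlers(sample).items()):
    print(port, kind, n)
```

In the log as posted, only ata13 (end_device-0:2) ever shows a "cmd error handler" line, which may point at that particular drive, cable, or port rather than at all three disks.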
|
|
|
|