Gentoo Forums
Kernel upgrade breaks my Intel C606 SAS
Zolcos
n00b

Joined: 28 Oct 2012
Posts: 29

Posted: Wed Jun 19, 2013 11:52 pm    Post subject: Kernel upgrade breaks my Intel C606 SAS

I tried upgrading my kernel from 3.5.4-r1 to 3.8.6. The new kernel boots OK, except that the drives on my Intel C606 SAS controller are no longer recognized. The controller itself does seem to be recognized, as I see its name flash by on boot and I get the right number of /dev/sas_host* devices.
In configuring the new kernel I made sure to use the same firmware-related settings. The old kernel loads the driver as a module and the firmware file is located at /lib/firmware/isci/isci_firmware.bin. I can still boot the old kernel, and this process works.
I also tried something different: making the C606 driver built-in and telling the kernel to include the firmware blob. (This is what I wanted to do anyway, but the old kernel would panic on boot when I tried it.) This had the same result as the module approach: the kernel boots successfully, the C606 is recognized, but the drives do not appear.
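
One way to double-check the "same settings" claim is to diff only the relevant options between the two configs. This is a minimal sketch, assuming both .config files are still around. The function names are made up for illustration, and the option list is a guess at what matters for isci and firmware loading (CONFIG_FIRMWARE_IN_KERNEL only exists in older kernels).

```shell
# Print the isci- and firmware-related options from a kernel .config file.
# Matches both "CONFIG_FOO=..." and "# CONFIG_FOO is not set" lines.
isci_config_subset() {
    grep -E '^(# )?CONFIG_(SCSI_ISCI|SCSI_SAS_LIBSAS|FW_LOADER|FIRMWARE_IN_KERNEL|EXTRA_FIRMWARE|EXTRA_FIRMWARE_DIR)[= ]' "$1" | sort
}

# Show only the selected options that differ between two configs.
# Returns diff's exit status: 0 if identical, 1 if they differ.
isci_config_diff() {
    old=$(mktemp) && new=$(mktemp) || return 1
    isci_config_subset "$1" > "$old"
    isci_config_subset "$2" > "$new"
    diff "$old" "$new"
    status=$?
    rm -f "$old" "$new"
    return $status
}
```

Usage would be something like `isci_config_diff /path/to/old/.config /path/to/new/.config` (both paths are placeholders). No output means the selected options are identical in both kernels.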

I should note there weren't any errors during the kernel build aside from a lot of "section mismatch" warnings, but I always see those.
Is there a way I can check at runtime whether the firmware file is being loaded successfully by the kernel? It would at least rule out the firmware-related configuration as the culprit here... I'm only so fixated on that because it was the source of a similar problem when I initially installed Gentoo on this machine.
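
A rough way to check at runtime is to filter the kernel log for anything that mentions the driver or firmware. This is a minimal sketch. The function name is made up, and because the wording of the driver's messages varies by kernel version, the filter only matches on the two keywords.

```shell
# Keep only kernel log lines that mention the isci driver or firmware.
# Reads log text on stdin, so it works on live dmesg output or a saved log.
isci_log_lines() {
    grep -i -E 'isci|firmware'
}
```

Run it as `dmesg | isci_log_lines` on both kernels and compare the results. `ls -l /lib/firmware/isci/isci_firmware.bin` confirms the file is where the module-based setup expects it.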

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Thu Jun 20, 2013 6:53 pm

Did you try enabling SCSI_LOGGING? You need /proc filesystem
and sysctl support, too, to enable it.

There is a script for configuring it in the sg3_utils package. It ends
up as "/usr/sbin/scsi_logging_level". There is a man page for the script,
and being a shell script, you can read the script itself.

[edit:] Since this needs to be enabled after boot by writing to a file
in /proc, it may not enhance the information that you get in dmesg
for scsi device initialization.[/edit]

That might provide a little more information in dmesg [not; see edit above]
when scsi devices are initialized by the kernel. Messages after boot probably
end up in /var/log/ somewhere. (I install sysklogd and have klogd log kernel
messages directly to a file in /var/log/, rather than going through syslog,
so I always know where to look for kernel messages. Requires some
/etc/syslog.conf reconfiguration from defaults.)
_________________
TIA

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Thu Jun 20, 2013 7:11 pm

PS: If you can figure out what all the script does, you may be able to hack
around in /usr/src/linux/drivers/scsi/scsi_logging.h to enable it at boot,
but I did not immediately see what to change looking at it.

(It would be useful if one could pass a kernel parameter via grub's kernel
command line to enable and configure it. But I did not see an option for that
documented anywhere.)
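
For what it is worth, mainline's Documentation/kernel-parameters.txt does list a scsi_logging_level= parameter that takes the same bit mask as /proc/sys/dev/scsi/logging_level. The sketch below computes that mask from the 3-bit fields defined in scsi_logging.h. The function names are made up for illustration, and whether a given kernel needs the scsi_mod. prefix on the command line is an assumption to verify.

```shell
# Bit offsets of the 3-bit logging fields, as defined in
# drivers/scsi/scsi_logging.h (each facility takes a level from 0 to 7).
scsi_log_shift() {
    case "$1" in
        error)      echo 0 ;;
        timeout)    echo 3 ;;
        scan)       echo 6 ;;
        mlqueue)    echo 9 ;;
        mlcomplete) echo 12 ;;
        llqueue)    echo 15 ;;
        llcomplete) echo 18 ;;
        hlqueue)    echo 21 ;;
        hlcomplete) echo 24 ;;
        ioctl)      echo 27 ;;
        *)          return 1 ;;
    esac
}

# Build a mask from "facility=level" arguments, e.g. "scan=7 error=3".
scsi_log_mask() {
    mask=0
    for pair in "$@"; do
        shift_by=$(scsi_log_shift "${pair%%=*}") || return 1
        level=${pair#*=}
        mask=$(( mask | ((level & 7) << shift_by) ))
    done
    echo "$mask"
}
```

`scsi_log_mask scan=7` prints 448, so maximum scan logging from the start of boot would be `scsi_mod.scsi_logging_level=448` on grub's kernel line. After boot the equivalent is `echo 448 > /proc/sys/dev/scsi/logging_level`.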
_________________
TIA

Zolcos
n00b

Joined: 28 Oct 2012
Posts: 29

Posted: Sun Jun 23, 2013 4:14 am

SCSI_LOGGING is enabled and I have that script but getting the logging to happen when the devices are first being initialized seems like a challenge.
By the way, I should mention that my system uses mdev because of that udev update that broke a lot of Gentoo boxes (I have a separate /var). I'm not sure whether it makes a difference here.

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Mon Jun 24, 2013 2:08 pm

I have no idea if mdev would be responsible. I got eudev working
alright. It supports having /var and /usr on separate filesystems
(like udev before the big change). Works more or less the same
as <= udev-181. The one difference is that you need an empty
/etc/udev/rules.d/80-net-name-slot.rules file to prevent eudev
from following udev's "persistent network names" behavior
(prevent it from renaming eth0 and/or wlan0 to something else
at boot).

(I left out the "legacy-libudev" USE flag for eudev. Nothing broke.)
_________________
TIA

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Mon Jun 24, 2013 2:32 pm

PS:
When sg3_utils was installed, emerge also pulled in one of its dependencies,
rescan-scsi-bus, which is actually a shell script in /usr/sbin
(rescan-scsi-bus.sh). Maybe scsi_logging_level in combination with that
would get you some useful info. I did not see a man page for it,
but it has help in the script, i.e.
Code:

rescan-scsi-bus -h


(The one thing I did not see was what the syntax for a "Host"
should be, ie "--hosts=[LIST]". LIST of what? Scsi hosts, we
know that, but specified how exactly? A pci bus address?
Maybe the way they are listed in dmesg when encountered
by the kernel at boot? 0-1-2-3? There are some symbolic links
in /sys/bus/scsi/devices/, maybe you can try those for the names
of "hosts" for rescan_scsi_bus.)

Anyway, if you figure out how to specify a host to it, it can search
for luns, targets, etc. And if it is not finding any, SCSI_LOGGING
can probably report the details if it has been enabled with
scsi_logging_level.
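
On the "LIST of what?" question: assuming LIST is a comma-separated list of host numbers and ranges (for example --hosts=0,1; check the -h output of your version), the numbers would be the N in /sys/class/scsi_host/hostN. Here is a small sketch that maps those numbers to driver names. The function name and the optional directory argument are made up for illustration.

```shell
# List SCSI hosts as "<host number> <driver name>", one per line.
# The optional argument overrides the sysfs directory (handy for testing).
list_scsi_hosts() {
    root=${1:-/sys/class/scsi_host}
    for dir in "$root"/host*; do
        # Skip entries without a readable proc_name (or an unmatched glob).
        [ -r "$dir/proc_name" ] || continue
        name=${dir##*/}
        echo "${name#host} $(cat "$dir/proc_name")"
    done | sort -n
}
```

Running `list_scsi_hosts` under both kernels shows which host numbers belong to the C606 driver, and those numbers can then be passed to the rescan script.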
_________________
TIA

Zolcos
n00b

Joined: 28 Oct 2012
Posts: 29

Posted: Thu Jul 04, 2013 4:54 am

I had to install rescan-scsi-bus separately, and here's what I got:
With the old (working) kernel: http://pastebin.com/qE0JyYWT
With the new kernel: http://pastebin.com/W3Bt39AC
Looks like it finds the isci device (C606) alright, but it doesn't see the four drives attached to it when using the new kernel.

I was able to configure my logger to put kernel messages outside /var so I can get them when /var isn't working. I'm working on getting it to catch something interesting from scsi_logging.

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Fri Jul 05, 2013 3:07 pm

There is the command line option
Code:

--nooptscan

but according to the embedded help it only applies to a LUN
search ("optimize scan" is apparently an early out if LUN 0 is
not found; --nooptscan would instruct it to keep going in that
case). I do not know if it applies to targets as well as LUNs.

Anyway, you can experiment with command line options that
do not look risky (that only modify what information is searched
for and where it looks for it exactly), like
Code:

--channels=0,1


I looked at the script; it seems to mostly use information in /sys/
and /proc/, so it relies on the kernel to find the hardware at
boot and query it for configuration information.

I find the fact that the old kernel output reports the onboard SAS
hosts as iscsi and the new one reports them as ahci suspicious.
(Is the kernel using the correct driver for these devices?)
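
One way to answer that is to ask sysfs which driver is bound to the controller's PCI device. This is a minimal sketch. The function name is made up, and the PCI address in the comment is a placeholder, so substitute the controller's real address as shown by lspci.

```shell
# Print the kernel driver bound to a PCI device, given its sysfs directory
# (for example /sys/bus/pci/devices/0000:05:00.0; that address is made up).
pci_driver_of() {
    [ -L "$1/driver" ] || { echo "(no driver bound)"; return 1; }
    basename "$(readlink "$1/driver")"
}
```

`lspci -k` gives the same information as a "Kernel driver in use" line for each device, if pciutils is installed.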

There is a linux-scsi mailing list with a lot of nuts-and-bolts discussion
of kernel scsi code:
http://vger.kernel.org/vger-lists.html#linux-scsi
_________________
TIA


Last edited by wcg on Sat Jul 06, 2013 2:25 am; edited 1 time in total

Zolcos
n00b

Joined: 28 Oct 2012
Posts: 29

Posted: Sat Jul 06, 2013 2:07 am

wcg wrote:
I find the fact that the old kernel output reports the onboard SAS
hosts as iscsi and the new one reports them as ahci suspicious.

I think they are still being detected correctly, just in a different order: you can see the ocz-vertex drives end up on hosts 2 and 3 in the new kernel instead of 0 and 1 as in the old kernel.

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Sat Jul 06, 2013 2:41 am

[edit:] Oh, I see what you are saying, the iscsi devices are
hosts 0 and 1 when the new kernel boots. So, if it is using
the correct device driver for those SAS host interfaces,
I have no idea why the kernel would fail to detect the
connected drives at boot. And both kernels detect the
same drives connected to the mpt2sas controller, so
the problem seems to be specific to the driver for the
C606 SAS controller rather than to all SAS controllers.
I still think you need to consult the mailing list for
bug/patch reports.[/edit]

[deleted blather that reflected not consulting the two pastebins
for clarification.]

Anyway, you can search the mailing list for posts
with keywords "C606 ISCSI missing drives". Someone else may
have already reported your problem and newer kernels
may already have a boot probe device detection patch for it.
_________________
TIA

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Sat Jul 06, 2013 4:31 am

PS:
I hesitate to mention this, it is becoming such a cliche these
days of aggressive power saving invading every aspect of
the operation of our hardware, but I wonder if on-board
acpi code is spinning down the drives before the kernel can
detect them. It would be a bizarre kind of error that boot
code could easily work around if it knew that could happen.

Something to keep in mind if you don't come across any
other reports of the same problem on the same hardware.
(Not very likely. A change in the kernel like this usually arrives
with more than one report on various mailing lists.)
_________________
TIA

wcg
Guru

Joined: 06 Jan 2009
Posts: 588

Posted: Sat Jul 06, 2013 1:05 pm

[edit:]After some www browsing and rechecking /usr/src/linux/.config, I see
that the C600 SAS driver is actually "ISCI" rather than "ISCSI". That makes
searching for bug reports a little more effective, since one can exclude
"iscsi" hits.[/edit]

[obsolete comments]
When searching for relevant bug reports, one needs to look
closely to be sure the report is relevant to the ISCSI driver
("Intel SCSI" for the intel SAS controllers) and not "iSCSI",
an "scsi over networks" protocol that predates the intel
hardware driver. ( http://linux-iscsi.org/wiki/Main_Page )

(I did not see anything relevant in the linux-acpi bugzilla
or mailing list.)
[/obs]

[edit:]
This is a very new driver, in terms of mainline kernel integration.
Intel had an internal version sometime before, but the source
in the public kernels was first introduced into 3.0-rc6, with a
big set of patches merged in 3.7-rc6. I would have expected
it to work better in a 3.8.x kernel than a 3.5.x kernel, but
your results demonstrate that would be overly optimistic.

Even using the correct search term ("isci"), I still do not find
anything relevant to that SAS driver in the linux-acpi bugzilla
and mailing list. That does not necessarily mean that acpi is
not the real problem, only that no one else has identified it as
such. ( https://01.org/linux-acpi )

With luck, SCSI_LOGGING turns up some useful information.
[/edit]
_________________
TIA
All times are GMT