Zolcos n00b
Joined: 28 Oct 2012 Posts: 39
|
Posted: Wed Jun 19, 2013 11:52 pm Post subject: Kernel upgrade breaks my Intel C606 SAS |
|
|
I tried upgrading my kernel from 3.5.4-r1 to 3.8.6. The new kernel can boot OK except the drives on my Intel C606 SAS controller are no longer recognized. The controller itself does seem to be recognized as I see its name flash by on boot and I get the right number of /dev/sas_host* devices.
In configuring the new kernel I made sure to use the same settings in regard to firmware. The old kernel loads the driver as a module and the firmware file is located at /lib/firmware/isci/isci_firmware.bin. I can still boot the old kernel and this process works.
I also tried something different -- making the c606 driver built-in and telling it to include the firmware blob in the kernel. (This is what I wanted to do anyway, but the old kernel would panic on boot when I tried this.) This time, it had the same result as the module approach -- the kernel boots successfully, the c606 is recognized, but the drives do not appear.
I should note there weren't any errors on the kernel make aside from a lot of "section mismatch" but I always see that message.
Is there a way I can check at runtime if the firmware file is being loaded successfully by the kernel? It would at least rule out the firmware related configuration as the culprit here... I'm only so fixated on that because it was the source of a similar problem when I initially installed Gentoo on this machine. |
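One way to approach that runtime check is to filter the kernel log for lines that mention the driver or the firmware file. The sketch below is only a rough illustration: the keyword list is an assumption (the exact wording of the driver's messages varies between kernel versions), and `firmware_related_lines` and `read_kernel_log` are hypothetical helper names, not part of any package.

```python
import subprocess

# Assumed keywords; adjust to whatever your driver actually prints.
KEYWORDS = ("firmware", "isci")


def firmware_related_lines(log_text, keywords=KEYWORDS):
    """Return kernel log lines containing any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        line
        for line in log_text.splitlines()
        if any(k in line.lower() for k in wanted)
    ]


def read_kernel_log():
    """Read the kernel ring buffer via dmesg (may need root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `firmware_related_lines(read_kernel_log())` under both the old and the new kernel and comparing the two results should show whether the firmware-related messages differ between them.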
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Thu Jun 20, 2013 6:53 pm Post subject: |
|
|
Did you try enabling SCSI_LOGGING? You need /proc filesystem
and sysctl support, too, to enable it.
There is a script for configuring it in the sg3_utils package. It ends
up as "/usr/sbin/scsi_logging_level". There is a man page for the script,
and being a shell script, you can read the script itself.
[edit:] Since this needs to be enabled after boot by writing to a file
in /proc, it may not enhance the information that you get in dmesg
for scsi device initialization.[/edit]
That might provide a little more information in dmesg [not; see edit above]
when scsi devices are initialized by the kernel. Messages after boot probably
end up in /var/log/ somewhere. (I install sysklogd and have klogd log kernel
messages directly to a file in /var/log/, rather than going through syslog,
so I always know where to look for kernel messages. Requires some
/etc/syslog.conf reconfiguration from defaults.) _________________ TIA |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Thu Jun 20, 2013 7:11 pm Post subject: |
|
|
PS: If you can figure out what all the script does, you may be able to hack
around in /usr/src/linux/drivers/scsi/scsi_logging.h to enable it at boot,
but I did not immediately see what to change looking at it.
(It would be useful if one could pass a kernel parameter via grub's kernel
command line to enable and configure it. But I did not see an option for that
documented anywhere.) _________________ TIA |
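For reference, the value written to /proc/sys/dev/scsi/logging_level is a bitmask that packs one 3-bit level (0 to 7) per logging facility, at the shifts defined in scsi_logging.h. The sketch below computes such a mask. The shift table is an assumption taken from that header and should be checked against your own kernel source, and `logging_mask` is a hypothetical helper name.

```python
# Bit offsets per facility, as defined in drivers/scsi/scsi_logging.h
# (assumed unchanged in your kernel; verify against your source tree).
SHIFTS = {
    "error": 0,
    "timeout": 3,
    "scan": 6,
    "mlqueue": 9,
    "mlcomplete": 12,
    "llqueue": 15,
    "llcomplete": 18,
    "hlqueue": 21,
    "hlcomplete": 24,
    "ioctl": 27,
}


def logging_mask(**levels):
    """Combine per-facility levels (0-7) into one logging_level bitmask."""
    mask = 0
    for facility, level in levels.items():
        if facility not in SHIFTS:
            raise ValueError(f"unknown facility: {facility}")
        if not 0 <= level <= 7:
            raise ValueError(f"level out of range for {facility}: {level}")
        mask |= level << SHIFTS[facility]
    return mask
```

For example, `logging_mask(scan=7)` gives 448, the value for maximum scan logging only. The kernel's Documentation/kernel-parameters.txt also appears to describe a scsi_logging_level= boot parameter that takes the same bitmask. With the SCSI core built in, that would presumably be passed as scsi_mod.scsi_logging_level=448, though that spelling is worth confirming against your kernel version.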
|
Zolcos n00b
Joined: 28 Oct 2012 Posts: 39
|
Posted: Sun Jun 23, 2013 4:14 am Post subject: |
|
|
SCSI_LOGGING is enabled and I have that script but getting the logging to happen when the devices are first being initialized seems like a challenge.
btw I should mention that my system uses mdev because of that udev update that broke a lot of Gentoo boxes (I have a separate /var). Not sure if it makes a difference here. |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Mon Jun 24, 2013 2:08 pm Post subject: |
|
|
I have no idea if mdev would be responsible. I got eudev working
alright. It supports having /var and /usr on separate filesystems
(like udev before the big change). Works more or less the same
as <= udev-181. The one difference is that you need an empty
/etc/udev/rules.d/80-net-name-slot.rules file to prevent eudev
from following udev's "persistent network names" behavior
(prevent it from renaming eth0 and/or wlan0 to something else
at boot).
(I left out the "legacy-libudev" USE flag for eudev. Nothing broke.) _________________ TIA |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Mon Jun 24, 2013 2:32 pm Post subject: |
|
|
PS:
When sg3_utils was installed, emerge also installed a dependency of it and
rescan_scsi_bus, which is actually a shell script in /usr/sbin
(rescan_scsi_bus.sh). Maybe scsi_logging_level in combination with that
would get you some useful info. I did not see a man page for it,
but it has help in the script, ie
(The one thing I did not see was what the syntax for a "Host"
should be, ie "--hosts=[LIST]". LIST of what? Scsi hosts, we
know that, but specified how exactly? A pci bus address?
Maybe the way they are listed in dmesg when encountered
by the kernel at boot? 0-1-2-3? There are some symbolic links
in /sys/bus/scsi/devices/, maybe you can try those for the names
of "hosts" for rescan_scsi_bus.)
Anyway, if you figure out how to specify a host to it, it can search
for luns, targets, etc. And if it is not finding any, SCSI_LOGGING
can probably report the details if it has been enabled with
scsi_logging_level. _________________ TIA |
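As a rough illustration of where host names come from: each SCSI host the kernel registers shows up as /sys/class/scsi_host/hostN, and the `proc_name` attribute there normally names the driver behind it. The sketch below lists them. `list_scsi_hosts` is a hypothetical helper, and the idea that rescan_scsi_bus wants the bare host numbers (the N in hostN) in its --hosts list is an assumption to be checked against the script's embedded help.

```python
import os
import re


def list_scsi_hosts(sysfs_root="/sys/class/scsi_host"):
    """Return sorted (host_number, driver_name) pairs found under sysfs_root."""
    hosts = []
    if not os.path.isdir(sysfs_root):
        return hosts
    for entry in os.listdir(sysfs_root):
        match = re.fullmatch(r"host(\d+)", entry)
        if not match:
            continue
        driver = "unknown"
        try:
            with open(os.path.join(sysfs_root, entry, "proc_name")) as fh:
                driver = fh.read().strip() or "unknown"
        except OSError:
            pass  # attribute missing or unreadable; keep "unknown"
        hosts.append((int(match.group(1)), driver))
    return sorted(hosts)
```

Printing the result under each kernel makes it easy to see which host numbers belong to the C606 and which to the other controllers, even when the numbering changes between boots.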
|
Zolcos n00b
Joined: 28 Oct 2012 Posts: 39
|
Posted: Thu Jul 04, 2013 4:54 am Post subject: |
|
|
I had to install rescan-scsi-bus separately, and here's what I got:
With the old (working) kernel: http://pastebin.com/qE0JyYWT
With the new kernel: http://pastebin.com/W3Bt39AC
Looks like it finds the isci device (c606) alright, but doesn't see the four drives that are on it when using the new kernel.
I was able to configure my logger to put kernel messages outside /var so I can get them when /var isn't working. I'm working on getting it to catch something interesting from scsi_logging. |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Fri Jul 05, 2013 3:07 pm Post subject: |
|
|
There is the command line option
but according to the embedded help it only applies to a LUN
search ("optimize scan" is apparently an early out if LUN 0 is
not found; --nooptscan would instruct it to keep going in that
case). I do not know if it applies to targets as well as LUNs.
Anyway, you can experiment with command line options that
do not look risky (that only modify what information is searched
for and where it looks for it exactly), like
I looked at the script, it seems to mostly use information in /sys/
and /proc/, so it is relying on the kernel to find the hardware at
boot and query it for configuration information.
I find the fact that the old kernel output reports the onboard SAS
hosts as iscsi and the new one reports them as ahci suspicious.
(Is the kernel using the correct driver for these devices?)
There is a linux-scsi mailing list with a lot of nuts-and-bolts discussion
of kernel scsi code:
http://vger.kernel.org/vger-lists.html#linux-scsi _________________ TIA
Last edited by wcg on Sat Jul 06, 2013 2:25 am; edited 1 time in total |
|
Zolcos n00b
Joined: 28 Oct 2012 Posts: 39
|
Posted: Sat Jul 06, 2013 2:07 am Post subject: |
|
|
wcg wrote: | I find the fact that the old kernel output reports the onboard SAS
hosts as iscsi and the new one reports them as ahci suspicious. |
I think they are still being detected correctly, just in a different order -- you can see the ocz-vertex drives end up on hosts 2 and 3 in the new kernel instead of 0 and 1 like in the old kernel |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Sat Jul 06, 2013 2:41 am Post subject: |
|
|
[edit:] Oh, I see what you are saying, the iscsi devices are
hosts 0 and 1 when the new kernel boots. So, if it is using
the correct device driver for those SAS host interfaces,
I have no idea why the kernel would fail to detect the
connected drives at boot. And both kernels detect the
same drives connected to the mpt2sas controller, so
the problem seems to be specific to the driver for the
C606 SAS controller rather than to all SAS controllers.
I still think you need to consult the mailing list for
bug/patch reports.[/edit]
[deleted blather that reflected not consulting the two pastebins
for clarification.]
Anyway, you can search the mailing list for posts
with keywords "C606 ISCSI missing drives". Someone else may
have already reported your problem and newer kernels
may already have a boot probe device detection patch for it. _________________ TIA |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Sat Jul 06, 2013 4:31 am Post subject: |
|
|
PS:
I hesitate to mention this, it is becoming such a cliche these
days of aggressive power saving invading every aspect of
the operation of our hardware, but I wonder if on-board
acpi code is spinning down the drives before the kernel can
detect them. It would be a bizarre kind of error that boot
code could easily work around if it knew that could happen.
Something to keep in mind if you don't come across any
other reports of the same problem on the same hardware.
(Not very likely. A change in the kernel like this usually arrives
with more than one report on various mailing lists.) _________________ TIA |
|
wcg Guru
Joined: 06 Jan 2009 Posts: 588
|
Posted: Sat Jul 06, 2013 1:05 pm Post subject: |
|
|
[edit:]After some www browsing and rechecking /usr/src/linux/.config, I see
that the C600 SAS driver is actually "ISCI" rather than "ISCSI". That makes
searching for bug reports a little more effective, since one can exclude
"iscsi" hits.[/edit]
[obsolete comments]
When searching for relevant bug reports, one needs to look
closely to be sure the report is relevant to the ISCSI driver
("Intel SCSI" for the intel SAS controllers) and not "iSCSI",
an "scsi over networks" protocol that predates the intel
hardware driver. ( http://linux-iscsi.org/wiki/Main_Page )
(I did not see anything relevant in the linux-acpi bugzilla
or mailing list.)
[/obs]
[edit:]
This is a very new driver, in terms of mainline kernel integration.
Intel had an internal version sometime before, but the source
in the public kernels was first introduced into 3.0-rc6, with a
big set of patches merged in 3.7-rc6. I would have expected
it to work better in a 3.8.x kernel than a 3.5.x kernel, but
your results demonstrate that would be overly optimistic.
Even using the correct search term ("isci"), I still do not find
anything relevant to that SAS driver in the linux-acpi bugzilla
and mailing list. That does not necessarily mean that acpi is
not the real problem, only that no one else has identified it as
such. ( https://01.org/linux-acpi )
With luck, SCSI_LOGGING turns up some useful information.
[/edit] _________________ TIA |
|
RAPHEAD Tux's lil' helper
Joined: 20 Jun 2003 Posts: 134 Location: Germany
|
Posted: Sun Apr 12, 2015 5:18 pm Post subject: |
|
|
Hi,
did you find out anything regarding this?
I'm also having problems with a c600-based system in combination with
an Intel RKSAS4 Upgrade Key to enable SAS and recent Linux kernels.
My problem is that I cannot make use of a big drive (4TB) while smaller SAS drives work.
Funnily enough, it works with a very old SuSE Linux Enterprise 11 SP2 with kernel 3.0.xx.
But although this kernel also has the ISCI module, it does not get loaded (lsmod does not show it).
Greets |
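A note on reading lsmod output here: lsmod only lists loadable modules that are currently loaded, so a driver compiled into the kernel never appears in it. A minimal sketch of the check, assuming the usual /proc/modules layout (module name as the first whitespace-separated field); `module_loaded` and `read_proc_modules` are hypothetical helper names:

```python
def module_loaded(name, modules_text):
    """Return True if `name` is the first field of any /proc/modules line."""
    # The kernel reports module names with underscores, so normalise dashes.
    wanted = name.replace("-", "_")
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Return the text of /proc/modules, the file lsmod itself formats."""
    with open(path) as fh:
        return fh.read()
```

If `module_loaded("isci", read_proc_modules())` is False while the drives still work, one possible explanation is that the driver is built into that kernel, or that a different driver is handling the controller. That is a guess to verify against the kernel's .config, not a diagnosis.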
|