Gentoo Forums
[SOLVED] Getting CUDA to work on Thinkpad W530
[n00b@localhost]
Apprentice


Joined: 30 Aug 2004
Posts: 266
Location: London, UK

Posted: Sat Apr 27, 2013 10:09 pm    Post subject: [SOLVED] Getting CUDA to work on Thinkpad W530

I have a Thinkpad W530 with an integrated Intel i915 GPU and a discrete nVidia Quadro K2000M GPU. I have left Optimus enabled in the BIOS (UEFI) and set up X to use the i915. I have installed the nvidia-drivers and the CUDA toolkit but when I try to run deviceQuery from the SDK I get the following:

Code:

garyslaptop ~ # /opt/cuda/sdk/1_Utilities/deviceQuery/deviceQuery
/opt/cuda/sdk/1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected


Occasionally the error is 10 (invalid device ordinal).
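
For reference, the two codes seen here can be picked out of the deviceQuery output with a small helper. This is only a sketch: describe_cuda_error is a made-up name, and it maps just the two codes quoted in this post, not the full CUDA runtime error list.

```shell
# Hypothetical helper: find the "cudaGetDeviceCount returned N" line in
# deviceQuery output (passed as $1) and describe the code.
# Only the two codes reported in this thread are mapped.
describe_cuda_error() {
    code=$(printf '%s\n' "$1" |
        sed -n 's/^cudaGetDeviceCount returned \([0-9][0-9]*\).*/\1/p' |
        head -n 1)
    case "$code" in
        38) echo "38: no CUDA-capable device is detected" ;;
        10) echo "10: invalid device ordinal" ;;
        "") echo "no cudaGetDeviceCount error found" ;;
        *)  echo "$code: not mapped in this sketch" ;;
    esac
}
```

It could be called as describe_cuda_error "$(/opt/cuda/sdk/1_Utilities/deviceQuery/deviceQuery)".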

Googling for this error turns up a million blogs where somebody says they have the same problem and it is solved by following the advice in the CUDA Getting Started Guide for Linux.

Sometimes the nvidia module is automatically loaded, sometimes not, and sometimes the device nodes are created, sometimes not. If the module is not loaded then loading it manually sometimes gives an error (dmesg says the card is not supported by the driver version but according to the website and driver README it is). If the device nodes are not there, sometimes mknod fails (i.e. returns 0 but the device node is still missing). In fact, the only consistent behaviour is that it doesn't work!

Code:

garyslaptop ~ # modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': No such device
garyslaptop ~ # lsmod | grep nvidia
nvidia               9149524  1
garyslaptop ~ # dmesg | tail
[ 3155.427927] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 3155.428048] NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:0ffb)
NVRM: installed in this system is not supported by the 313.30
NVRM: NVIDIA Linux driver release.  Please see 'Appendix
NVRM: A - Supported NVIDIA GPU Products' in this release's
NVRM: README, available on the Linux driver download page
NVRM: at www.nvidia.com.
[ 3155.428070] nvidia: probe of 0000:01:00.0 failed with error -1
[ 3155.428117] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 3155.428120] NVRM: None of the NVIDIA graphics adapters were initialized!
garyslaptop ~ # mknod -m 660 /dev/nvidia0 c 195 0
garyslaptop ~ # echo $?
0
garyslaptop ~ # ls /dev/nvidia0
ls: cannot access /dev/nvidia0: No such file or directory
garyslaptop ~ # mknod -m 660 /dev/nvidiactl c 195 255
garyslaptop ~ # echo $?
0
garyslaptop ~ # ls /dev/nvidiactl
ls: cannot access /dev/nvidiactl: No such file or directory
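
Since the state changes from one attempt to the next, the checks from the session above can be bundled so they are easy to repeat. A rough sketch, assuming the module is called nvidia and the nodes are /dev/nvidia0 and /dev/nvidiactl as above; the function names are invented for illustration.

```shell
# Report whether a kernel module appears in a /proc/modules-style listing.
# $1 = module name, $2 = listing file (defaults to /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Report whether a path exists and is a character device node.
is_char_device() {
    [ -c "$1" ]
}

# Print one status line per check. $1 optionally overrides the module
# listing; the node paths are the ones from the session above.
nvidia_status() {
    if module_loaded nvidia "${1:-/proc/modules}"; then
        echo "module: loaded"
    else
        echo "module: missing"
    fi
    for node in /dev/nvidia0 /dev/nvidiactl; do
        if is_char_device "$node"; then
            echo "$node: present"
        else
            echo "$node: missing"
        fi
    done
}
```

Running nvidia_status before and after each modprobe or mknod shows which of the three pieces is missing at that moment. It only reports state and does not try to fix anything.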


Has anyone been able to get CUDA working on this laptop? I bought it specifically for doing CUDA development after my T61 died.

I have tried nvidia-drivers versions 304.88, 313.30 and 319.12 with nvidia-cuda-sdk-5.0.35-r1 and nvidia-cuda-toolkit-5.0.35-r4, and all give the same results. I have also tried disabling the integrated GPU in the BIOS, after which the laptop does not boot (i.e. GRUB2 freezes).


Last edited by [n00b@localhost] on Fri May 03, 2013 10:52 am; edited 1 time in total
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Sun Apr 28, 2013 8:33 pm

http://www.nvidia.com/object/linux-display-amd64-310.44-driver.html

seems to be the latest driver for that card
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D
[n00b@localhost]
Apprentice


Joined: 30 Aug 2004
Posts: 266
Location: London, UK

Posted: Mon Apr 29, 2013 5:18 pm

I've tried 310.44 too with the same results. Support for the K2000M was first added in 304.22.
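
To rule out the driver version as the cause, here is a quick comparison of every version tried in this thread against 304.22. It is a sketch that assumes GNU sort with -V (version sort) is available; version_ge is just an illustrative name.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort); an assumption about the userland.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Versions tried in this thread, checked against the first release
# said above to support the K2000M.
for v in 304.88 310.44 313.30 319.12; do
    if version_ge "$v" 304.22; then
        echo "$v: at or after 304.22"
    else
        echo "$v: older than 304.22"
    fi
done
```

All four come out at or after 304.22, so the "not supported" message in dmesg is unlikely to be a plain version mismatch.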

I've since found out more about the problem. It seems that on boot the BIOS keeps the nVidia card turned off until something uses it (this is the point of Optimus: only power up the nVidia GPU when needed, to save power). I have installed bumblebee to turn the card on and off manually, but after booting it only lets me turn the card on once. When I then try to use it in any way (deviceQuery, nvidia-smi, optirun) the BIOS turns it off again (with the syslog error "GPU has fallen off the bus") and won't let anything turn it back on until a reboot.
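
A small sketch for keeping an eye on this between attempts. It assumes Bumblebee is switching the card through the bbswitch module, whose status file is /proc/acpi/bbswitch, and that the kernel messages end up in a readable log file. Both are assumptions about the setup, and the function names are made up.

```shell
# Count "fallen off the bus" messages in a log file ($1).
# Matches on the tail of the message because the exact prefix may vary.
count_bus_drops() {
    grep -c 'has fallen off the bus' "$1"
}

# Print the power state (ON or OFF) from a bbswitch-style status file.
# Assumption: the file holds one line such as "0000:01:00.0 ON".
# $1 optionally overrides the default /proc/acpi/bbswitch path.
card_state() {
    awk '{ print $2 }' "${1:-/proc/acpi/bbswitch}"
}
```

For example, card_state before and after running deviceQuery, then count_bus_drops /var/log/messages (the log path is an assumption) to see whether a new error was logged.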
kernelOfTruth
Watchman


Joined: 20 Dec 2005
Posts: 6111
Location: Vienna, Austria; Germany; hello world :)

Posted: Mon Apr 29, 2013 8:50 pm

I first wanted to suggest a BIOS update (http://support.lenovo.com/de_DE/research/hints-or-tips/detail.page?&DocID=HT073837) but since none of the changes listed there relate to your problem

it would probably be superfluous (and possibly even dangerous)


I have yet to try bumblebee & the discrete graphics under Linux - currently I have it completely turned off (T530)

does plugging in the power source make a difference? is disabling the intel graphics driver entirely an option?


does the following help?

https://forums.gentoo.org/viewtopic-p-7118702.html

http://nvnews.net/vbulletin/showthread.php?p=2576162#post2576162


additional links, references:

http://www.cyberciti.biz/faq/debian-ubuntu-rhel-fedora-linux-nvidia-nvrm-gpu-fallen-off-bus/ <-- so in case your problem persists, file a bug-report at nvnews.net

https://github.com/Bumblebee-Project/Bumblebee/issues/192

https://github.com/Bumblebee-Project/Bumblebee/issues/207
[n00b@localhost]
Apprentice


Joined: 30 Aug 2004
Posts: 266
Location: London, UK

Posted: Fri May 03, 2013 10:51 am

kernelOfTruth wrote:
I first wanted to suggest a BIOS update (http://support.lenovo.com/de_DE/research/hints-or-tips/detail.page?&DocID=HT073837) but since none of the changes listed there relate to your problem

it would probably be superfluous (and possibly even dangerous)


Since this laptop is only a fortnight old, I'm assuming it had the latest BIOS updates when it left the factory.

kernelOfTruth wrote:
I have yet to try bumblebee & the discrete graphics under Linux - currently I have it completely turned off (T530)

does plugging in the power source make a difference? is disabling the intel graphics driver entirely an option?


Despite the laptop only being a fortnight old, I have already broken the charging circuit, so I can no longer run it off the battery. There is a BIOS option to run the laptop with "Integrated graphics only", "Discrete graphics only" or "nVidia Optimus". With "Integrated graphics only" it works OK, but obviously I don't have access to my nVidia card (it doesn't even show up in lspci). "Discrete graphics only" is strange: the Intel GPU no longer shows up in lspci, so the display must be driven by the nVidia card, but that doesn't stop the card from turning off! I also have to disable KMS to get it past GRUB. It then boots successfully, but the screen doesn't redraw and I have to ssh in to reboot. KMS works fine with the Intel card.
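
To compare what each BIOS graphics mode exposes, the lspci output can be filtered down to the graphics devices. This is only a sketch; list_gpus and count_gpus are made-up names, and it assumes lspci labels GPUs as either "VGA compatible controller" or "3D controller".

```shell
# Read lspci output on stdin and print only the graphics devices.
# Assumption: GPUs are labelled "VGA compatible controller" or "3D controller".
list_gpus() {
    grep -E 'VGA compatible controller|3D controller'
}

# Read lspci output on stdin and count graphics devices whose line
# mentions the vendor string in $1 (case-insensitive).
count_gpus() {
    list_gpus | grep -ci "$1"
}
```

For example, lspci | count_gpus nvidia and lspci | count_gpus intel, run once under each BIOS setting.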

kernelOfTruth wrote:
does the following help?

https://forums.gentoo.org/viewtopic-p-7118702.html

http://nvnews.net/vbulletin/showthread.php?p=2576162#post2576162


additional links, references:

http://www.cyberciti.biz/faq/debian-ubuntu-rhel-fedora-linux-nvidia-nvrm-gpu-fallen-off-bus/ <-- so in case your problem persists, file a bug-report at nvnews.net

https://github.com/Bumblebee-Project/Bumblebee/issues/192

https://github.com/Bumblebee-Project/Bumblebee/issues/207


My previous googling turned up most of those pages except the first f.g.o and nvnews.net links. My situation is slightly different to those as I didn't have a problem with interrupts, although I did have to add "nomodeset" to the kernel command line to get SysRescCD to boot.
However, enabling the four kernel options listed there (NO_HZ, RCU_FAST_NO_HZ, CALGARY_IOMMU and CALGARY_IOMMU_ENABLED_BY_DEFAULT) stops the "GPU has fallen off the bus" errors and stops the card turning off when I try to use it. I'm not entirely convinced the NO_HZ and RCU_FAST_NO_HZ options are needed, though, as they sound like fixes for the interrupt problems I never had, so I'll try a few kernels with and without them enabled.

I've just run deviceQuery from the CUDA SDK and my nVidia card is listed. The card also didn't turn itself off after using it. I've not tried getting bumblebee or optirun to work yet.

Thanks for all your help!
[n00b@localhost]
Apprentice


Joined: 30 Aug 2004
Posts: 266
Location: London, UK

Posted: Fri May 03, 2013 12:50 pm

So - I just tried a couple of kernels with CONFIG_NO_HZ and CONFIG_RCU_FAST_NO_HZ disabled and everything broke again.

Seems like those options are needed after all.
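
For anyone landing here with the same laptop, here is a quick way to confirm that the four options named earlier in the thread are set in a given kernel config. It is a sketch only: check_kernel_opts is a made-up helper name, and the config path in the usage line is an assumption about where the kernel sources live.

```shell
# Print the state of each kernel option in a kernel .config-style file ($1).
# The options are the four named earlier in this thread.
check_kernel_opts() {
    for opt in NO_HZ RCU_FAST_NO_HZ CALGARY_IOMMU CALGARY_IOMMU_ENABLED_BY_DEFAULT; do
        # Match the exact "CONFIG_<name>=y" line so similarly named
        # options do not count.
        if grep -q "^CONFIG_${opt}=y$" "$1"; then
            echo "CONFIG_${opt}: enabled"
        else
            echo "CONFIG_${opt}: not enabled"
        fi
    done
}
```

Run it as check_kernel_opts /usr/src/linux/.config (assumed path); on a working kernel all four lines should read "enabled".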
All times are GMT