Madhax n00b
Joined: 24 Aug 2018 Posts: 2
|
Posted: Fri Aug 24, 2018 9:42 am Post subject: Trouble getting OpenCL to work w/ nvidia driver (gtx680) |
|
|
My problem is that I can't get any CUDA/OpenCL application to run; clinfo reports no platforms.
Code: | anigma /etc/OpenCL/vendors # clinfo
Number of platforms 0 |
I've had OpenCL/CUDA working before, but I think that was with the binary drivers straight from Nvidia's website. I have since wiped all the old nvidia and OpenCL libraries because they caused a conflict during a sync update; I can't remember the details, as it was a long time ago. So I thought I'd let Portage deal with the headache of making sure everything works together.
profile
default/linux/amd64/17.0/desktop/gnome/systemd (stable) *
I have virtual/opencl, nvidia-drivers and nvidia-cuda-toolkit installed.
Output of eselect opencl list:
Code: | Available OpenCL implementations:
[1] nvidia * |
Code: | anigma /etc/OpenCL/vendors # cat nvidia.icd
libnvidia-opencl.so.1 |
That library exists in both /usr/lib32/ and /usr/lib64/ and is symlinked to the proper nvidia driver library. I've also tried giving the .icd file an absolute path.
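For anyone repeating this check, here is a rough sketch of how the ICD lookup described above could be scripted. check_icds is a made-up helper name, not something shipped by any package, and it only tells you whether the library named in each .icd file can be located, not whether the driver actually initialises.

```shell
# check_icds is a hypothetical helper (not part of any package).
# It lists each .icd file in a vendors directory together with the
# library it names, and reports whether that library can be located.
check_icds() {
    vendors_dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$vendors_dir"/*.icd; do
        [ -e "$icd" ] || continue      # directory contains no .icd files
        lib=$(head -n 1 "$icd")
        case "$lib" in
            /*) # absolute path: test the file directly
                if [ -e "$lib" ]; then status=found; else status=missing; fi ;;
            *)  # bare soname: look it up in the dynamic linker cache
                if ldconfig -p 2>/dev/null | grep -qF "$lib"; then
                    status=found
                else
                    status=missing
                fi ;;
        esac
        echo "$(basename "$icd"): $lib ($status)"
    done
}
```

Calling `check_icds` with no argument looks in /etc/OpenCL/vendors. A "found" result with clinfo still reporting zero platforms, as in this thread, suggests the problem is in the driver or kernel rather than in the ICD setup.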
Code: | anigma /etc/OpenCL/vendors # lsmod | grep nvidia
nvidia_uvm 689727 0
nvidia_drm 28576 7
nvidia_modeset 1050533 15 nvidia_drm
nvidia 13510111 612 nvidia_modeset,nvidia_uvm |
I thought it could be a multilib conflict, so I set -abi_x86_32 for opencl and nvidia-drivers, but that didn't help either.
Here's my complete make.conf. I may have added or removed global flags between package installations, but I've re-run emerge with --changed-use.
Code: | # These settings were set by the catalyst build script that automatically
# built this stage.
# Please consult /usr/share/portage/config/make.conf.example for a more
# detailed example.
CFLAGS="-O2 -pipe"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j8"
ABI_X86="32 64"
LDLFLAGS="-Wl, -O1 -Wl, --as-needed -Wl, --sort-common"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult http://www.gentoo.org/doc/en/change-chost.xml before changing.
CHOST="x86_64-pc-linux-gnu"
# These are the USE flags that were used in addition to what is provided by the
# profile used for building.
USE="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 apache2 php -nss xvmc gnutls dri hal dbus cdrtools -bindist mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3 apng jpeg gif tiff png svg flac mp3 ogg pulseaudio webp xvid x264 vaapi glamor icu corefonts truetype icu minizip opengl xcb libinput gnome systemd gpg ssl aes ogg X zlib gtk3 alsa mpeg curl pam nvidia cuda vdpau matroska -wxwidgets postscript x264 upnp threads -wayland"
CURL_SSL="openssl"
APACHE2_MPMS="event"
PHP_TARGETS="php7-1"
ACCEPT_LICENSE="*"
PORTAGE_NICENESS="5"
EMERGE_DEFAULT_OPTS="--quiet-build=y --keep-going --jobs=8 --load-average=9"
INPUT_DEVICES="libinput"
ACCEPT_KEYWORDS="amd64"
PORTDIR="/usr/portage"
DISTDIR="${PORTDIR}/distfiles"
PKGDIR="${PORTDIR}/packages"
PORTAGE_ELOG_CLASSES="log warn error"
PORTAGE_ELOG_SYSTEM="save"
VIDEO_CARDS="nvidia"
|
I would appreciate some guidance on what the problem could be. There's no other issue; for some reason CUDA just can't be used at the moment. The only significant changes between when it last worked and now are a kernel update and an emerge sync. I'm not sure exactly when it stopped working, as I don't usually need OpenCL. At this point I'm tempted to do an empty-tree emerge of nvidia-cuda-toolkit to see what happens.
Thank you!
Edit:
Some more possibly relevant info
Code: | anigma /etc/OpenCL/vendors # eix nvidia-drivers
[?] x11-drivers/nvidia-drivers
Available versions: 304.137(0/304)^md 340.106(0/340)^md 375.82(0/375)^md 378.13-r1(0/378)^md 381.22-r1(0/381)^md 384.111(0/384)^md 387.34(0/387)^md 390.42(0/390)^md {+X acpi compat +driver gtk3 +kms multilib pax_kernel static-libs +tools uvm wayland ABI_MIPS="n32 n64 o32" ABI_PPC="32 64" ABI_S390="32 64" ABI_X86="32 64 x32" KERNEL="FreeBSD linux"}
Installed versions: 396.45(0/396)^md(05:47:04 AM 08/24/2018)(X acpi driver gtk3 kms multilib static-libs tools uvm -compat -pax_kernel -wayland ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" KERNEL="linux -FreeBSD")
Homepage: http://www.nvidia.com/ http://www.nvidia.com/Download/Find.aspx
Description: NVIDIA Accelerated Graphics Driver
anigma /etc/OpenCL/vendors # eix virtual/opencl
[?] virtual/opencl
Available versions: 0-r4 ~0-r5 {ABI_MIPS="n32 n64 o32" ABI_PPC="32 64" ABI_S390="32 64" ABI_X86="32 64 x32" VIDEO_CARDS="amdgpu i965 nvidia"}
Installed versions: 0-r5(05:47:15 AM 08/24/2018)(ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" VIDEO_CARDS="nvidia -amdgpu -i965")
Description: Virtual for OpenCL implementations
anigma /etc/OpenCL/vendors # eix media-libs/mesa
[?] media-libs/mesa
Available versions: 17.1.10^d 17.2.8^d ~17.3.6^d ~17.3.7^d ~18.0.0_rc4^d ~18.0.0_rc5^d **9999^d {bindist +classic d3d9 debug +dri3 +egl +gallium +gbm gles1 gles2 +llvm +nptl opencl openmax osmesa pax_kernel pic selinux unwind vaapi valgrind vdpau vulkan wayland xa xvmc ABI_MIPS="n32 n64 o32" ABI_PPC="32 64" ABI_S390="32 64" ABI_X86="32 64 x32" VIDEO_CARDS="freedreno i915 i965 imx intel nouveau r100 r200 r300 r600 radeon radeonsi vc4 virgl vivante vmware"}
Installed versions: 17.3.9^d(02:10:39 AM 08/24/2018)(classic dri3 egl gallium gbm gles2 llvm nptl vaapi vdpau xvmc -bindist -d3d9 -debug -gles1 -opencl -openmax -osmesa -pax_kernel -pic -selinux -unwind -valgrind -vulkan -wayland -xa ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" VIDEO_CARDS="-freedreno -i915 -i965 -imx -intel -nouveau -r100 -r200 -r300 -r600 -radeon -radeonsi -vc4 -virgl -vivante -vmware")
Homepage: https://www.mesa3d.org/ https://mesa.freedesktop.org/
Description: OpenGL-like graphic library for Linux
|
Cheers |
|
Madhax n00b
Joined: 24 Aug 2018 Posts: 2
|
Posted: Sun Aug 26, 2018 3:54 am Post subject: |
|
|
So I rebuilt the kernel and rebuilt the CUDA toolkit with an empty tree, and scoured forum topics to make sure my configs were right.
It turns out it was a bug in the nvidia driver, starting with 396.45 (which is marked stable in Portage, btw).
Relevant forum topic
https://devtalk.nvidia.com/default/topic/1037521/cuda-broken-in-396-24-02-and-396-24-10-vulkan-beta-drivers-on-linux/
So I just masked the latest not-so-"stable" release
Hope this saves someone else some time
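In case it helps, a minimal sketch of the masking step. add_mask is a hypothetical helper, and both the mask file name and the exact atom are my assumptions based on the installed version shown above (396.45); adjust them to your setup.

```shell
# add_mask is a hypothetical helper: append a package atom to a mask file
# unless that exact line is already present.
add_mask() {
    mask_file="$1"
    atom="$2"
    mkdir -p "$(dirname "$mask_file")"
    if ! grep -qxF -e "$atom" "$mask_file" 2>/dev/null; then
        printf '%s\n' "$atom" >> "$mask_file"
    fi
}
```

As root, something like `add_mask /etc/portage/package.mask/nvidia-drivers '=x11-drivers/nvidia-drivers-396.45'` followed by re-emerging nvidia-drivers should pull in the previous version. If /etc/portage/package.mask is a plain file on your system rather than a directory, pass that file's path instead.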
edit:
I unmasked the latest stable driver and enabled kernel support for "Numa Memory Allocation and Scheduler Support":
Code: | CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_NUMA_EMU=y
CONFIG_USE_PERCPU_NUMA_NODE_ID=y
CONFIG_ACPI_NUMA=y |
That fixed the problem too. |
|
luciano Tux's lil' helper
Joined: 18 Nov 2004 Posts: 132
|
Posted: Mon Apr 13, 2020 10:03 am Post subject: |
|
|
This was really helpful, thanks.
As per your link to the Nvidia forum, the only options actually needed are CONFIG_CPUSETS and CONFIG_NUMA.
Also, you must enable the uvm USE flag for nvidia-drivers. |
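A quick way to check a kernel config for those options, as a sketch only. missing_kernel_opts is a made-up helper name, and I'm assuming the second option above refers to the kernel's CONFIG_CPUSETS symbol.

```shell
# missing_kernel_opts is a hypothetical helper: print every option from the
# argument list that is not set to "y" in the given kernel config file.
missing_kernel_opts() {
    config_file="$1"
    shift
    for opt in "$@"; do
        grep -q "^${opt}=y\$" "$config_file" || echo "$opt"
    done
}
```

For example, `missing_kernel_opts /usr/src/linux/.config CONFIG_NUMA CONFIG_CPUSETS` prints nothing when both are built in. To check the running kernel instead, first extract its config with `zcat /proc/config.gz > /tmp/running.config` (only available if the kernel was built with CONFIG_IKCONFIG_PROC) and pass that file.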
|