Goverp Advocate
Joined: 07 Mar 2007 Posts: 2003
|
Posted: Thu Oct 07, 2021 3:49 pm Post subject: [SOLVED] AMD ROCM installed but clinfo -l says "0" |
|
|
I should know better at my age.
I wanted some shiny stuff, so I thought I'd get OpenCL working via ROCm. It used to work, as far as it went, under mesa: Code: |
ryzen ~ # clinfo
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 21.1.7
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Clover
Number of devices 1
Device Name Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 21.1.7
Device Numeric Version 0x401000 (1.1.0)
Driver Version 21.1.7
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Max compute units 32
Max clock frequency 1280MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple (kernel) 64
Preferred / native vector sizes
blah blah blah |
So I followed the instructions on the wiki: updated my kernel, stuffed the necessary bunch of entries into accept_keywords, installed ROCm, and disabled mesa's opencl.
Code: | dev-libs/rocclr
dev-libs/rocm-comgr
dev-libs/rocm-device-libs
dev-libs/rocm-opencl-runtime
dev-libs/rocr-runtime
dev-libs/roct-thunk-interface
dev-util/clinfo
dev-util/rocm-cmake
dev-util/rocminfo
sys-devel/llvm-roc |
But:
Code: | ryzen ~ # clinfo
Number of platforms 0 |
and "emerge rocminfo" dies a nasty death:
Code: | FAILED: rocminfo
: && /usr/bin/x86_64-pc-linux-gnu-g++ -std=c++11 -fexceptions -fno-rtti -fno-math-errno -fno-threadsafe-statics -fmerge-all-constants -fms-extensions -Werror -Wall -m64 -msse -msse2 -ggdb -O0 -Wl,-O1 -Wl,--as-needed CMakeFiles/rocminfo.dir/rocminfo.cc.o -o rocminfo /usr/lib64/libhsa-runtime64.so.1.3.0 && :
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: warning: libhsakmt.so.1, needed by /usr/lib64/libhsa-runtime64.so.1.3.0, not found (try using -rpath or -rpath-link)
..
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib64/libhsa-runtime64.so.1.3.0: undefined reference to `hsaKmtDestroyEvent@HSAKMT_1'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
|
Thanks for any suggestions as to what's missing, or should I revert to mesa? _________________ Greybeard
Last edited by Goverp on Thu Oct 14, 2021 2:30 pm; edited 1 time in total |
|
roccobaroccoSC n00b
Joined: 15 May 2020 Posts: 27
|
Posted: Thu Oct 07, 2021 10:04 pm Post subject: |
|
|
Hi, I have rocminfo working on a Threadripper 1 and Radeon RX 580. We can compare configs.
Here is some of my config:
Code: | % equery list 'roc*'
* Searching for roc* ...
[IP-] [ ] dev-libs/rocclr-4.3.0:0/4.3
[IP-] [ ] dev-libs/rocm-comgr-4.3.0:0/4.3
[IP-] [ ] dev-libs/rocm-device-libs-4.3.0:0/4.3
[IP-] [ ] dev-libs/rocm-opencl-runtime-4.3.0:0/4.3
[IP-] [ ] dev-libs/rocr-runtime-4.3.0:0/4.3
[IP-] [ ] dev-libs/roct-thunk-interface-4.3.0:0/4.3
[IP-] [ ] dev-util/rocm-cmake-4.3.0:0/4.3
[IP-] [ ] dev-util/rocminfo-4.3.0:0/4.3
% equery list 'clinfo*'
* Searching for clinfo* ...
[IP-] [ ] dev-util/clinfo-3.0.21.02.21:0
% equery list 'llvm-roc*'
* Searching for llvm-roc* ...
[IP-] [ ] sys-devel/llvm-roc-4.3.0-r1:0
% equery list 'mesa'
* Searching for mesa ...
[IP-] [ ] media-libs/mesa-21.1.7:0
% cat /etc/portage/package.use/mesa
# required by mesa[opencl,lm-sensors,vulkan,xa,xvmc,vaapi] (argument)
>=media-libs/mesa-19.1.7 vaapi vulkan lm-sensors opencl
|
Code: | % clinfo
KFD does not support xnack mode query.
ROCr must assume xnack is disabled.
LoadLib(libhsa-amd-aqlprofile64.so) failed: libhsa-amd-aqlprofile64.so: cannot open shared object file: No such file or directory
Number of platforms 2
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 21.1.7
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP.dbg (3305.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Name Clover
Number of devices 1
Device Name Radeon RX 580 Series (POLARIS10, DRM 3.40.0, 5.10.61-gentoo-mbax, LLVM 12.0.1)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 21.1.7
Device Numeric Version 0x401000 (1.1.0)
Driver Version 21.1.7
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Max compute units 36
Max clock frequency 1430MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple (kernel) 64
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 8589934592 (8GiB)
Error Correction support No
Max memory allocation 6871947673 (6.4GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 32768 bits (4096 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 32768 (32KiB)
Max number of constant args 16
Max constant buffer size 67108864 (64MiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
ILs with version (n/a)
Built-in kernels with version (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning
Device Extensions with Version cl_khr_byte_addressable_store 0x400000 (1.0.0)
cl_khr_global_int32_base_atomics 0x400000 (1.0.0)
cl_khr_global_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_local_int32_base_atomics 0x400000 (1.0.0)
cl_khr_local_int32_extended_atomics 0x400000 (1.0.0)
cl_khr_int64_base_atomics 0x400000 (1.0.0)
cl_khr_int64_extended_atomics 0x400000 (1.0.0)
cl_khr_fp64 0x400000 (1.0.0)
cl_khr_extended_versioning 0x400000 (1.0.0)
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx803
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 1.2
Driver Version 3305.0 (HSA1.1,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
Device PCI-e ID (AMD) 0x67df
Device Topology (AMD) PCI-E, 0000:42:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 36
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 1430MHz
Graphics IP (AMD) 8.0
Device Partition (core)
Max number of sub-devices 36
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
[1] 11883 segmentation fault clinfo
|
|
|
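For comparing configs like this, the `equery list` output above can be checked against the list of expected packages from the first post. The following is only a minimal sketch: the helper names `installed_packages` and `missing_packages` are made up for illustration, and the name/version split is a simple heuristic (it cuts at the first hyphen followed by a digit), so it will mis-split packages whose names contain such a sequence.

```python
import re
from typing import Iterable, List, Set

# Matches lines such as "[IP-] [ ] dev-libs/rocclr-4.3.0:0/4.3" and captures
# the category/name part in front of the version (heuristic, see above).
_EQUERY_LINE = re.compile(r"\]\s+(\S+?)-\d[^:\s]*(?::\S+)?\s*$")


def installed_packages(equery_output: str) -> Set[str]:
    """Collect category/name atoms from `equery list` style output."""
    names = set()
    for line in equery_output.splitlines():
        match = _EQUERY_LINE.search(line)
        if match:
            names.add(match.group(1))
    return names


def missing_packages(expected: Iterable[str], equery_output: str) -> List[str]:
    """Return the expected atoms that do not appear in the equery output."""
    return sorted(set(expected) - installed_packages(equery_output))


if __name__ == "__main__":
    sample = (
        " * Searching for roc* ...\n"
        "[IP-] [ ] dev-libs/rocclr-4.3.0:0/4.3\n"
        "[IP-] [ ] dev-libs/rocr-runtime-4.3.0:0/4.3\n"
    )
    expected = [
        "dev-libs/rocclr",
        "dev-libs/rocr-runtime",
        "dev-libs/roct-thunk-interface",
    ]
    print(missing_packages(expected, sample))
```

Feeding it the real output of `equery list 'roc*'` on both machines gives a quick diff of what one box has and the other lacks.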
larrys Tux's lil' helper
Joined: 20 Jul 2020 Posts: 81 Location: New Jersey
|
Posted: Fri Oct 08, 2021 12:06 pm Post subject: |
|
|
Goverp,
It looks like the file is provided by dev-libs/roct-thunk-interface. Maybe there was a problem with the build for that? Code: | $ e-file libhsakmt.so.1
* dev-libs/roct-thunk-interface
Available Versions: 2.9.0-r1 3.0.0 3.1.0 3.3.0 3.5.0 3.6.0 3.7.0 3.8.0 3.9.0 3.10.0 4.0.0 4.1.0 4.2.0 4.3.0
Homepage: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
Description: Radeon Open Compute Thunk Interface
Matched Files: /usr/lib64/libhsakmt.so.1;
* dev-db/myodbc
Available Versions: 8.0.22
Homepage: https://dev.mysql.com/downloads/connector/odbc/
Description: ODBC driver for MySQL
Matched Files: /usr/lib64/myodbc-8.0/private/libhsakmt.so.1; /usr/lib/myodbc-8.0/private/libhsakmt.so.1;
|
|
|
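The linker warning in the first post already names the missing library, so it can be pulled out of a build log mechanically before asking e-file who owns it. A minimal sketch, assuming GNU ld's "X, needed by Y, not found" wording as quoted in this thread; `missing_libs` is a hypothetical helper name, not part of any tool.

```python
import re
from typing import List, Tuple

# Matches GNU ld warnings of the form quoted in the first post:
#   ld: warning: libfoo.so.1, needed by /path/libbar.so, not found (...)
_LD_MISSING = re.compile(r"warning: (\S+), needed by (\S+), not found")


def missing_libs(ld_output: str) -> List[Tuple[str, str]]:
    """Return (missing_library, needed_by) pairs found in linker output."""
    return _LD_MISSING.findall(ld_output)


if __name__ == "__main__":
    sample = (
        "/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: "
        "warning: libhsakmt.so.1, needed by /usr/lib64/libhsa-runtime64.so.1.3.0, "
        "not found (try using -rpath or -rpath-link)\n"
    )
    for lib, needed_by in missing_libs(sample):
        print(f"{lib} is missing (needed by {needed_by})")
```

Each library name it reports can then be passed to `e-file`, as in the post above, to find the package that should have installed it.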
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2003
|
Posted: Fri Oct 08, 2021 4:42 pm Post subject: |
|
|
larrys, roccobaroccoSC,
Thanks, progress!
I use a binpkg server, and roct-thunk-interface and rocm-cmake were not installed, because they are listed only as build-time dependencies.
It still isn't right; the "AMD Accelerated Parallel Processing" platform appears, but with 0 devices.
The only package difference I can see is my mesa omits USE=vulkan, but IIUC that's not required.
I've checked the wiki OpenCL page, and have the specified kernel config options:
Code: | Memory Management options
│ [*] Allow for memory hot-add │
│ [ ] Online the newly added memory blocks by default │
│ [*] Allow for memory hot remove
...
│ [*] Device memory (pmem, HMM, etc...) hotplug support
Graphics support
<*> AMD GPU │
│ [ ] Enable amdgpu support for SI parts │
│ [ ] Enable amdgpu support for CIK parts │
│ -*- Always enable userptr write support │
│ ACP (Audio CoProcessor) Configuration ---> │
│ Display Engine Configuration ---> │
│ [*] HSA kernel driver for AMD GPU devices │
│ [*] Enable HMM-based shared virtual memory manager |
Here's the clinfo output for the AMD bit:
Code: | Platform Name AMD Accelerated Parallel Processing
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Clover
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [MESA]
clCreateContext(NULL, ...) [default] Success [MESA]
clCreateContext(NULL, ...) [other]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Clover
Device Name Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Clover
Device Name Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Clover
Device Name Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.12
ICD loader Profile OpenCL 2.2 |
_________________ Greybeard |
|
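The "platform present, but 0 devices" state shown above is easy to miss in a long clinfo dump. The sketch below reads clinfo's human-readable output and reports the device count per platform. It is an illustration only: `devices_per_platform` is a made-up name, and it assumes a "Number of devices" line directly follows the "Platform Name" line, as in the output quoted in this thread.

```python
import re
from typing import Dict

_PLATFORM = re.compile(r"\s*Platform Name\s+(.+?)\s*$")
_DEVICES = re.compile(r"\s*Number of devices\s+(\d+)")


def devices_per_platform(clinfo_text: str) -> Dict[str, int]:
    """Map each platform name to the device count clinfo reports for it."""
    counts = {}
    last_platform = None
    for line in clinfo_text.splitlines():
        match = _PLATFORM.match(line)
        if match:
            # Remember the name; it only counts if a device total follows.
            last_platform = match.group(1)
            continue
        match = _DEVICES.match(line)
        if match and last_platform is not None:
            counts[last_platform] = int(match.group(1))
        # Any other line ends the "Platform Name / Number of devices" pair.
        last_platform = None
    return counts


if __name__ == "__main__":
    sample = (
        "Platform Name Clover\n"
        "Number of devices 1\n"
        "Device Name Radeon RX 570 Series\n"
        "Platform Name AMD Accelerated Parallel Processing\n"
        "Number of devices 0\n"
    )
    for name, count in devices_per_platform(sample).items():
        print(f"{name}: {count} device(s)")
```

A platform that maps to 0 means the ICD loaded but the runtime found no usable GPU, which points at the kernel side rather than at the OpenCL packages.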
roccobaroccoSC n00b
Joined: 15 May 2020 Posts: 27
|
Posted: Tue Oct 12, 2021 11:30 pm Post subject: |
|
|
Hi, I can report some progress on the segmentation fault. I was able to solve it.
The wiki says one has to disable the "opencl" USE flag of media-libs/mesa, but for some strange reason gives no further explanation why.
Take a look: https://wiki.gentoo.org/wiki/OpenCL#Intel_-_CPU
Quote: | For an error-free operation, it may be necessary to recompile media-libs/mesa with the -opencl USE flag |
So I removed the opencl USE flag and recompiled mesa. clinfo now does not crash anymore.
RTFM, but I don't understand the sense of it.
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2003
|
Posted: Thu Oct 14, 2021 2:29 pm Post subject: |
|
|
Solved it!
A bit of Googling showed that the syslog/dmesg entry:
Code: | kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0 |
was significant.
Apparently the two PCIe slots on my ASUS X570 PLUS Gaming motherboard are not identical; IIUC the lower one goes via the south bridge, which breaks PCIe atomics, and apparently those are crucial: without them kfd skips the card.
Originally my RX470 had been in the upper slot, but I moved it to the other because it blocked a cooling fan on the motherboard. Mistake. Moving it back, the log message is now:
Code: | kfd kfd: amdgpu: Allocated 3969056 bytes on gart |
and clinfo and rocminfo both show the sort of output I'd expect.
Now all I need do is find something to use it for! _________________ Greybeard |
|
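For anyone hitting the same symptom, the kernel log can be scanned for the "PCI rejects atomics" message quoted above. This is a minimal sketch: `kfd_rejected_devices` is a hypothetical helper, and the pattern is taken from the single log line in this thread, so the wording may differ on other kernel versions.

```python
import re
from typing import List

# Pattern based on the log line quoted in this thread:
#   kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0
_KFD_SKIPPED = re.compile(
    r"kfd kfd: amdgpu: skipped device ([0-9a-fA-F]{4}:[0-9a-fA-F]{4}), "
    r"PCI rejects atomics"
)


def kfd_rejected_devices(kernel_log: str) -> List[str]:
    """Return vendor:device IDs that kfd skipped for lacking PCIe atomics."""
    return _KFD_SKIPPED.findall(kernel_log)


if __name__ == "__main__":
    sample = (
        "kfd kfd: amdgpu: skipped device 1002:67df, "
        "PCI rejects atomics 730<0\n"
    )
    for device in kfd_rejected_devices(sample):
        print(f"{device}: skipped by kfd, try another PCIe slot")
```

The text to scan can come from `dmesg` or the syslog, whichever is readable on the system.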