Gentoo Forums
[SOLVED] AMD ROCM installed but clinfo -l says "0"
Goverp
Veteran


Joined: 07 Mar 2007
Posts: 1993

Posted: Thu Oct 07, 2021 3:49 pm    Post subject: [SOLVED] AMD ROCM installed but clinfo -l says "0"

I should know better at my age :-)

I wanted some shiny stuff, so I thought I'd get OpenCL working via ROCM. It used to work, as far as it went, under mesa:
Code:

ryzen ~ # clinfo
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 21.1.7
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 21.1.7
  Device Numeric Version                          0x401000 (1.1.0)
  Driver Version                                  21.1.7
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               32
  Max clock frequency                             1280MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple (kernel)     64
  Preferred / native vector sizes                 
blah blah blah

So I followed the instructions on the wiki: updated my kernel, stuffed the necessary bunch of entries into accept_keywords, installed rocm, and disabled mesa's opencl.
Code:
dev-libs/rocclr
dev-libs/rocm-comgr
dev-libs/rocm-device-libs
dev-libs/rocm-opencl-runtime
dev-libs/rocr-runtime
dev-libs/roct-thunk-interface
dev-util/clinfo
dev-util/rocm-cmake
dev-util/rocminfo
sys-devel/llvm-roc

But:
Code:
ryzen ~ # clinfo
Number of platforms                               0

and "emerge rocminfo" dies a nasty death:
Code:
FAILED: rocminfo
: && /usr/bin/x86_64-pc-linux-gnu-g++ -std=c++11  -fexceptions -fno-rtti -fno-math-errno -fno-threadsafe-statics -fmerge-all-constants -fms-extensions -Werror -Wall -m64 -msse -msse2 -ggdb -O0 -Wl,-O1 -Wl,--as-needed CMakeFiles/rocminfo.dir/rocminfo.cc.o -o rocminfo  /usr/lib64/libhsa-runtime64.so.1.3.0 && :
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: warning: libhsakmt.so.1, needed by /usr/lib64/libhsa-runtime64.so.1.3.0, not found (try using -rpath or -rpath-link)
..
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib64/libhsa-runtime64.so.1.3.0: undefined reference to `hsaKmtDestroyEvent@HSAKMT_1'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.


Thanks for any suggestions as to what's missing, or should I revert to mesa?
_________________
Greybeard


Last edited by Goverp on Thu Oct 14, 2021 2:30 pm; edited 1 time in total
roccobaroccoSC
n00b


Joined: 15 May 2020
Posts: 27

Posted: Thu Oct 07, 2021 10:04 pm

Hi, I have rocminfo working on a Threadripper 1 and Radeon RX 580. We can compare configs.

Here is some of my config:
Code:
% equery list 'roc*'
 * Searching for roc* ...
[IP-] [  ] dev-libs/rocclr-4.3.0:0/4.3
[IP-] [  ] dev-libs/rocm-comgr-4.3.0:0/4.3
[IP-] [  ] dev-libs/rocm-device-libs-4.3.0:0/4.3
[IP-] [  ] dev-libs/rocm-opencl-runtime-4.3.0:0/4.3
[IP-] [  ] dev-libs/rocr-runtime-4.3.0:0/4.3
[IP-] [  ] dev-libs/roct-thunk-interface-4.3.0:0/4.3
[IP-] [  ] dev-util/rocm-cmake-4.3.0:0/4.3
[IP-] [  ] dev-util/rocminfo-4.3.0:0/4.3
% equery list 'clinfo*'
 * Searching for clinfo* ...
[IP-] [  ] dev-util/clinfo-3.0.21.02.21:0
% equery list 'llvm-roc*'
 * Searching for llvm-roc* ...
[IP-] [  ] sys-devel/llvm-roc-4.3.0-r1:0
% equery list 'mesa'
 * Searching for mesa ...
[IP-] [  ] media-libs/mesa-21.1.7:0
% cat /etc/portage/package.use/mesa 
# required by mesa[opencl,lm-sensors,vulkan,xa,xvmc,vaapi] (argument)
>=media-libs/mesa-19.1.7 vaapi vulkan lm-sensors opencl


Code:
% clinfo
KFD does not support xnack mode query.
ROCr must assume xnack is disabled.
LoadLib(libhsa-amd-aqlprofile64.so) failed: libhsa-amd-aqlprofile64.so: cannot open shared object file: No such file or directory
Number of platforms                               2
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 21.1.7
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.0 AMD-APP.dbg (3305.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback
  Platform Extensions function suffix             AMD

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     Radeon RX 580 Series (POLARIS10, DRM 3.40.0, 5.10.61-gentoo-mbax, LLVM 12.0.1)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 21.1.7
  Device Numeric Version                          0x401000 (1.1.0)
  Driver Version                                  21.1.7
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               36
  Max clock frequency                             1430MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple (kernel)     64
  Preferred / native vector sizes                 
    char                                                16 / 16     
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8589934592 (8GiB)
  Error Correction support                        No
  Max memory allocation                           6871947673 (6.4GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max number of constant args                     16
  Max constant buffer size                        67108864 (64MiB)
  Max size of kernel argument                     1024
  Queue properties                               
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                         
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    ILs with version                              (n/a)
  Built-in kernels with version                   (n/a)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning
  Device Extensions with Version                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_extended_versioning                                       0x400000 (1.0.0)

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx803
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2
  Driver Version                                  3305.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0
  Device Type                                     GPU
  Device Board Name (AMD)                         Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
  Device PCI-e ID (AMD)                           0x67df
  Device Topology (AMD)                           PCI-E, 0000:42:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               36
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1430MHz
  Graphics IP (AMD)                               8.0
  Device Partition                                (core)
    Max number of sub-devices                     36
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
[1]    11883 segmentation fault  clinfo
larrys
Tux's lil' helper


Joined: 20 Jul 2020
Posts: 81
Location: New Jersey

Posted: Fri Oct 08, 2021 12:06 pm

Goverp,
It looks like the file is provided by dev-libs/roct-thunk-interface. Maybe there was a problem with the build for that?
Code:
$ e-file libhsakmt.so.1
 *  dev-libs/roct-thunk-interface
   Available Versions:   2.9.0-r1 3.0.0 3.1.0 3.3.0 3.5.0 3.6.0 3.7.0 3.8.0 3.9.0 3.10.0 4.0.0 4.1.0 4.2.0 4.3.0
   Homepage:      https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
   Description:      Radeon Open Compute Thunk Interface
   Matched Files:      /usr/lib64/libhsakmt.so.1;

 *  dev-db/myodbc
   Available Versions:   8.0.22
   Homepage:      https://dev.mysql.com/downloads/connector/odbc/
   Description:      ODBC driver for MySQL
   Matched Files:      /usr/lib64/myodbc-8.0/private/libhsakmt.so.1; /usr/lib/myodbc-8.0/private/libhsakmt.so.1;
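To confirm which shared libraries the runtime cannot find, one could filter the output of ldd. This is only a rough sketch: the function name missing_libs is made up for illustration, and it assumes ldd prints unresolved libraries in its usual "name => not found" form.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries that the dynamic linker could not resolve.
# Assumes the usual "libfoo.so.N => not found" line format.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use, with the library path taken from the linker error above:
#   ldd /usr/lib64/libhsa-runtime64.so.1.3.0 | missing_libs
```

If libhsakmt.so.1 shows up in that list, the package that e-file names above is probably not installed.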
Goverp
Veteran


Joined: 07 Mar 2007
Posts: 1993

Posted: Fri Oct 08, 2021 4:42 pm

larrys, roccobaroccoSC,

Thanks, progress!

I use a binpkg server, and roct-thunk-interface and rocm-cmake were not installed, as they are down as build-time dependencies.

It still isn't right; the "AMD Accelerated Parallel Processing" platform appears, but with 0 devices.
The only package difference I can see is my mesa omits USE=vulkan, but IIUC that's not required.

I've checked the wiki OpenCL page, and have the specified kernel config options:
Code:
Memory Management options
    [*] Allow for memory hot-add
    [ ]   Online the newly added memory blocks by default
    [*] Allow for memory hot remove
    ...
    [*] Device memory (pmem, HMM, etc...) hotplug support
Graphics support
    <*> AMD GPU
    [ ]   Enable amdgpu support for SI parts
    [ ]   Enable amdgpu support for CIK parts
    -*-   Always enable userptr write support
          ACP (Audio CoProcessor) Configuration  --->
          Display Engine Configuration  --->
    [*]   HSA kernel driver for AMD GPU devices
    [*]     Enable HMM-based shared virtual memory manager

Here's the clinfo output for the AMD bit:
Code:
Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContext(NULL, ...) [other]             
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   Radeon RX 570 Series (POLARIS10, DRM 3.42.0, 5.14.9-gentoo-clang, LLVM 12.0.1)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.12
  ICD loader Profile                              OpenCL 2.2

_________________
Greybeard
roccobaroccoSC
n00b


Joined: 15 May 2020
Posts: 27

Posted: Tue Oct 12, 2021 11:30 pm

Hi, I can report some progress on the segmentation fault. I was able to solve it.
According to the Wiki, one has to disable the "opencl" USE flag of media-libs/mesa, though it gives no explanation of why.
Take a look: https://wiki.gentoo.org/wiki/OpenCL#Intel_-_CPU
Quote:
For an error-free operation, it may be necessary to recompile media-libs/mesa with the -opencl USE flag


So I removed the opencl USE flag and recompiled mesa. clinfo now does not crash anymore.
RTFM, but I don't understand the sense of it.

:?
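For reference, a minimal sketch of flipping the flag in a package.use file like the one posted earlier in this thread. The helper name disable_mesa_opencl is hypothetical, and it only prints the edited text so the result can be reviewed before the real file is touched.

```shell
#!/bin/sh
# Hypothetical helper: print the package.use file named in $1 with the
# "opencl" flag turned into "-opencl" on media-libs/mesa lines.
# It writes to stdout and leaves the original file unchanged.
disable_mesa_opencl() {
    sed -E '/media-libs\/mesa/ s/(^|[[:space:]])opencl($|[[:space:]])/\1-opencl\2/' "$1"
}

# Typical use (review the output before replacing the real file):
#   disable_mesa_opencl /etc/portage/package.use/mesa
```

After changing the flag, mesa has to be rebuilt, for example with "emerge --ask --oneshot media-libs/mesa".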
Goverp
Veteran


Joined: 07 Mar 2007
Posts: 1993

Posted: Thu Oct 14, 2021 2:29 pm

Solved it!

A bit of Googling showed that the syslog/dmesg entry:
Code:
kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0

was significant.

Apparently the two PCIe slots on my ASUS X570 PLUS Gaming motherboard are not identical; IIUC the lower one goes via the south bridge, which breaks PCIe atomics, and those turn out to be crucial.
Originally my RX470 had been in the upper slot, but I moved it to the other because it blocked a cooling fan on the motherboard. Mistake. Moving it back, the log message is now:
Code:
kfd kfd: amdgpu: Allocated 3969056 bytes on gart

and clinfo and rocminfo both show the sort of output I'd expect.
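For anyone hitting the same symptom, here is a rough sketch of a check for this condition in the kernel log. The function name kfd_atomics_status is made up for illustration; it only pattern-matches the two log lines quoted above, and the wording may well differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: classify kfd log lines read from stdin.
# Assumes the message wording matches the two lines quoted in this post.
kfd_atomics_status() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'kfd.*skipped device.*PCI rejects atomics'; then
        echo "rejected"   # kfd skipped the GPU because of missing PCIe atomics
    elif printf '%s\n' "$log" | grep -q 'kfd.*Allocated .* bytes on gart'; then
        echo "ok"         # kfd initialised the device
    else
        echo "unknown"    # no recognisable kfd line found
    fi
}

# Typical use (needs permission to read the kernel log):
#   dmesg | kfd_atomics_status
```

As far as I know, running "lspci -vv" as root also reports an AtomicOpsCap field for devices and ports that advertise atomics, which may help in comparing slots before moving the card.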

Now all I need do is find something to use it for !
_________________
Greybeard