robinelvin wrote:BJP wrote:robinelvin wrote:I changed the debug level to 2 and now I can see the issue:
Did you get this fixed? I found in my case that I had to explicitly set CUDAARCHS="75" to support my RTX 2060 Super, as without setting CUDAARCHS the ebuild was building the library with support only for very recent cards.
I reported a bug here:
https://bugs.gentoo.org/968549
It's been closed but it still isn't working for me. I have tried with your CUDAARCHS flag and that doesn't work either. Only 0.13.0 worked for me, and now that's gone, so my setup is broken.
Funny thing for me is I think the fix put in for your bug is what broke it for me. When built with CUDAARCHS="all-major" libggml-cuda.so only had support for sm100 and sm120, leading my card to be filtered out with the same "filtering device which didn't fully initialize" message.
You can check which architectures are supported by the currently installed build by running:
Code:
% strings /usr/lib64/ollama/libggml-cuda.so | grep -E "sm_[0-9]+" | sort -u
.target sm_75
Identify the architecture of your card by running:
Code:
% nvidia-smi --query-gpu=compute_cap --format=csv,noheader
7.5
Assuming you don't have the same card as me, you will probably need a different number in the CUDAARCHS setting in make.conf. It is the compute capability with the dot removed, so 7.5 becomes CUDAARCHS="75".
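If it helps, here is a rough sketch that combines the two checks above: it takes the compute capability reported by nvidia-smi, drops the dot, and looks for the matching sm_NN target in the library. The function names cap_to_arch and has_arch are made up for this post, the library path is the one from my system, and it only looks at the first GPU, so treat it as a starting point rather than a proper tool.
Code:
```shell
#!/bin/sh
# Sketch only. cap_to_arch and has_arch are hypothetical helper names,
# and the library path below is an assumption taken from my own install.

# "7.5" -> "75": the form used in CUDAARCHS and in the sm_NN target names.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Reads `strings` output on stdin; succeeds if it contains sm_<arch>.
# The trailing group stops sm_75 from matching inside e.g. sm_750.
has_arch() {
    grep -qE "sm_$1([^0-9]|\$)"
}

lib=/usr/lib64/ollama/libggml-cuda.so

# Only run the live check when the tools and the library are present.
if command -v nvidia-smi >/dev/null 2>&1 && [ -r "$lib" ]; then
    # First GPU only; multi-GPU systems print one line per card.
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    arch=$(cap_to_arch "$cap")
    if [ -z "$arch" ]; then
        echo "could not read the compute capability from nvidia-smi"
    elif strings "$lib" | has_arch "$arch"; then
        echo "sm_$arch is present in $lib"
    else
        echo "sm_$arch is missing; consider CUDAARCHS=\"$arch\" in make.conf and a rebuild"
    fi
fi
```
If it reports the target as missing, that matches the situation I had, where the library only carried targets for newer cards.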