ahhzee n00b
Joined: 16 Jul 2021 Posts: 8
|
Posted: Fri Nov 26, 2021 1:08 am Post subject: amdgpu gpu reset - freeze/crash |
|
|
When playing games I've been having a consistent issue: my GPU resets itself at random while I play. This happens in a few games, with both native and non-native builds (played through Steam Proton).
It results in a frozen or black screen with randomly colored pixels dotting a few places.
I have no idea what causes it, but it seems to be semi-common and still present for others (https://bugzilla.kernel.org/show_bug.cgi?id=201957).
Below is from my /var/log/messages during/after the crash while playing Risk of Rain 2.
Code: |
Nov 25 18:20:15 zmaj kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Nov 25 18:20:15 zmaj kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Nov 25 18:20:15 zmaj kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1348774, emitted seq=1348776
Nov 25 18:20:15 zmaj kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Risk of Rain 2. pid 16876 thread dxvk-submit pid 16923
Nov 25 18:20:15 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset begin!
Nov 25 18:20:17 zmaj kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:956
Nov 25 18:20:17 zmaj kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:956
Nov 25 18:20:17 zmaj kernel: amdgpu 0000:28:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Nov 25 18:20:17 zmaj kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Nov 25 18:20:17 zmaj kernel: amdgpu 0000:28:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Nov 25 18:20:17 zmaj kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Nov 25 18:20:18 zmaj kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Nov 25 18:20:18 zmaj kernel: [drm] free PSP TMR buffer
Nov 25 18:20:18 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: BACO reset
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset succeeded, trying to resume
Nov 25 18:20:20 zmaj kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
Nov 25 18:20:20 zmaj kernel: [drm] VRAM is lost due to GPU reset!
Nov 25 18:20:20 zmaj kernel: [drm] PSP is resuming...
Nov 25 18:20:20 zmaj kernel: [drm] reserve 0x900000 from 0x817e400000 for PSP TMR
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: RAS: optional ras ta ucode is not available
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: RAP: optional rap ta ucode is not available
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: SMU is resuming...
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: smu driver if version = 0x00000036, smu fw if version = 0x00000037, smu fw version = 0x002a4000 (42.64.0)
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: SMU driver if version not matched
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: SMU is resumed successfully!
Nov 25 18:20:20 zmaj kernel: [drm] kiq ring mec 2 pipe 1 q 0
Nov 25 18:20:20 zmaj kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Nov 25 18:20:20 zmaj kernel: [drm] JPEG decode initialized successfully.
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 1
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 1
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 1
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow start
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow done
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: amdgpu 0000:28:00.0: amdgpu: GPU reset(2) succeeded!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
<repeat of above for nearly 200 lines>
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm] Skip scheduling IBs!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:20 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:20:35 zmaj kernel: amdgpu_cs_ioctl: 6 callbacks suppressed
Nov 25 18:20:35 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:21:03 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Nov 25 18:21:03 zmaj kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
<repeat until I use my keybind to reboot>
|
My WM is still responsive, and if I reboot or exit the WM I can startx again without trouble (sans some audio glitches). The screen won't update until X is restarted.
If there is a known cause or fix for this I'd be very happy to know, even if it requires patching or old firmware versions. |
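For anyone triaging a similar log, the two amdgpu_job_timedout lines above carry the most useful details: which ring hung, how far the emitted fence sequence was ahead of the signaled one, and which process submitted the work. Below is a minimal Python sketch for pulling those fields out of a log such as /var/log/messages. The regular expressions are assumptions based only on the message format seen in this log, and the function name summarize_timeouts is hypothetical.

```python
import re

# Assumed format of the "ring ... timeout" line, taken from the log above.
RING_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# Assumed format of the "Process information" line that follows it.
PROC_RE = re.compile(
    r"Process information: process (?P<process>.+?) pid (?P<pid>\d+) "
    r"thread (?P<thread>\S+) pid (?P<tid>\d+)"
)


def summarize_timeouts(lines):
    """Return one dict per ring timeout found in an iterable of log lines."""
    events = []
    for line in lines:
        ring = RING_RE.search(line)
        if ring:
            signaled = int(ring["signaled"])
            emitted = int(ring["emitted"])
            events.append({
                "ring": ring["ring"],
                "signaled": signaled,
                "emitted": emitted,
                # Fences emitted to the ring but never signaled back.
                "outstanding": emitted - signaled,
            })
            continue
        proc = PROC_RE.search(line)
        # Attach process details to the most recent timeout that lacks them.
        if proc and events and "process" not in events[-1]:
            events[-1].update(
                process=proc["process"],
                pid=int(proc["pid"]),
                thread=proc["thread"],
            )
    return events


if __name__ == "__main__":
    # Reading the system log usually needs root or membership in a log group.
    with open("/var/log/messages", errors="replace") as log:
        for event in summarize_timeouts(log):
            print(event)
```

Fed the two timeout lines from this log, it reports ring gfx_0.0.0 with 2 fences outstanding (emitted 1348776, signaled 1348774), submitted by the dxvk-submit thread of Risk of Rain 2.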
|
alamahant Advocate
Joined: 23 Mar 2019 Posts: 3879
|
|
ahhzee n00b
Joined: 16 Jul 2021 Posts: 8
|
Posted: Fri Nov 26, 2021 3:47 pm Post subject: |
|
|
Thank you for the reply.
I tried with version 20210208 and it still didn't work. Same crash/errors as before.
I use neither KDE nor SDDM, but rather dwm and startx.
The kernel parameters I have are Code: | root=/dev/sda2 ro amdgpu.noretry=0 |
Manually setting '/sys/class/drm/card0/device/power_dpm_force_performance_level' to high, as posted on the Manjaro forums, doesn't fix the issue. |
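For reference, the sysfs setting mentioned above can be read and changed with a few lines of Python. This is only a sketch of that one step: the path is the one quoted in the post (card0 may be a different index on other machines), the helper names read_level and set_level are hypothetical, and writing to the file on a real system requires root. The value does not persist across reboots.

```python
from pathlib import Path

# Path quoted in the post; the card index is an assumption for other systems.
DPM_PATH = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")


def read_level(path=DPM_PATH):
    """Return the current forced performance level as a stripped string."""
    return Path(path).read_text().strip()


def set_level(level, path=DPM_PATH):
    """Write a new level (for example "high") and return what reads back."""
    Path(path).write_text(level + "\n")
    return read_level(path)


if __name__ == "__main__":
    print("before:", read_level())
    print("after: ", set_level("high"))  # needs root on a real system
```

Reading the value back after writing is a quick way to confirm the driver accepted it.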
|