sdauth Guru
Joined: 19 Sep 2018 Posts: 569 Location: Ásgarðr
Posted: Fri Dec 15, 2023 7:06 pm Post subject: [SOLVED] mpv-0.37.0 crashes radeon driver (gpu reset)
Hi,
I updated mpv from 0.36 to 0.37 and it now crashes (and triggers a GPU reset) consistently when going fullscreen (tried with different files):
Code: | [ 310.629977] radeon 0000:00:01.0: ring 0 stalled for more than 10470msec
[ 310.629987] radeon 0000:00:01.0: GPU lockup (current fence id 0x00000000000003e6 last fence id 0x00000000000003f5 on ring 0)
[ 310.672137] radeon 0000:00:01.0: Saved 458 dwords of commands on ring 0.
[ 310.672208] radeon 0000:00:01.0: GPU softreset: 0x00000009
[ 310.672212] radeon 0000:00:01.0: GRBM_STATUS = 0xA04E3828
[ 310.672216] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x08800007
[ 310.672219] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
[ 310.672223] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
[ 310.672280] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
[ 310.672283] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 310.672286] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00018000
[ 310.672289] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00010002
[ 310.672292] radeon 0000:00:01.0: R_008680_CP_STAT = 0x80030243
[ 310.672295] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 310.672298] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[ 310.672301] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_ADDR 0x00000000
[ 310.672304] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000
[ 310.672307] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 310.672310] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 310.682831] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x0000DF7B
[ 310.682904] radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00010100
[ 310.684062] radeon 0000:00:01.0: GRBM_STATUS = 0x00003828
[ 310.684068] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007
[ 310.684073] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
[ 310.684077] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
[ 310.684135] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
[ 310.684138] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 310.684142] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[ 310.684150] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000
[ 310.684153] radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000
[ 310.684156] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 310.684160] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[ 310.684223] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
[ 310.707992] [drm] PCIE GART of 1024M enabled (table at 0x00000000001D6000).
[ 310.708113] radeon 0000:00:01.0: WB enabled
[ 310.708119] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00
[ 310.708499] radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
[ 310.718688] radeon 0000:00:01.0: failed VCE resume (-22).
[ 310.718693] radeon 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000020000c04
[ 310.718697] radeon 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000020000c08
[ 310.718700] radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c
[ 310.718704] radeon 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000020000c10
[ 310.737009] [drm] ring test on 0 succeeded in 2 usecs
[ 310.737019] [drm] ring test on 3 succeeded in 3 usecs
[ 310.737026] [drm] ring test on 4 succeeded in 3 usecs
[ 310.783071] [drm] ring test on 5 succeeded in 2 usecs
[ 310.803097] [drm] UVD initialized successfully.
[ 311.901908] [drm:r600_ib_test] *ERROR* radeon: fence wait timed out.
[ 311.901927] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-110).
[ 311.944524] radeon 0000:00:01.0: GPU softreset: 0x00000009
[ 311.944528] radeon 0000:00:01.0: GRBM_STATUS = 0xA04E3828
[ 311.944530] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x08800007
[ 311.944532] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
[ 311.944534] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
[ 311.944589] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
[ 311.944591] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 311.944593] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00018000
[ 311.944595] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000002
[ 311.944596] radeon 0000:00:01.0: R_008680_CP_STAT = 0x80030243
[ 311.944598] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 311.944600] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[ 311.944602] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_ADDR 0x00000000
[ 311.944604] radeon 0000:00:01.0: VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000
[ 311.944605] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 311.944607] radeon 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 311.947515] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x0000DF7B
[ 311.947568] radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00010100
[ 311.948723] radeon 0000:00:01.0: GRBM_STATUS = 0x00003828
[ 311.948725] radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007
[ 311.948727] radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
[ 311.948729] radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
[ 311.948785] radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
[ 311.948786] radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
[ 311.948788] radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000
[ 311.948790] radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000
[ 311.948791] radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000
[ 311.948793] radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
[ 311.948795] radeon 0000:00:01.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
[ 311.948860] radeon 0000:00:01.0: GPU reset succeeded, trying to resume |
If I downgrade to 0.36, the problem goes away: no more crashes.
What can I do to find the cause of this?
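One low-effort first step is to condense a kernel log like the one above into the few events that matter (lockups, soft resets, successful resets, and `*ERROR*` lines), so that different runs can be compared. The following is only a rough Python sketch: the function name `summarize` and the regular expressions are assumptions written against the radeon messages quoted in this post, and other drivers or kernel versions may word these lines differently.

```python
import re

# Patterns written against the radeon messages quoted in this thread.
# They are illustrative assumptions, not an official log format.
LOCKUP = re.compile(r"\[\s*(\d+\.\d+)\]\s+radeon \S+: GPU lockup .* on ring (\d+)\)")
SOFTRESET = re.compile(r"radeon \S+: GPU softreset:")
RESET_OK = re.compile(r"radeon \S+: GPU reset succeeded")
ERROR = re.compile(r"\*ERROR\*\s*(.*)")


def summarize(log_text):
    """Collect lockups, reset counts and *ERROR* messages from dmesg text."""
    summary = {"lockups": [], "softresets": 0, "resets_ok": 0, "errors": []}
    for line in log_text.splitlines():
        lockup = LOCKUP.search(line)
        if lockup:
            # Store (timestamp in seconds since boot, ring number).
            summary["lockups"].append((float(lockup.group(1)), int(lockup.group(2))))
        if SOFTRESET.search(line):
            summary["softresets"] += 1
        if RESET_OK.search(line):
            summary["resets_ok"] += 1
        error = ERROR.search(line)
        if error:
            summary["errors"].append(error.group(1))
    return summary


# A few lines copied from the log in this post.
SAMPLE = """\
[ 310.629977] radeon 0000:00:01.0: ring 0 stalled for more than 10470msec
[ 310.629987] radeon 0000:00:01.0: GPU lockup (current fence id 0x00000000000003e6 last fence id 0x00000000000003f5 on ring 0)
[ 310.672208] radeon 0000:00:01.0: GPU softreset: 0x00000009
[ 310.684223] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
[ 311.901908] [drm:r600_ib_test] *ERROR* radeon: fence wait timed out.
[ 311.901927] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-110).
[ 311.944524] radeon 0000:00:01.0: GPU softreset: 0x00000009
[ 311.948860] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
"""

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, value)
```

On a live system the same function could be fed the text captured from `dmesg` (which may require root) right after reproducing the crash.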
mpv config :
Code: | vo=gpu-next
profile=gpu-hq |
kernel : 6.1.67
edit : On another machine with a different GPU (Intel), 0.37.0 works perfectly (windowed & fullscreen) with the same kernel and the same mpv config, except profile=gpu instead of profile=gpu-hq.
Thanks.
Last edited by sdauth on Sat Dec 16, 2023 10:37 pm; edited 1 time in total |
sdauth Guru
Joined: 19 Sep 2018 Posts: 569 Location: Ásgarðr
Posted: Fri Dec 15, 2023 8:21 pm
After some trial and error with the mpv config, it turns out that the profile=gpu-hq settings changed in mpv 0.37.0 (https://github.com/mpv-player/mpv/issues/12564).
The new gpu-hq profile settings are apparently much more resource intensive.
Nonetheless, it doesn't seem right that this triggers a GPU reset... If you have any more information, please share.
Anyway, commenting out "profile=gpu-hq" in mpv.conf solves the issue (mpv then uses its default settings instead).
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
Posted: Fri Dec 15, 2023 8:38 pm
any difference between versions "mpv --show-profile=gpu-hq"
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland
sdauth Guru
Joined: 19 Sep 2018 Posts: 569 Location: Ásgarðr
Posted: Fri Dec 15, 2023 9:40 pm
Anon-E-moose wrote: | any difference between versions "mpv --show-profile=gpu-hq" |
0.37.0 :
Code: | Profile gpu-hq:
profile=high-quality
scale=ewa_lanczossharp
hdr-peak-percentile=99.995
hdr-contrast-recovery=0.30
deband=yes |
0.36.0 :
Code: | Profile gpu-hq:
scale=spline36
cscale=spline36
dscale=mitchell
dither-depth=auto
hdr-contrast-recovery=0.30
correct-downscaling=yes
linear-downscaling=yes
sigmoid-upscaling=yes
deband=yes |
iirc, scale=ewa_lanczossharp is quite heavy and my GPU is probably just too old to handle it, so this is likely what was causing the issue. Once again, though, it shouldn't cause a GPU reset. I'll open a bug on GitHub later.
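To see at a glance what differs between the two listings, they can be compared mechanically. The following is a minimal Python sketch, not an mpv feature: the helper names `parse_profile` and `diff_profiles` are made up for illustration, and the two strings are simply the outputs pasted above. It only compares the text as printed, so an option missing from the 0.37.0 listing is not necessarily unset in practice.

```python
def parse_profile(text):
    """Turn 'key=value' lines of a profile listing into a dict.

    Lines without '=' (such as the 'Profile gpu-hq:' header) are skipped.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options


def diff_profiles(old, new):
    """Return (added, removed, changed) option names, each sorted."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed


# Listings copied from this post.
PROFILE_036 = """\
Profile gpu-hq:
scale=spline36
cscale=spline36
dscale=mitchell
dither-depth=auto
hdr-contrast-recovery=0.30
correct-downscaling=yes
linear-downscaling=yes
sigmoid-upscaling=yes
deband=yes
"""

PROFILE_037 = """\
Profile gpu-hq:
profile=high-quality
scale=ewa_lanczossharp
hdr-peak-percentile=99.995
hdr-contrast-recovery=0.30
deband=yes
"""

if __name__ == "__main__":
    old = parse_profile(PROFILE_036)
    new = parse_profile(PROFILE_037)
    added, removed, changed = diff_profiles(old, new)
    print("added:  ", added)
    print("removed:", removed)
    for key in changed:
        print(f"changed: {key}: {old[key]} -> {new[key]}")
```

For these two listings, the only option present in both with a different value is scale (spline36 in 0.36.0, ewa_lanczossharp in 0.37.0), which is consistent with the suspicion above.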
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
Posted: Fri Dec 15, 2023 10:48 pm
You should be able to create a profile based on the older one, call it gpu-hq-1 or whatever
Ionen Developer
Joined: 06 Dec 2018 Posts: 2720
Posted: Sat Dec 16, 2023 9:45 am
Yes, high-quality (formerly gpu-hq, which is still recognized as an alias) became much more demanding in 0.37. Albeit it should in theory only result in dropped frames, unless something else is going on with your card (like not getting enough power).
The new default profile is actually higher quality than 0.36's defaults (similar to old gpu-hq, so no harm in using it), while profile=fast is more like the old defaults.
See https://github.com/mpv-player/mpv/commit/703f158880, which is linked in the issue.
You may also want to try vo=gpu-next (this will become the default sooner or later, possibly in mpv 0.38, albeit the change hasn't been made in master yet) and also gpu-api=vulkan (preferred, esp. with gpu-next). Maybe it'll work out better with your card.
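To try these combinations systematically rather than by editing mpv.conf each time, the candidate command lines can be generated up front. This is a small Python sketch under stated assumptions: `build_mpv_commands` is a hypothetical helper, `sample.mkv` is a placeholder file name, and the script only prints the commands instead of running them, since running them on an affected machine may trigger the GPU reset again. `--no-config` is used so that the existing mpv.conf does not interfere, and `--fs` starts in fullscreen, which is where the crash was reported.

```python
import itertools
import shlex


def build_mpv_commands(video, profiles, gpu_apis, vo="gpu-next"):
    """Return one mpv argument list per (profile, gpu-api) combination.

    A profile of None means "no --profile option", i.e. mpv's defaults.
    """
    commands = []
    for profile, api in itertools.product(profiles, gpu_apis):
        cmd = ["mpv", "--no-config", f"--vo={vo}", f"--gpu-api={api}", "--fs"]
        if profile is not None:
            cmd.append(f"--profile={profile}")
        cmd.append(video)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Placeholder file name; the profile names are the ones mentioned in this thread.
    for cmd in build_mpv_commands(
        "sample.mkv",
        profiles=[None, "fast", "high-quality"],
        gpu_apis=["opengl", "vulkan"],
    ):
        print(shlex.join(cmd))
```

Running the printed commands one at a time, and checking dmesg after each, narrows down which profile and API combination the card cannot handle.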
sdauth Guru
Joined: 19 Sep 2018 Posts: 569 Location: Ásgarðr
Posted: Sat Dec 16, 2023 10:35 pm
Anon-E-moose wrote: | You should be able to create a profile based on the older one, call it gpu-hq-1 or whatever |
Thanks. This works perfectly (and doesn't cause any crash when going fullscreen) so I'll just use those "old" settings.
Code: | ~/.config/mpv/mpv.conf |
Code: | vo=gpu-next
profile=gpu-hq-2
[gpu-hq-2]
scale=spline36
cscale=spline36
dscale=mitchell
dither-depth=auto
hdr-contrast-recovery=0.30
correct-downscaling=yes
linear-downscaling=yes
sigmoid-upscaling=yes
deband=yes |
sdauth Guru
Joined: 19 Sep 2018 Posts: 569 Location: Ásgarðr
Posted: Sat Dec 16, 2023 11:10 pm
@Ionen :
I was wondering if the issue was related to mesa, so I compiled mesa-23.3.1 (not in stable yet), restarted X and tried again with profile=high-quality.
This time, it works perfectly. No more crashes when going fullscreen, although I still notice a very brief glitch in the window when going fullscreen (this also happened with the previous profile):
Code: | [...]
[ 2.354][d][cplayer] Run command: cycle, flags=73, args=[name="fullscreen", value="1.000000"]
[ 2.355][v][cplayer] Set property: fullscreen -> 1
[...]
[ 2.490][v][vo/gpu-next/libplacebo] Spent 16.544 ms compiling shader
[...]
|
Ionen wrote: | Yes, high-quality (formerly gpu-hq, which is still recognized as an alias) became much more demanding in 0.37. Albeit it should in theory only result in dropped frames. |
Indeed, while some small files play just fine with the high-quality profile, I observe almost 100% GPU utilization (with radeontop) on every big file (movies) and, consequently, dropped frames. So although it doesn't cause any crash anymore, I will continue to use the previous settings. Sadly, I can't add a dedicated GPU (I'm currently using an APU) as I don't have any available pcie