Gentoo Forums
[SOLVED] mpv-0.37.0 crashes radeon driver (gpu reset)
sdauth
Guru


Joined: 19 Sep 2018
Posts: 569
Location: Ásgarðr

Posted: Fri Dec 15, 2023 7:06 pm    Post subject: [SOLVED] mpv-0.37.0 crashes radeon driver (gpu reset)

Hi,
I updated mpv from 0.36 to 0.37 and it now consistently crashes (and triggers a GPU reset) when going fullscreen (tried with different files):

Code:
[  310.629977] radeon 0000:00:01.0: ring 0 stalled for more than 10470msec
[  310.629987] radeon 0000:00:01.0: GPU lockup (current fence id 0x00000000000003e6 last fence id 0x00000000000003f5 on ring 0)
[  310.672137] radeon 0000:00:01.0: Saved 458 dwords of commands on ring 0.
[  310.672208] radeon 0000:00:01.0: GPU softreset: 0x00000009
[  310.672212] radeon 0000:00:01.0:   GRBM_STATUS               = 0xA04E3828
[  310.672216] radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x08800007
[  310.672219] radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
[  310.672223] radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
[  310.672280] radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
[  310.672283] radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  310.672286] radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x00018000
[  310.672289] radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00010002
[  310.672292] radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x80030243
[  310.672295] radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  310.672298] radeon 0000:00:01.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  310.672301] radeon 0000:00:01.0:   VM_CONTEXT0_PROTECTION_FAULT_ADDR   0x00000000
[  310.672304] radeon 0000:00:01.0:   VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000
[  310.672307] radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  310.672310] radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[  310.682831] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x0000DF7B
[  310.682904] radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00010100
[  310.684062] radeon 0000:00:01.0:   GRBM_STATUS               = 0x00003828
[  310.684068] radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x00000007
[  310.684073] radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
[  310.684077] radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
[  310.684135] radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
[  310.684138] radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  310.684142] radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[  310.684150] radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[  310.684153] radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x00000000
[  310.684156] radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  310.684160] radeon 0000:00:01.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  310.684223] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
[  310.707992] [drm] PCIE GART of 1024M enabled (table at 0x00000000001D6000).
[  310.708113] radeon 0000:00:01.0: WB enabled
[  310.708119] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00
[  310.708499] radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
[  310.718688] radeon 0000:00:01.0: failed VCE resume (-22).
[  310.718693] radeon 0000:00:01.0: fence driver on ring 1 use gpu addr 0x0000000020000c04
[  310.718697] radeon 0000:00:01.0: fence driver on ring 2 use gpu addr 0x0000000020000c08
[  310.718700] radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c
[  310.718704] radeon 0000:00:01.0: fence driver on ring 4 use gpu addr 0x0000000020000c10
[  310.737009] [drm] ring test on 0 succeeded in 2 usecs
[  310.737019] [drm] ring test on 3 succeeded in 3 usecs
[  310.737026] [drm] ring test on 4 succeeded in 3 usecs
[  310.783071] [drm] ring test on 5 succeeded in 2 usecs
[  310.803097] [drm] UVD initialized successfully.
[  311.901908] [drm:r600_ib_test] *ERROR* radeon: fence wait timed out.
[  311.901927] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-110).
[  311.944524] radeon 0000:00:01.0: GPU softreset: 0x00000009
[  311.944528] radeon 0000:00:01.0:   GRBM_STATUS               = 0xA04E3828
[  311.944530] radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x08800007
[  311.944532] radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
[  311.944534] radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
[  311.944589] radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
[  311.944591] radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  311.944593] radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x00018000
[  311.944595] radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00000002
[  311.944596] radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x80030243
[  311.944598] radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  311.944600] radeon 0000:00:01.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  311.944602] radeon 0000:00:01.0:   VM_CONTEXT0_PROTECTION_FAULT_ADDR   0x00000000
[  311.944604] radeon 0000:00:01.0:   VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000
[  311.944605] radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  311.944607] radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[  311.947515] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x0000DF7B
[  311.947568] radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00010100
[  311.948723] radeon 0000:00:01.0:   GRBM_STATUS               = 0x00003828
[  311.948725] radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x00000007
[  311.948727] radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
[  311.948729] radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
[  311.948785] radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
[  311.948786] radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  311.948788] radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[  311.948790] radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[  311.948791] radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x00000000
[  311.948793] radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  311.948795] radeon 0000:00:01.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  311.948860] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
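
A minimal Python sketch, not part of the original post, for condensing a log like the one above into a short timeline of lockups, resets and failed IB tests. The regular expressions are written against the exact message wording quoted here and are an assumption for any other kernel version; the names summarize and sample are hypothetical.

```python
import re

# Patterns written against the kernel messages quoted in this post.
LOCKUP = re.compile(r"\[\s*(\d+\.\d+)\] radeon (\S+): GPU lockup")
RESET_OK = re.compile(r"\[\s*(\d+\.\d+)\] radeon (\S+): GPU reset succeeded")
IB_FAIL = re.compile(r"\[\s*(\d+\.\d+)\] \[drm:radeon_ib_ring_tests\] \*ERROR\*")

PATTERNS = (("lockup", LOCKUP), ("reset", RESET_OK), ("ib_test_failed", IB_FAIL))


def summarize(log_text):
    """Return (timestamp, event) tuples in the order they appear in the log."""
    events = []
    for line in log_text.splitlines():
        for name, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                events.append((float(match.group(1)), name))
    return events


# Four lines copied from the log above.
sample = """\
[  310.629987] radeon 0000:00:01.0: GPU lockup (current fence id 0x00000000000003e6 last fence id 0x00000000000003f5 on ring 0)
[  310.684223] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
[  311.901927] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-110).
[  311.948860] radeon 0000:00:01.0: GPU reset succeeded, trying to resume
"""

for timestamp, event in summarize(sample):
    print(f"{timestamp:10.3f}  {event}")
```

Feeding it the full dmesg output instead of the sample should make it easier to see that the first reset is followed by a failed IB test and a second reset.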


If I downgrade to 0.36, the problem goes away: no more crashes.
What can I do to find the cause of this?

mpv config :

Code:
vo=gpu-next
profile=gpu-hq


kernel : 6.1.67

edit : On another machine with a different GPU (Intel), 0.37.0 works perfectly (windowed and fullscreen) with the same kernel and the same mpv config, except profile=gpu instead of profile=gpu-hq.
Thanks.


Last edited by sdauth on Sat Dec 16, 2023 10:37 pm; edited 1 time in total
sdauth
Guru


Joined: 19 Sep 2018
Posts: 569
Location: Ásgarðr

Posted: Fri Dec 15, 2023 8:21 pm

After some trial and error with the mpv config, it turns out that the settings behind profile=gpu-hq changed in mpv 0.37.0 (https://github.com/mpv-player/mpv/issues/12564).
The new gpu-hq profile settings are apparently much more resource intensive.
Nonetheless, it doesn't seem right that this triggers a GPU reset... If you have any more information, please share.

Anyway, commenting out "profile=gpu-hq" in mpv.conf solves the issue (mpv then uses its default settings instead).
Anon-E-moose
Watchman


Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Fri Dec 15, 2023 8:38 pm

Is there any difference between versions in the output of "mpv --show-profile=gpu-hq"?
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland
sdauth
Guru


Joined: 19 Sep 2018
Posts: 569
Location: Ásgarðr

Posted: Fri Dec 15, 2023 9:40 pm

Anon-E-moose wrote:
Is there any difference between versions in the output of "mpv --show-profile=gpu-hq"?

0.37.0 :
Code:
Profile gpu-hq:
 profile=high-quality
  scale=ewa_lanczossharp
  hdr-peak-percentile=99.995
  hdr-contrast-recovery=0.30
  deband=yes


0.36.0 :
Code:
Profile gpu-hq:
 scale=spline36
 cscale=spline36
 dscale=mitchell
 dither-depth=auto
 hdr-contrast-recovery=0.30
 correct-downscaling=yes
 linear-downscaling=yes
 sigmoid-upscaling=yes
 deband=yes


iirc, scale=ewa_lanczossharp is quite heavy and my GPU is probably just too old to handle it, so this is likely what caused the issue, although once again, it shouldn't cause a GPU reset. I'll open a bug on GitHub later.
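
To make the comparison between the two outputs explicit, here is a small Python sketch (an illustration, not something from the thread) that parses both "mpv --show-profile=gpu-hq" listings quoted above and prints the options that differ. The helper names parse_profile and diff_profiles are hypothetical. Note that None only means "not listed in this output"; such options fall back to whatever the running mpv version uses by default.

```python
def parse_profile(text):
    """Parse `mpv --show-profile` output into {option: value}.

    The "Profile ...:" header has no "=" and is skipped; nested
    (indented) options are flattened into the same dict.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            options[key] = value
    return options


def diff_profiles(old, new):
    """Map each option whose value differs to an (old, new) pair."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Outputs copied from this post.
OLD_036 = """Profile gpu-hq:
 scale=spline36
 cscale=spline36
 dscale=mitchell
 dither-depth=auto
 hdr-contrast-recovery=0.30
 correct-downscaling=yes
 linear-downscaling=yes
 sigmoid-upscaling=yes
 deband=yes
"""

NEW_037 = """Profile gpu-hq:
 profile=high-quality
  scale=ewa_lanczossharp
  hdr-peak-percentile=99.995
  hdr-contrast-recovery=0.30
  deband=yes
"""

changes = diff_profiles(parse_profile(OLD_036), parse_profile(NEW_037))
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```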
Anon-E-moose
Watchman


Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Fri Dec 15, 2023 10:48 pm

You should be able to create a profile based on the older one, call it gpu-hq-1 or whatever
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland
Ionen
Developer


Joined: 06 Dec 2018
Posts: 2720

Posted: Sat Dec 16, 2023 9:45 am

Yes, high-quality (formerly gpu-hq, which is still recognized as an alias) became much more demanding in 0.37. In theory, though, it should only result in dropped frames, unless something else is going on with your card (like not getting enough power).

The new default profile is actually higher quality than 0.36's defaults (similar to old gpu-hq, so no harm in using it), while profile=fast is more like the old defaults.

See https://github.com/mpv-player/mpv/commit/703f158880 that's linked in the issue.

You may also want to try vo=gpu-next (this will become the default sooner or later, possibly in mpv 0.38, although the change hasn't been made in master yet) and also gpu-api=vulkan (preferred, esp. with gpu-next). Maybe it'll work out better with your card.
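
A minimal Python sketch for trying such combinations without editing mpv.conf each time. It only builds and prints the command line; the function name build_mpv_command and the file name sample.mkv are hypothetical, and --no-config is added on the assumption that you want the test run isolated from your existing config.

```python
import shlex


def build_mpv_command(path, vo="gpu-next", gpu_api="vulkan", profile="fast"):
    """Build an mpv argument list for one vo / gpu-api / profile combination."""
    return [
        "mpv",
        "--no-config",  # ignore mpv.conf so only these options apply
        f"--vo={vo}",
        f"--gpu-api={gpu_api}",
        f"--profile={profile}",
        path,
    ]


# Print one command per profile so they can be pasted into a shell.
for profile in ("fast", "high-quality"):
    cmd = build_mpv_command("sample.mkv", profile=profile)
    print(shlex.join(cmd))
```

The list form can also be handed to subprocess.run directly if you would rather launch mpv from the script.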
sdauth
Guru


Joined: 19 Sep 2018
Posts: 569
Location: Ásgarðr

Posted: Sat Dec 16, 2023 10:35 pm

Anon-E-moose wrote:
You should be able to create a profile based on the older one, call it gpu-hq-1 or whatever


Thanks. This works perfectly (and doesn't cause any crash when going fullscreen) so I'll just use those "old" settings.

Code:
~/.config/mpv/mpv.conf


Code:
vo=gpu-next
profile=gpu-hq-2

[gpu-hq-2]
scale=spline36
cscale=spline36
dscale=mitchell
dither-depth=auto
hdr-contrast-recovery=0.30
correct-downscaling=yes
linear-downscaling=yes
sigmoid-upscaling=yes
deband=yes
sdauth
Guru


Joined: 19 Sep 2018
Posts: 569
Location: Ásgarðr

Posted: Sat Dec 16, 2023 11:10 pm

@Ionen :
I was wondering if the issue was related to mesa, so I compiled mesa-23.3.1 (not in stable yet), restarted X and tried again with profile=high-quality.
This time, it works perfectly. No more crashes when going fullscreen, although I still notice a very brief glitch in the window when going fullscreen (I also saw this with the previous profile):

Code:
[...]
[   2.354][d][cplayer] Run command: cycle, flags=73, args=[name="fullscreen", value="1.000000"]
[   2.355][v][cplayer] Set property: fullscreen -> 1
[...]
[   2.490][v][vo/gpu-next/libplacebo] Spent 16.544 ms compiling shader
[...]
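
A small Python sketch, added here as an illustration, for pulling the shader compile times out of a verbose mpv log like the excerpt above. The regular expression is written against the line format shown in this post and is an assumption for other mpv versions; the name shader_compile_times is hypothetical. For scale, 16.544 ms is close to one frame interval at 60 Hz (about 16.7 ms), though that alone does not show the compile step is what causes the glitch.

```python
import re

# Matches lines such as:
# [   2.490][v][vo/gpu-next/libplacebo] Spent 16.544 ms compiling shader
SHADER = re.compile(
    r"\[\s*(\d+\.\d+)\]\[\w\]\[([^\]]+)\] Spent (\d+\.\d+) ms compiling shader"
)


def shader_compile_times(log_text):
    """Return (log time in s, component, compile time in ms) per matching line."""
    results = []
    for line in log_text.splitlines():
        match = SHADER.search(line)
        if match:
            results.append(
                (float(match.group(1)), match.group(2), float(match.group(3)))
            )
    return results


# Lines copied from the log excerpt in this post.
excerpt = """\
[   2.354][d][cplayer] Run command: cycle, flags=73, args=[name="fullscreen", value="1.000000"]
[   2.355][v][cplayer] Set property: fullscreen -> 1
[   2.490][v][vo/gpu-next/libplacebo] Spent 16.544 ms compiling shader
"""

for when, component, millis in shader_compile_times(excerpt):
    print(f"{when:8.3f}s  {component}: {millis} ms")
```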


Ionen wrote:
Yes, high-quality (formerly gpu-hq, which is still recognized as an alias) became much more demanding in 0.37. Albeit it should in theory only result in dropped frames.


Indeed, while some small files play just fine with the high-quality profile, I observe almost 100% GPU utilization (with radeontop) on every big file (movies) and, consequently, dropped frames. So although it doesn't cause any crash anymore, I will continue to use the previous settings. Sadly, I can't add a dedicated GPU (I'm currently using an APU) as I don't have any available PCIe slot :o
All times are GMT