Gentoo Forums
Should my graphics card be replaced under RMA?
lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Tue May 08, 2018 4:11 pm    Post subject: Should my graphics card be replaced under RMA?

Hi,
Since I built a new PC from separate parts a few months ago, I've suffered from regular kernel panics.
I've checked the memory (Kingston HyperX Fury, with memtest86) and replaced my CPU (AMD Ryzen 5, via RMA).
I've also recompiled the whole system with less aggressive optimization (using CFLAGS="-O2 -march=znver1..." instead of -march=znver1).

Now I suspect the graphics card (Sapphire Radeon RX 550 PULSE).
The reasons I suspect it are:
- VLC got segmentation faults when I was seeking within a film (showing a VDPAU segfault in the console and segfault error 4 in libdrm_amdgpu.so.1.0.0 in the logs), and they disappeared when I deactivated hardware-accelerated decoding;
- from time to time, when it is not ...^@^@^@^@^@^@^@^@^@^@^@^@..., /var/log/kern.log shows dozens of amdgpu GPU fault detected messages.
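
To put numbers on the two log symptoms above, a small Python sketch can count the amdgpu fault lines and decode the segfault error code. The function names and the exact line layout matched by the regular expression are assumptions for illustration; the bit meanings are those of the x86 page-fault error code, so "error 4" reads as a user-mode read of an unmapped page.

```python
import re

# Assumed layout of a kernel segfault line:
#   "<prog>[pid]: segfault at <addr> ip <ip> sp <sp> error <hex> in <lib>[base+size]"
SEGFAULT_RE = re.compile(r"segfault at .* error ([0-9a-f]+)(?: in ([^\s\[]+))?")


def decode_segfault_error(code):
    """Decode the low three bits of the x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # False: the page was not mapped
        "write_access": bool(code & 2),  # False: the access was a read
        "user_mode": bool(code & 4),     # True: the fault came from user space
    }


def summarize(log_text):
    """Count amdgpu GPU fault lines and collect decoded segfault entries."""
    gpu_faults = 0
    segfaults = []
    # Drop NUL bytes (shown as ^@), which can end up in a log after a hard crash.
    for line in log_text.replace("\x00", "").splitlines():
        if "amdgpu" in line and "GPU fault detected" in line:
            gpu_faults += 1
        match = SEGFAULT_RE.search(line)
        if match:
            code = int(match.group(1), 16)  # the kernel prints the code in hex
            segfaults.append((match.group(2), decode_segfault_error(code)))
    return gpu_faults, segfaults
```

It could be fed the log with something like `summarize(open("/var/log/kern.log", errors="replace").read())`, run as a user allowed to read that file.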

Apart from that, the motherboard is an MSI B350 TOMAHAWK.

Do you think I should replace my graphics card?
Is there anything else I could check first?

Thanks in advance,
Laurent
_________________
LinuxCounter 313324

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5212

Posted: Tue May 08, 2018 8:11 pm

I have an RX550 and don't see those kinds of errors in dmesg, nor segfaults with video decoding (though it does have visual problems in mpv). Is anything else wrong with it? What are the temperatures like?

1clue
Advocate
Joined: 05 Feb 2006
Posts: 2389

Posted: Tue May 08, 2018 8:51 pm

I'm going to ask a couple "is it plugged in" questions.


  1. Are all the cards in the system seated correctly?
  2. Do all the cards have a screw holding them in?
  3. Are your wires routed so that air flow is not impeded in the case?
  4. Are there any wires stretched to their limit to get from A to B?
  5. When you flip your case upside down is there a rattling sound? (look for loose screws or inadequately fastened hardware)
  6. Are all the wires attached to your video card?
  7. Does the fan on the video card spin when the system gets warm?


I've been bitten by a lot of those. The wires and fastenings don't necessarily need to be directly attached to the video card. Sometimes errors cascade and you only see the end result.
_________________
You can't fix yourself by breaking someone else.

Jaglover
Watchman
Joined: 29 May 2005
Posts: 6520
Location: Saint Amant, Acadiana

Posted: Tue May 08, 2018 9:20 pm

8. Is external power plugged in? If it requires it, that is.
_________________
Please learn how to denote units correctly!

krinn
Watchman
Joined: 02 May 2003
Posts: 6839

Posted: Tue May 08, 2018 10:10 pm

Jaglover wrote:
8. Is external power plugged in? If it requires it, that is.

I would add this too:
9. Is the power supply strong enough to feed all components? (Video cards are power hungry.)

lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Fri May 11, 2018 5:34 am

Hi,
Sorry for the delay, but I've tried to answer all your detailed questions (and thanks for them).

So here are the answers:
1. Yes, I've checked it.
2. Yes, also.
3. I think so, there are very few cables.
4. No. But do you mean all cables? I've tied some of them together (to help the air flow), so it is possible that one or two are stretched. Do you think I should untie all of them?
5. It's quite heavy (I tried shaking it), but everything seems fastened.
6. There are no wires attached to the video card. In fact I don't understand the question.
7. The fan spins as soon as the system starts. Right now it is running at 956 RPM and the card is at 35°C. That looks normal.
8. No. I searched for a long time when I built my PC, but it's not required (PCIe 16x).
9. Yes. I've got a Be Quiet 500W, and the 2 calculators I found on the Internet show about 300W of consumption.
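
The arithmetic behind answer 9 can be written down as a tiny Python sketch. The function name is made up for illustration, and the 300 W figure is only the calculators' estimate quoted above, not a measurement.

```python
def psu_headroom(psu_watts, estimated_load_watts):
    """Return spare wattage and the fraction of the PSU rating in use."""
    headroom = psu_watts - estimated_load_watts
    load_fraction = estimated_load_watts / psu_watts
    return headroom, load_fraction


# Figures from the post: a 500 W supply and about 300 W estimated consumption.
headroom, load_fraction = psu_headroom(500, 300)
print(f"{headroom} W spare, {load_fraction:.0%} of rated capacity")
# prints: 200 W spare, 60% of rated capacity
```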

Do you have any advice ?

Thanks,
Laurent
_________________
LinuxCounter 313324

C5ace
Apprentice
Joined: 23 Dec 2013
Posts: 237
Location: Brisbane, Australia

Posted: Fri May 11, 2018 10:25 am

Get a low cost (<$20) VGA card. If that works, RMA your Sapphire Radeon RX 550 PULSE card.

lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Fri May 11, 2018 12:25 pm

Uh, well...
So you don't have any way to decide whether it is the card or not.
In that case, I would rather RMA my card: either the replacement works and it's over, or it doesn't and I'm back to searching. I just wanted to be sure.

Is there a way to be sure the card has failed?

Thanks,
Laurent
_________________
LinuxCounter 313324

krinn
Watchman
Joined: 02 May 2003
Posts: 6839

Posted: Fri May 11, 2018 1:02 pm

lhuge wrote:
Is there a way to be sure the card has failed?

Most video card problems come from a dead fan or memory trouble, and the most visible signs of a dying video card are artefacts drawn on the screen, like random pixels or blinking areas, or a totally black screen (even during BIOS POST).

So, no: without these symptoms, all you can say is that your card crashes when playing videos with hardware acceleration enabled.
What you could do first is check whether the card also crashes under some stress (a benchmark), or try another video player with acceleration enabled and disabled.
Also try another system (a live CD) with an acceleration-enabled player (ok, maybe a live DVD then).
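
The tests suggested above form a small matrix (system, player, acceleration on or off), and the reasoning about its results can be sketched in Python. The `Trial` record, the `triage` function and its rules are assumptions for illustration only, a rough heuristic and not a real diagnostic.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    system: str      # e.g. "installed Gentoo" or "live USB"
    player: str      # e.g. "VLC"
    hw_accel: bool   # was hardware-accelerated decoding enabled?
    crashed: bool    # did the player or the system crash?


def triage(trials):
    """Rough heuristic: where should suspicion fall after a set of trials?"""
    accel = [t for t in trials if t.hw_accel]
    if not accel:
        return "inconclusive: no hardware-accelerated trial recorded"
    systems = {t.system for t in accel}
    if all(t.crashed for t in accel):
        if len(systems) > 1:
            return "hardware suspect: accelerated playback crashed on every system"
        return "inconclusive: only one system tested"
    if any(t.crashed for t in accel):
        return "software suspect: the same card works in some setups"
    return "no failure reproduced"
```

For example, a crash on the installed system combined with a clean run from a live medium would land in the "software suspect" branch.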


You are free of course to RMA any part you feel is bad, but an RMA for a working product is not something anyone will be happy about. AMD could repackage the card as a brand new one (though they won't be happy to have spent time just to find that your card works), but for a reseller your "working" card will be a pure loss: he can't RMA something that works, and can't sell it as brand new.
Because of this, you should always make sure you RMA a broken product: try another card from a friend and/or try your card in a friend's computer. If you cannot do that (yeah, nerds have no friends :D ), then ask the reseller whether he could diagnose the card for you for a "tiny" fee.

1clue
Advocate
Joined: 05 Feb 2006
Posts: 2389

Posted: Fri May 11, 2018 1:31 pm

lhuge wrote:
Hi,
...
6. There are no wires attached to the video card. In fact I don't understand the question.
...
Laurent


Many more expensive video cards draw more power than the slot can provide. They have one or more extra power plugs on the card, and you need to attach cables from your power supply to these in order for all the features to work.

If I remember correctly my plugs on the board are 4-pin square orientation. You may need to buy an adapter to get it to hook up to your power supply, or the adapter(s) may come with the card.
_________________
You can't fix yourself by breaking someone else.

Jaglover
Watchman
Joined: 29 May 2005
Posts: 6520
Location: Saint Amant, Acadiana

Posted: Fri May 11, 2018 1:42 pm

I was looking at the picture, it seems there is no power connector - I may be wrong. In any case, 16x PCIe can supply 65 W, but it is on the limit. Consumer electronics do not work well when pushed to the limit.
_________________
Please learn how to denote units correctly!

lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Fri May 11, 2018 3:07 pm

Ok, krinn, I understand your point.
And I confirm there is no power cable going to the card (that is what I expected at first, and I eventually understood there isn't one)...
I'll try a live USB test (since I've got one at hand).

I'll let you know...
_________________
LinuxCounter 313324

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5212

Posted: Sat May 12, 2018 1:52 am

I really doubt this has anything to do with power; this is what my RX550 looks like while I'm pushing it moderately hard (1344p60 video on one screen, 3D on another):
cat /sys/kernel/debug/dri/0/amdgpu_pm_info:
GFX Clocks and Power:
        1500 MHz (MCLK)
        463 MHz (SCLK)
        5.102 W (VDDC)
        4.65 W (VDDCI)
        12.149 W (max GPU)
        12.39 W (average GPU)

GPU Temperature: 48 C
GPU Load: 100 %

Less than 20W, that's basically nothing.
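
For comparing such readings between machines, the text above can be turned into numbers with a short Python sketch. The parser name and regular expressions are assumptions built only around the output format shown in this post; other kernels may print the file differently.

```python
import re

# Output copied from the post above.
SAMPLE = """GFX Clocks and Power:
        1500 MHz (MCLK)
        463 MHz (SCLK)
        5.102 W (VDDC)
        4.65 W (VDDCI)
        12.149 W (max GPU)
        12.39 W (average GPU)

GPU Temperature: 48 C
GPU Load: 100 %
"""

VALUE_RE = re.compile(r"^[ \t]*([\d.]+)[ \t]+(MHz|W)[ \t]+\((.+?)\)[ \t]*$", re.M)
STATUS_RE = re.compile(r"^GPU (Temperature|Load):[ \t]*([\d.]+)[ \t]*(\S+)[ \t]*$", re.M)


def parse_pm_info(text):
    """Return {label: (value, unit)} for the clock, power and status lines."""
    readings = {}
    for m in VALUE_RE.finditer(text):
        readings[m.group(3)] = (float(m.group(1)), m.group(2))
    for m in STATUS_RE.finditer(text):
        readings[m.group(1)] = (float(m.group(2)), m.group(3))
    return readings


readings = parse_pm_info(SAMPLE)
print(readings["average GPU"])  # (12.39, 'W')
```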

lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Sun May 13, 2018 3:12 pm

Hi,
Here's a little summary of my progress.

As asked, I've tested a live USB to see whether the problem is linked to the hardware. I used Ubuntu to install Gentoo, so I used it again (ubuntu-18.04-desktop-amd64).
And it works :lol: ! VLC with the same file, jumping back and forth multiple times, doesn't fail.
So the card seems to work, and I have to find which parameter to change to avoid the segfault.

I thought it might be the kernel configuration.
So I've upgraded my kernel to 4.16.8 (my former one was 4.14.6, and Ubuntu's is 4.15.0). I should mention that I've always used my own kernel builds, and used the Gentoo AMD Ryzen and Radeon RX 560 configuration help pages to configure them.
I tried to get as close as possible to the graphics parameters of the Ubuntu kernel. The main changes are the addition of
    RETPOLINE
    PCI_QUIRKS
    DRM_AMDGPU_SI, DRM_AMDGPU_CIK and DRM_AMDGPU_USERPTR
    DRM_AMD_DC, DRM_AMD_DC_PRE_VEGA and CHASH
    HSA_AMD
    FB_TILEBLITTING
    and removal of FB_DDC, FB_BACKLIGHT, FB_RADEON
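
Rather than comparing the two kernel configurations by eye, the graphics-related options can be diffed with a short Python sketch. The function names and the default prefix list are assumptions for illustration; the parsing relies only on the usual `.config` line forms (`CONFIG_X=y` and `# CONFIG_X is not set`).

```python
def parse_config(text):
    """Map option name (without the CONFIG_ prefix) to its value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name[len("CONFIG_"):]] = value
        elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
            options[line[len("# CONFIG_"):-len(" is not set")]] = "n"
    return options


def diff_configs(mine, reference, prefixes=("DRM_", "FB_", "HSA_")):
    """Return {option: (my value, reference value)} for options that differ."""
    a, b = parse_config(mine), parse_config(reference)
    differences = {}
    for name in sorted(set(a) | set(b)):
        if not name.startswith(prefixes):
            continue
        # An option missing from a file is treated as disabled.
        mine_value, ref_value = a.get(name, "n"), b.get(name, "n")
        if mine_value != ref_value:
            differences[name] = (mine_value, ref_value)
    return differences
```

The two inputs would be the contents of the self-built kernel's `.config` and of the reference kernel's config file; where those files live on each system is left as an assumption to check locally.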


But it hasn't solved my problem: VLC still segfaults :cry:

What could I look at next? Keep changing my kernel (but in what direction)? Or search in the libraries?
Laurent
_________________
LinuxCounter 313324

Jaglover
Watchman
Joined: 29 May 2005
Posts: 6520
Location: Saint Amant, Acadiana

Posted: Sun May 13, 2018 3:38 pm

Are you using vdpau in Ubuntu, too?
_________________
Please learn how to denote units correctly!

lhuge
n00b
Joined: 28 Jan 2018
Posts: 9

Posted: Sun May 20, 2018 9:00 am

Hi,
Back from a trip...

Jaglover, yes, I used vdpau there too: "avcodec decoder: Using G3DVL VDPAU Driver Shared Library version 1.0 for hardware decoding".
Moreover, after searching for how to force MPlayer to use vdpau (https://wiki.archlinux.org/index.php/MPlayer#Enabling_VDPAU), I ran a test on my Gentoo, and it works too O_o !
So, how can it work with MPlayer but not with VLC? They use the same libraries, don't they?
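
One way to check the "same libraries" assumption is to compare what `ldd` reports for each player. The Python sketch below only parses text that has already been captured from `ldd`; the function names are made up for illustration. It also has a known blind spot: both players may load their decoding backends at run time, and `ldd` does not show those.

```python
def parse_ldd(output):
    """Map library name to resolved path from text printed by ldd."""
    libs = {}
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # skips entries such as linux-vdso and the dynamic loader
        name, _, rest = line.partition("=>")
        libs[name.strip()] = rest.split("(")[0].strip()
    return libs


def compare_libs(ldd_a, ldd_b):
    """Return (shared, only in a, only in b) as sorted lists of library names."""
    a, b = parse_ldd(ldd_a), parse_ldd(ldd_b)
    shared = sorted(n for n in a.keys() & b.keys() if a[n] == b[n])
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    return shared, only_a, only_b
```

Libraries that appear in only one list, or that resolve to different paths, would be the first candidates to look at.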

And how can I track down the separate problem that leads to a system failure from time to time... ?

Thanks,
_________________
LinuxCounter 313324
All times are GMT