[SOLVED - bad RAM] Nvidia Xid errors

KLarsen
n00b
Posts: 61
Joined: Fri Dec 30, 2005 7:49 pm
Location: Spain

Post by KLarsen » Fri Nov 16, 2018 5:05 pm

For about two weeks now I've been getting NVRM Xid errors when gaming. I have a GTX 660 running the latest stable nvidia-drivers (396.54). Around the time the errors started I installed 2 new RAM sticks, doubling the amount of memory.
After a LOT of testing, the culprit seems to be the new RAM sticks: with them installed I get errors, and with them removed I can play games for hours with no errors (though that might just be a coincidence...).
However, I get no errors in memtest86, even after running it for over 20 hours. The vendor initially won't RMA the memory without some error showing in memtest86 (I'm going to press them on this though).
I've also tried downgrading nvidia-drivers and the kernel, recompiling nvidia-drivers several times, moving the GPU to the other PCI-E slot, and shuffling the RAM around...
The only thing I haven't tried yet is downgrading xorg-server and xorg-drivers, which were upgraded around the time the errors started happening. I will do this tomorrow.

These are the errors that appear in the log:

Nov 4 12:35:34 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed
Nov 4 12:35:34 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0058, Class 0000a097, Offset 00002384, Data 40000001, ErrorCode 0000000c
Nov 4 12:57:30 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed
Nov 4 12:57:30 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0030, Class 0000a097, Offset 00001c80, Data 40000000, ErrorCode 0000000c
Nov 4 23:53:41 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed
Nov 4 23:53:41 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000050 beef3901 0000a040 000001b8 1f789000
Nov 5 22:44:50 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 32, Channel ID 00000050 intr 00040000
Nov 5 22:54:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000050 beef3901 0000a040 000001b8 2faac600
Nov 5 23:11:48 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 31, Ch 00000050, engmask 00000101, intr 10000000
Nov 6 23:48:26 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed
Nov 6 23:48:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0058, Class 0000a097, Offset 00001b00, Data 00004100, ErrorCode 0000000c
Nov 6 23:48:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 13, Graphics Exception: EXTRA_MACRO_DATA
Nov 6 23:48:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 13, Graphics Exception: ESR 0x404490=0x80000002
Nov 6 23:48:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 13, Graphics Exception: ChID 0058, Class 0000a097, Offset 00001b00, Data 00004100
Nov 6 23:48:35 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000058 beef9097 0000a097 00001414 00000000
Nov 6 23:51:10 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0058, Class 0000a097, Offset 00001418, Data 00000004, ErrorCode 0000000c
Nov 7 13:02:52 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed
Nov 7 13:02:52 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000038 beef3901 0000a040 000001b8 ffffffff
Nov 7 13:06:37 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000038 beef3901 0000a040 000001b8 ffffffff

I usually get Xid 69, which according to https://docs.nvidia.com/deploy/xid-errors/index.html is either a hardware error or a driver error. None of these errors point to a RAM problem.
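
To see at a glance which Xid codes dominate a log like the one above, the lines can be tallied with a short script. This is only a minimal sketch, assuming the kernel log format shown in this post; the names `XID_RE`, `count_xids`, and `SAMPLE` are made up for illustration, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ..."
# Assumption: the Xid number always follows the PCI address and ends with a comma,
# as it does in every Xid line quoted in this thread.
XID_RE = re.compile(r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),")


def count_xids(lines):
    """Return a Counter mapping Xid number -> number of log lines reporting it."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group("xid"))] += 1
    return counts


# A few lines taken from the log in this post (the first one is not an Xid line).
SAMPLE = [
    "Nov 4 12:35:34 unicorn kernel: NVRM: GPU at PCI:0000:0a:00: GPU-dfde4129-ba3c-74bc-84aa-ea76a1cf90ed",
    "Nov 4 12:35:34 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0058, Class 0000a097, Offset 00002384, Data 40000001, ErrorCode 0000000c",
    "Nov 4 12:57:30 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 69, Class Error: ChId 0030, Class 0000a097, Offset 00001c80, Data 40000000, ErrorCode 0000000c",
    "Nov 4 23:53:41 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 12, COCOD 00000050 beef3901 0000a040 000001b8 1f789000",
    "Nov 5 22:44:50 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 32, Channel ID 00000050 intr 00040000",
    "Nov 5 23:11:48 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 31, Ch 00000050, engmask 00000101, intr 10000000",
    "Nov 6 23:48:26 unicorn kernel: NVRM: Xid (PCI:0000:0a:00): 13, Graphics Exception: EXTRA_MACRO_DATA",
]

if __name__ == "__main__":
    for xid, n in count_xids(SAMPLE).most_common():
        print(f"Xid {xid}: {n} line(s)")
```

To run it against a real log, the lines of the log file could be passed to `count_xids` instead of `SAMPLE`; where the kernel log lives depends on the system logger, so no path is assumed here.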

I can run the Unigine benchmark through several passes without errors; only Steam games give me problems. Also, once in a while the KDE/Plasma compositor stops unexpectedly (no errors in the log though). In general the system is totally stable, running 24/7, and I can reliably compile with no errors.

So, can anyone help me and suggest something else to try and pinpoint the problem? Is the GPU going bad? Is the system RAM really the culprit? Any help would be much appreciated.
Last edited by KLarsen on Sat Nov 17, 2018 12:40 pm, edited 1 time in total.
bunder
Bodhisattva
Posts: 5956
Joined: Sat Apr 10, 2004 5:13 am

Post by bunder » Fri Nov 16, 2018 7:50 pm

I see you tried reseating the card... are you overclocking the card at all? How good is your case cooling? Power supply rails?
Post by KLarsen » Fri Nov 16, 2018 8:06 pm

The card is factory overclocked.
Cooling should be good, neither the GPU nor the CPU gets above 60°C with the case closed. Opening the case, I still get errors.
I do have another PSU I can check, I'll do so tomorrow.
Post by KLarsen » Sat Nov 17, 2018 12:39 pm

I finally got errors in memtest86: I left it running overnight for the third time, and this morning it had found 64 errors. Time for an RMA.