Gentoo Forums
EQ Overflow from nvidia Crashing X
ExecutorElassus
Veteran

Joined: 11 Mar 2004
Posts: 1435
Location: Berlin, Germany

Posted: Mon Jun 11, 2012 1:09 pm    Post subject: EQ Overflow from nvidia Crashing X

So, I've been updating recently, and chrome, xorg-server, and adobe-flash have all been emerged again. I seem to be having a problem whereby my WM shows screen corruption (random rows of pixels, sections of the screen blacking out, and so on) that gradually worsens until the X server freezes, uses 100% of a core, and I have to kill it from an ssh connection.

Now I'm using Opera, to see if it's just chrome (no issues with Opera so far), or something more general.

There's nothing in the Xorg.log.0 file, so I'm not sure how to track this down. I have hardware acceleration enabled, and VDPAU disabled (the former to get rid of the blue meanies, the latter to keep embedded videos from leaking all over the screen).

Any suggestions how I might pin down this bug?

Thanks,

EE

EDIT: um, it might maybe be that a loose DVI cable was causing all the trouble, since I was having all the same problems from the BIOS screen, and - now that I gave my monitor a reach-around and screwed the cable in tightly (*cough*) - I don't seem to have any problems. So, uh, whoops.

EDIT 2: I take that last one back. I tightened the cable, and things were fine for a while, but now they are back where they were: intermittently, with no log entries and no error messages, the screen starts flickering at random (a flash every 5-10 minutes). I suspect that, left unchecked, the flickers will increase in frequency and severity until X reaches 100% CPU usage and the system freezes. Killing X over ssh from another box seems to fix it, and I seem to be able to avert any problems by killing both my browser (in this case, also Opera) and gkrellm.

But like I said: I'm getting no error messages anywhere, so I'm just flying blind right now. Any advice on how I can figure out what's going wrong?

EDIT 3: Well, I got smart and ssh'ed into the box while it was frozen, and now I have some log messages. Here's what I see in Xorg.log.0:

Code:
[ 27631.970] (**) NVIDIA(0):     has been enabled on all display devices.)
[ 29337.163] [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
[ 29337.163]
[ 29337.163] Backtrace:
[ 29337.163] 0: /usr/bin/X (xorg_backtrace+0x36) [0x5684c6]
[ 29337.163] 1: /usr/bin/X (mieqEnqueue+0x273) [0x549563]
[ 29337.163] 2: /usr/bin/X (0x400000+0x49a8d) [0x449a8d]
[ 29337.163] 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f61925e7000+0x60f0) [0x7f61925ed0f0]
[ 29337.163] 4: /usr/bin/X (0x400000+0x711c7) [0x4711c7]
[ 29337.163] 5: /usr/bin/X (0x400000+0x9583a) [0x49583a]
[ 29337.163] 6: /lib64/libc.so.6 (0x7f6197fcd000+0x35620) [0x7f6198002620]
[ 29337.163] 7: linux-vdso.so.1 (__vdso_gettimeofday+0x59) [0x7fff79d93929]
[ 29337.163] 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0x883a5) [0x7f61931ba3a5]
[ 29337.163] 9: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0xff713) [0x7f6193231713]
[ 29337.163] 10: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0xc26a2) [0x7f61931f46a2]
[ 29337.163] 11: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0x52a90c) [0x7f619365c90c]
[ 29337.163] 12: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0x4f98ac) [0x7f619362b8ac]
[ 29337.163] 13: /usr/bin/X (BlockHandler+0x4f) [0x439b4f]
[ 29337.163] 14: /usr/bin/X (WaitForSomething+0x12a) [0x565aea]
[ 29337.163] 15: /usr/bin/X (0x400000+0x35a32) [0x435a32]
[ 29337.163] 16: /usr/bin/X (0x400000+0x24f0a) [0x424f0a]
[ 29337.163] 17: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x7f6197fef4bd]
[ 29337.163] 18: /usr/bin/X (0x400000+0x24aa9) [0x424aa9]
[ 29337.163]
[ 29337.163] [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
[ 29337.163] [mi] mieq is *NOT* the cause.  It is a victim.
[ 29337.571] [mi] EQ overflow continuing.  100 events have been dropped.
This message repeats several times, with the following added a few more backtraces down:
Code:
[ 29348.195]
[ 29348.195] [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
[ 29348.195] [mi] mieq is *NOT* the cause.  It is a victim.
[ 29348.720] (WW) NVIDIA(0): WAIT (2, 6, 0x8000, 0x0000a9f0, 0x00004e54)
[ 29355.720] (WW) NVIDIA(0): WAIT (1, 6, 0x8000, 0x0000a9f0, 0x00004e54)
[ 29355.720] [mi] Increasing EQ size to 1024 to prevent dropped events.
[ 29355.721] [mi] EQ processing has resumed after 25 dropped events.
[ 29355.721] [mi] This may be caused my a misbehaving driver monopolizing the server's resources.
[ 29367.458] [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.


So now that I have some errors logged, what *is* the actual culprit? Is this a legitimate problem in the nvidia-driver? A misconfigured kernel?
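
One way to read a backtrace like the one above is to tally which module each frame belongs to, since the server's own hint is that the culprit sits higher up the stack than mieqEnqueue. This is only a minimal Python sketch, not an Xorg tool: `frames_by_module` and `FRAME_RE` are hypothetical names, and the regular expression assumes nothing beyond the frame layout visible in the log excerpts in this thread.

```python
import re
from collections import Counter

# Matches backtrace frames such as:
# [ 29337.163] 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f6193132000+0x883a5) [0x7f61931ba3a5]
# Assumed layout: whitespace, frame number, colon, mapped file, opening parenthesis.
FRAME_RE = re.compile(r"\s(\d+): (\S+) \(")


def frames_by_module(log_text):
    """Count backtrace frames per module (basename of the mapped file)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            module = match.group(2).rsplit("/", 1)[-1]
            counts[module] += 1
    return counts
```

Fed the first backtrace in this post, it counts five frames in nvidia_drv.so sitting between BlockHandler and the input handler that ends in mieqEnqueue. That fits the log's own remark about a misbehaving driver, but it does not say why the driver is stuck there.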

Thanks for the help,

EE

PS: I changed the title to reflect more accurately the problem.

bjlockie
Veteran

Joined: 18 Oct 2002
Posts: 1186
Location: Canada

Posted: Sat Sep 21, 2013 3:49 pm

If you find a version of the xorg server that doesn't have this bug then that'd be great.

There is a bug report from 2013 but it sounds like the bug has existed for much longer.

https://bugs.freedesktop.org/show_bug.cgi?id=62912

I haven't found any post with a workaround.

I have 2 monitors and it seems to happen less since I disconnected the second monitor.
_________________
AMD FX6100 CPU, 16 GiB RAM, OCZ Vertex 3 SSD
ASRock 970 Extreme3 motherboard with S/PDIF audio
Galaxy-NVidia GeForce 8800GT video card, Cyber Power CP550HG USB UPS

PaulBredbury
Watchman

Joined: 14 Jul 2005
Posts: 7310

Posted: Thu Oct 10, 2013 1:27 am

I just encountered this problem. What seems to help is switching back (from acpi_pm) to a quicker clocksource. In the bootloader's kernel command-line:

Code:
hpet=disable clocksource=tsc processor.max_cstate=1


Setting max_cstate is needed for my TSC to be stable. Here's some good clocksource info.
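
To confirm that those parameters actually reached the running kernel, they can be compared with the contents of /proc/cmdline. A minimal Python sketch follows; `parse_cmdline`, `missing_params` and `WANTED` are hypothetical names, and the values in `WANTED` are simply the ones from this post, not a general recommendation.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {parameter: value} dict.

    Flags without '=' map to None. Quoted values containing spaces
    are not handled by this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline, wanted):
    """Return the wanted parameters that are absent or set to another value."""
    current = parse_cmdline(cmdline)
    return {key: value for key, value in wanted.items() if current.get(key) != value}


# The settings quoted in this post (an assumption for other machines).
WANTED = {"hpet": "disable", "clocksource": "tsc", "processor.max_cstate": "1"}
```

On a Linux box, passing the text of /proc/cmdline as the first argument of `missing_params` together with `WANTED` returns an empty dict once all three settings are active.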

Edit: Bah, still not fixed completely. 0001-mieq-Bump-default-queue-size-to-512.patch should help.

Edit2: I've had no problems after changing the clocksource and adding Fedora's "512 queue size" patch 8)


Last edited by PaulBredbury on Sat Oct 19, 2013 4:31 pm; edited 1 time in total

Ivion
n00b

Joined: 23 Jan 2003
Posts: 45
Location: Amsterdam

Posted: Mon Oct 14, 2013 7:23 pm

Just adding my voice to those having this problem.

I've had this problem for quite a while, and I've searched and read reports of many other people having it, yet no solutions were to be found anywhere. It happens to me sporadically, usually when opening a video file with mplayer/mpv or when opening a new link in Firefox. It is rare enough not to be a major issue, though: maybe once every 1 to 2 months.

The most recent log of the crash can be found here. To summarize:
Code:
(EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
(EE)
(EE) Backtrace:
(EE) 0: /usr/bin/X (xorg_backtrace+0x34) [0x592c54]
(EE) 1: /usr/bin/X (mieqEnqueue+0x263) [0x573a53]
(EE) 2: /usr/bin/X (0x400000+0x4edb4) [0x44edb4]
(EE) 3: /usr/bin/X (xf86PostMotionEvent+0xce) [0x489b6e]
(EE) 4: /usr/lib/xorg/modules/input/mouse_drv.so (0x7f9fd3058000+0x761f) [0x7f9fd305f61f]
(EE) 5: /usr/lib/xorg/modules/input/mouse_drv.so (0x7f9fd3058000+0x7c58) [0x7f9fd305fc58]
(EE) 6: /usr/lib/xorg/modules/input/mouse_drv.so (0x7f9fd3058000+0x4645) [0x7f9fd305c645]
(EE) 7: /usr/bin/X (0x400000+0x79477) [0x479477]
(EE) 8: /usr/bin/X (0x400000+0xa21f7) [0x4a21f7]
(EE) 9: /lib64/libpthread.so.0 (0x7f9fd8fe6000+0x10bf0) [0x7f9fd8ff6bf0]
(EE) 10: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f9fd3264000+0x6388d) [0x7f9fd32c788d]
(EE) 11: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f9fd3264000+0xdd42a) [0x7f9fd334142a]
(EE) 12: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f9fd3264000+0x93c12) [0x7f9fd32f7c12]
(EE) 13: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f9fd3264000+0x4c153c) [0x7f9fd372553c]
(EE) 14: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0x7f9fd3264000+0x49d089) [0x7f9fd3701089]
(EE) 15: /usr/bin/X (BlockHandler+0x44) [0x43e364]
(EE) 16: /usr/bin/X (WaitForSomething+0x11d) [0x59016d]
(EE) 17: /usr/bin/X (0x400000+0x39f52) [0x439f52]
(EE) 18: /usr/bin/X (0x400000+0x28dc4) [0x428dc4]
(EE) 19: /lib64/libc.so.6 (__libc_start_main+0xed) [0x7f9fd7c8760d]
(EE) 20: /usr/bin/X (0x400000+0x2910d) [0x42910d]
(EE)
(EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
(EE) [mi] mieq is *NOT* the cause.  It is a victim.
(EE) [mi] EQ overflow continuing.  100 events have been dropped.

This repeats a few times, with the dropped events increasing, until:
Code:
[953276.863] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x00007504)
[953276.863] [mi] Increasing EQ size to 512 to prevent dropped events.
[953276.864] [mi] EQ processing has resumed after 643 dropped events.
[953276.864] [mi] This may be caused my a misbehaving driver monopolizing the server's resources.
[953279.864] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000ca84)
[953286.864] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x0000ca84)
[953289.865] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000fe6c)
[953296.865] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x0000fe6c)
[953299.866] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000124c)
[953306.866] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x0000124c)
[953309.867] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000249c)
[953316.867] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x0000249c)
[953319.868] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000467c)
[953326.868] (WW) NVIDIA(0): WAIT (1, 8, 0x8000, 0x000047ac, 0x0000467c)


I have saved the relevant logs from the previous two times this crash happened. Those two crashes were on a different system (Core 2 Duo -> Core i5) with different archs (x86 -> amd64), but with the same graphics card (GeForce GTX 550 Ti). Here and here.

The post by PaulBredbury made me check which clocksource my system is using. It happens to be TSC already, so that doesn't seem to be the problem. I might try out the mieq patch for Xorg, but I'm not sure whether that will have any effect - since the dropped events exceed 512 in any case. The log even says "EQ processing has resumed after X dropped events.", yet X stays frozen and there's nothing I can do besides a hard reset.
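
The NVIDIA WAIT warnings in the log above carry timestamps, so their spacing can be measured rather than eyeballed. A minimal Python sketch, assuming only the line layout shown in this thread; `wait_gaps` and `WAIT_RE` are hypothetical names, and the meaning of the WAIT arguments is not something the log explains.

```python
import re

# Matches lines such as:
# [953279.864] (WW) NVIDIA(0): WAIT (2, 8, 0x8000, 0x000047ac, 0x0000ca84)
WAIT_RE = re.compile(r"\[\s*(\d+\.\d+)\] \(WW\) NVIDIA\(\d+\): WAIT \((\d+),")


def wait_gaps(log_text):
    """Return (first WAIT argument, seconds since the previous WAIT line) pairs."""
    gaps = []
    previous = None
    for line in log_text.splitlines():
        match = WAIT_RE.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        if previous is not None:
            gaps.append((int(match.group(2)), round(stamp - previous, 3)))
        previous = stamp
    return gaps
```

On the excerpt above the gaps alternate between about 3 seconds (before each WAIT (2, ...) line) and about 7 seconds (before each WAIT (1, ...) line), a steady 10-second cycle while X stays frozen.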

It's a really puzzling error, that's for sure.
_________________
This post was created by millions of tiny cows jumping around on my keyboard.

PaulBredbury
Watchman

Joined: 14 Jul 2005
Posts: 7310

Posted: Tue Oct 15, 2013 1:28 am

Ivion wrote:
the dropped events exceed 512 in any case

Yes but that's *after* it's diverted effort to moan, when 256 is hit.

Try 0001-mieq-Bump-default-queue-size-to-512.patch, because it seems to have fixed the issue for me ;)

Edit: This patch has been committed to the xorg-server 1.15 branch.

bandurvp
n00b

Joined: 04 Dec 2010
Posts: 17

Posted: Sat Jan 04, 2014 3:21 pm

I have the same problem on a mid-2010 MacBook, have had it since I got the laptop, and have been unable to find a solution. I would try the clocksource approach, but though /proc/cpuinfo shows ``tsc'' and ``constant_tsc'', /sys/bus/clocksource/devices/clocksource0/available_clocksource only shows ``hpet acpi_pm''. I also asked a question about this issue here. I apologize, but I only came across this thread when I also discovered the ``EQ'' messages in the Xorg log.
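
The clocksource check described above can be scripted. This is a minimal Python sketch, assuming a Linux system that exposes the usual sysfs files; `read_clocksources` and `CLOCKSOURCE_DIR` are hypothetical names, and the default directory is the common /sys/devices/system location rather than the /sys/bus path quoted in this post (either can be passed as `base`).

```python
from pathlib import Path

# Common sysfs location on Linux (an assumption for any given kernel).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

If "tsc" is missing from the second value, as on the MacBook described here, the kernel is not offering it as a clocksource, and the remaining choice is between the names that are listed.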

PaulBredbury
Watchman

Joined: 14 Jul 2005
Posts: 7310

Posted: Sat Jan 04, 2014 3:39 pm

Try acpi_pm (it's probably quicker) instead of hpet.

Try the patch that, ahem, I keep mentioning.

Don't use nvidia 331.20 - it messes up apps.
_________________
Improve your font rendering and ALSA sound

bandurvp
n00b

Joined: 04 Dec 2010
Posts: 17

Posted: Sat Jan 04, 2014 4:04 pm

Thanks very much for the reply, PaulBredbury. I have downgraded nvidia-drivers to 319.76 and changed the clocksource to acpi_pm, which seems to have at least improved the situation: I have not yet succeeded in reproducing the problem in the usual way. I'll wait on the patch until Xorg 1.15 makes it into Portage, simply because I haven't worked with custom patches before. Thanks again!

byebytoad
n00b

Joined: 10 Mar 2014
Posts: 2

Posted: Mon Mar 10, 2014 10:45 am

PaulBredbury wrote:


Edit: Bah, still not fixed completely. 0001-mieq-Bump-default-queue-size-to-512.patch should help.

Edit2: I've had no problems after changing the clocksource and adding Fedora's "512 queue size" patch 8)


Hi, I encountered the same issue and wanted to ask: how can I apply the patch?

I'm on a Debian testing installation.

krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Mon Mar 10, 2014 1:38 pm

http://www.cyberciti.biz/faq/appy-patch-file-using-patch-command/

But considering how hard that patch is, you can just edit the file with your favourite text editor and replace 256 with 512

byebytoad
n00b

Joined: 10 Mar 2014
Posts: 2

Posted: Mon Mar 10, 2014 5:50 pm

krinn wrote:
http://www.cyberciti.biz/faq/appy-patch-file-using-patch-command/

But considering how hard that patch is, you can just edit the file with your favourite text editor and replace 256 with 512


Thanks for your reply.
My issue would be figuring which file should be edited.
Sorry for my dumbness but I would appreciate if anyone could tell me which file needs to be edited.

I looked into the patch hoping to find the answer, but I didn't find it, and it doesn't seem to have been mentioned in the discussion either.
Only thing I understood is that it should be a xorg file, and I doubt it's the xorg.conf one.
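
A patch in unified diff format names the file it changes in its own header lines, the ones starting with "--- " and "+++ ". The Python sketch below pulls those names out; `patched_files` is a hypothetical helper, and it assumes the patch is an ordinary unified diff such as git produces. Going by the patch name, the target is most likely mi/mieq.c in the xorg-server source tree, but that is an assumption this thread does not confirm, so the header is the thing to trust.

```python
def patched_files(patch_text):
    """List the target paths named in the '+++ ' headers of a unified diff.

    Limitation: an added line whose own content starts with '++ ' would
    also match, which this sketch does not try to rule out.
    """
    targets = []
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            # Drop the marker and any tab-separated timestamp after the path.
            path = line[4:].split("\t", 1)[0].strip()
            # git-style diffs prefix the new file with "b/".
            if path.startswith("b/"):
                path = path[2:]
            targets.append(path)
    return targets
```

Since the change is made in the server's source code and not in xorg.conf, it only takes effect after xorg-server has been rebuilt from the edited source.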

F1r31c3r
Tux's lil' helper

Joined: 31 Aug 2007
Posts: 107
Location: UK

Posted: Fri Jan 09, 2015 3:49 pm    Post subject: seems more common than not

Seems to be a common issue even with the latest 346.22 drivers.

Quote:
Error [mi] EQ overflowing. Additional events will be discarded until existing events are processed.
Error
Error Backtrace:
Error 0: /usr/bin/X (mieqEnqueue+0x22b) [0x5b624b]
Error 1: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 3: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 4: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 5: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 6: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 9: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 10: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
Error [mi] mieq is *NOT* the cause. It is a victim.
Error [mi] EQ overflow continuing. 100 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 200 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 300 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 400 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 500 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 600 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 700 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 800 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 900 events have been dropped.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Error [mi] EQ overflow continuing. 1000 events have been dropped.
Error [mi] No further overflow reports will be reported until the clog is cleared.
Error
Error Backtrace:
Error 0: /usr/bin/X (QueuePointerEvents+0x52) [0x458132]
Error 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x51ea) [0x7f505a41feba]
Error 2: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x5ac0) [0x7f505a421490]
Error 3: /usr/bin/X (DPMSSupported+0x188) [0x488d28]
Error 4: /usr/bin/X (xf86SerialModemClearBits+0x22a) [0x4b9c4a]
Error 5: /lib64/libc.so.6 (killpg+0x40) [0x7f5062815f8f]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 6: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b587cc0]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 7: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (?+0x40) [0x7f505b588380]
Error 8: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (nvidiaAddDrawableHandler+0x4203a) [0x7f505b628c0a]
Error unw_get_proc_name failed: no unwind info found [-10]
Error 9: /usr/lib64/opengl/nvidia/extensions/libglx.so (?+0x4203a) [0x7f505f5e7aa0]
Error
Information [ 9820.526] [mi] Increasing EQ size to 1024 to prevent dropped events.
Information [ 9820.526] [mi] EQ processing has resumed after 1689 dropped events.
Information [ 9820.526] [mi] This may be caused my a misbehaving driver monopolizing the server's resources.


The 346.22 driver also causes a lot of screen tearing when settings are full on. I'm not sure where the problem lies: it could be the nvidia driver or part of the Xorg, EGL, or Mesa libraries.

Looking at the dates of the posts in this thread, one can assume it is an ongoing error from driver 331 to the present.
The Nvidia XIDs do point to it being a driver bug. I have filed a support ticket with Nvidia. No response as yet; does anyone else have any ideas?
_________________
A WikI, A collection of mass misinformation based on opinion and manipulation by a deception of freedom.
If we know the truth, then we should be free from deception (John 8:42-47 )

mani001
Guru

Joined: 04 Dec 2004
Posts: 481
Location: Oleiros

Posted: Tue Nov 17, 2015 10:30 pm

I just hit this bug :?

Any progress on this?

F1r31c3r
Tux's lil' helper

Joined: 31 Aug 2007
Posts: 107
Location: UK

Posted: Tue Nov 17, 2015 10:58 pm

mani001 wrote:
I just hit this bug :?

Any progress on this?


Yes, I fixed this error while collaborating with Nvidia.

It is a hardware fault due to a design error on the cooling fan/heatsink.

Which graphics card do you have, is it the GTX 700 series?
_________________
A WikI, A collection of mass misinformation based on opinion and manipulation by a deception of freedom.
If we know the truth, then we should be free from deception (John 8:42-47 )

mani001
Guru

Joined: 04 Dec 2004
Posts: 481
Location: Oleiros

Posted: Tue Nov 17, 2015 11:01 pm

I just purchased a GT 720...please tell me they still care about it :D
Back to top
View user's profile Send private message
F1r31c3r
Tux's lil' helper
Tux's lil' helper


Joined: 31 Aug 2007
Posts: 107
Location: UK

PostPosted: Tue Nov 17, 2015 11:13 pm    Post subject: Reply with quote

mani001 wrote:
I just purchased a GT 720...please tell me they still care about it :D


Well, I hope they do; they did in my case back in February of this year.

The problem is that the card and heat-sink are only in good thermal contact provided the card is perfectly straight. In a tower case the weight of the card and heat-sink causes the PCB to droop, so the heat-sink makes poor thermal contact with the GPU and memory chips.

There is no support for the board whatsoever other than the PCB screws on the case form factor's bracket. If the card droops any more than 2' it parts contact with the GPU. Remember that the PCIe power connectors at the end of the card also tend to pull the card down, causing the card to fail and then produce this error.

They replaced mine with a refurbished card but if yours is still within the 14 day period look at changing it for one that has full support for heat-sink and PCB attached to the case form factors bracket.

I have a GTX870 and that has the heat-sink fully supported as does the GTX970 which is also fully supported.
_________________
A WikI, A collection of mass misinformation based on opinion and manipulation by a deception of freedom.
If we know the truth, then we should be free from deception (John 8:42-47 )

mani001
Guru

Joined: 04 Dec 2004
Posts: 481
Location: Oleiros

Posted: Wed Nov 18, 2015 8:06 am

If you are implying it's a temperature thing (because the card doesn't get cooled properly, right?), I don't think that is the problem for me because

- it happens mostly when playing movies with VLC or mpv. However, I have tested the card with a couple of games (Portal and Trine 2) and, although these are not very demanding games, it works like a charm.
- the problem doesn't occur in Windows (when playing the same movie with VLC)
- with the nouveau drivers it works just fine
- the card is actually very light, and doesn't droop much. Also, I would think that this kind of problem depends on the particular manufacturer, not on the nvidia hardware. My card is MSI, but I don't know if that's good or bad...

Maybe, it is the same problem you had but it seems far-fetched to me :roll:

Anyway, thanks for sharing your experience.

Cheers!!

F1r31c3r
Tux's lil' helper

Joined: 31 Aug 2007
Posts: 107
Location: UK

Posted: Wed Nov 18, 2015 9:03 am

No, it is the cause of the problem. It causes damage. The error can be worked around via the driver.
If you watch your logs you will see errors; these cause random, odd instability. Sometimes it will work perfectly and then, out of the blue, it will fail at random.
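
For anyone who wants to watch the logs for these errors, here is a minimal Python sketch that picks the NVRM Xid lines out of kernel log text and counts them per Xid code. The function names and the regular expression are illustrative assumptions, written only against the log format quoted in this thread; other driver versions may word the lines differently.

```python
import re
from collections import Counter

# Assumed line format, taken from the log lines quoted in this thread:
#   NVRM: Xid (PCI:0000:82:00): 32, Channel ID 00000003 intr 00004000
XID_RE = re.compile(
    r"NVRM: Xid \((?P<bus>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+),\s*(?P<detail>.*)"
)


def parse_xid_lines(lines):
    """Return a list of (bus, code, detail) tuples, one per Xid line found."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match["bus"], int(match["code"]), match["detail"].strip())
            )
    return events


def count_xid_codes(lines):
    """Count how many times each Xid code appears in the given lines."""
    return Counter(code for _bus, code, _detail in parse_xid_lines(lines))


# Sample input copied from the log excerpts in this thread.
SAMPLE = [
    "NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
    "NVRM: Xid (PCI:0000:82:00): 32, Channel ID 00000003 intr 00004000",
    "NVRM: Xid (PCI:0000:82:00): 62, b6a8(1f88) 00000000 00000000",
]

for code, count in sorted(count_xid_codes(SAMPLE).items()):
    print(f"Xid {code}: {count} occurrence(s)")
```

The same two functions can be fed the lines of a saved dmesg or syslog file instead of SAMPLE. Lines that do not carry an Xid code, such as the os_schedule message, are simply skipped.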

Nvidia provide a software tool that uses the Xid codes and/or errors to help diagnose problems. Here is what I found:

Initial errors:

Quote:
Errors puked out are:

NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

NVRM: Xid (PCI:0000:82:00): 32, Channel ID 00000003 intr 00004000


Then I started changing the PCIe bus speed with the driver options.

Quote:
Switched the PCIe bus speed from 5 GT/s to 8 GT/s to see what changes for the card.

With it operating at 8 GT/s and IOMMU set to off, the errors are:

01/23/15 03:52:46 PM kernel [ 519.705948] NVRM: GPU at PCI:0000:82:00: GPU-*****************************
01/23/15 03:52:46 PM kernel [ 519.705955] NVRM: Xid (PCI:0000:82:00): 32, Channel ID 00000003 intr 00004000
01/23/15 03:52:46 PM kernel [ 519.706110] NVRM: Xid (PCI:0000:82:00): 32, Channel ID 00000003 intr 00004000
01/23/15 03:52:56 PM kernel [ 529.961478] NVRM: Xid (PCI:0000:82:00): 62, b6a8(1f88) 00000000 00000000



You do not get this kind of output in Windows.
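
When switching between 5 GT/s and 8 GT/s as described above, it can help to confirm what the link actually negotiated. The following is a rough sketch, assuming a kernel that exposes the current_link_speed and max_link_speed attributes under /sys/bus/pci/devices (older kernels may not have them). The helper name read_link_speed is hypothetical, and the device address 0000:82:00.0 is an assumption built from the bus address in the logs above plus function 0.

```python
from pathlib import Path


def read_link_speed(device, sysfs_root="/sys/bus/pci/devices"):
    """Return (current, maximum) PCIe link speed strings for a PCI device.

    Each entry is None when the corresponding sysfs attribute is missing,
    for example on kernels that do not expose it or for a wrong address.
    """
    dev_dir = Path(sysfs_root) / device
    speeds = []
    for name in ("current_link_speed", "max_link_speed"):
        attr = dev_dir / name
        speeds.append(attr.read_text().strip() if attr.is_file() else None)
    return tuple(speeds)


if __name__ == "__main__":
    # Assumed address: bus 0000:82:00 from the logs, function 0.
    current, maximum = read_link_speed("0000:82:00.0")
    print(f"current: {current}, max: {maximum}")
```

If both values come back as None, the attributes are not available on that system and the check has to be done some other way.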

Different driver versions behave differently; no idea why.

When I changed the card to a GTX970 the problem was solved. A replacement GTX770 I got also fixed the problem briefly, but it returned a month or so later.
Tech support tried to pass it off as bad memory or a bad motherboard, which it clearly was not.

Here is the first full response I got back from tech support. Yes, they were wrong: when I bought a new GTX970 and plugged it into the same computer with the same hardware, the problem was solved...

Quote:
Hello,

This is a very odd error. From what I find, this appears to be an issue perhaps with either your motherboard or your RAM. I would test your system with one stick of RAM, cycle between the other sticks, if you have any.

Also, I would suggest testing this card in another system to rule out any potential issue with software, as well. If you find that the card works in another system, then there is definitely an issue with your build.


If you really cannot make heads or tails of the logs and debug output, and you have a friend with a different GTX card, try that card in the system and see if you still get the errors.

Regarding manufacturer: vendors such as MSI, Asus, EVGA etc. do not solder up or etch the cards. They are limited in what they can do to the card: mods to clock speeds, heat sink and fans, logo designs and so on. The basic card and the chips used are all the same regardless of vendor.

Either way, this is what my issue was and how I found and fixed it. It could well be a firmware/GPU-BIOS issue. The thought had crossed my mind, but the firmware needed to re-flash these cards is not available, or at least not for my card model.

That is about all I can say about it really. The problem was with the card itself and not anything else. If you are using the latest driver and it happens on your 700 series card like it did with my 770 (and note we have had multiple driver updates since December 2014, when my problem began), then it is either a bug in the driver that has not been fixed, firmware becoming corrupt, or damage to the card due to overheating.
Your guess is as good as mine.

My conclusion is the 700 series seems to have issues.