Gentoo Forums
i915: GPU hung, declared wedged.... tips?
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Fri Feb 01, 2013 12:05 am    Post subject: i915: GPU hung, declared wedged.... tips?

I use a Lenovo ThinkPad X201i with Gentoo and have been fully satisfied with it. Recently, however, I have been experiencing some issues:

Problem: while working in GNOME (v3.6.x), the interface freezes at some point. I can still move the mouse and the cursor follows, but the clock, for instance, is frozen. After several seconds the active window goes black. Then I get the black screen that says "Sorry, something has gone wrong and the system cannot recover. Call a system administrator".

The keyboard is responsive. I use CTRL-ALT-F1 to get to a tty, log in as root, and restart XDM. Then, X will restart, but before the GDM login screen appears, I get the same error message: "Sorry, something has gone wrong and the system cannot recover. Call a system administrator"

After trying several things, I have a feeling that it is an issue with the Intel i915 driver. A snippet from /var/log/messages:

Code:

Jan 31 16:04:17 rine50 kernel: [ 5887.075138] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jan 31 16:04:17 rine50 kernel: [ 5887.075145] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Jan 31 16:04:19 rine50 kernel: [ 5888.691015] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jan 31 16:04:19 rine50 kernel: [ 5888.691326] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
Jan 31 16:04:19 rine50 kernel: [ 5888.691334] [drm:i915_reset] *ERROR* Failed to reset chip.


Searching on Google, I found this error reported, but always with Linux kernel 2.6.3x; I use 3.7.4. A more recent report pointed to a buggy combination of BIOS and hardware on a particular type of Intel motherboard, but in my case the laptop was fine until now. Am I looking at broken hardware? If so, how can I find out? Or can I somehow switch off DRM and use the "old-style" GNOME instead of the "new-style" GNOME? I have also used the laptop with Xsession (twm); in that case the error does not occur, but I have not run twm long enough to draw a definitive conclusion.

Any tips are welcome. Is there a way to "stress test" the GPU?
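
For anyone who wants to track how often this happens, the hang pattern in the log excerpt above can be picked out mechanically. The following is only a rough Python sketch: the function name `summarize_i915_hangs` and the regular expression are my own assumptions, and it only recognizes the exact message wording quoted in this post, which may differ between kernel versions.

```python
import re

# Captures the kernel timestamp and the message text from a syslog line such as
# "Jan 31 16:04:17 rine50 kernel: [ 5887.075138] [drm:i915_hangcheck_hung] ..."
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(.*)")


def summarize_i915_hangs(lines):
    """Return (list of hang timestamps in seconds, whether the GPU was declared wedged)."""
    hangs = []
    wedged = False
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a kernel line in the expected format
        stamp = float(match.group(1))
        message = match.group(2)
        if "GPU hung" in message:
            hangs.append(stamp)
        if "declaring wedged" in message:
            wedged = True
    return hangs, wedged


# The five lines quoted from /var/log/messages in this post.
SAMPLE = """
Jan 31 16:04:17 rine50 kernel: [ 5887.075138] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jan 31 16:04:17 rine50 kernel: [ 5887.075145] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
Jan 31 16:04:19 rine50 kernel: [ 5888.691015] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Jan 31 16:04:19 rine50 kernel: [ 5888.691326] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
Jan 31 16:04:19 rine50 kernel: [ 5888.691334] [drm:i915_reset] *ERROR* Failed to reset chip.
"""

hangs, wedged = summarize_i915_hangs(SAMPLE.splitlines())
print(len(hangs), wedged)
```

To run it against a real log, pass the lines of /var/log/messages (or a saved copy) instead of `SAMPLE`.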
BillWho
Veteran

Joined: 03 Mar 2012
Posts: 1576
Location: US

Posted: Fri Feb 01, 2013 1:15 am

WvR,

Did you check /sys/kernel/debug/dri/0/i915_error_state? There may be better clues there.
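
Since the error state is only useful if it is captured after a hang, it may help to copy it somewhere persistent right away. A minimal Python sketch, assuming debugfs is mounted at /sys/kernel/debug and the script runs as root; the helper name `save_error_state` and the destination path in the usage note are hypothetical:

```python
from pathlib import Path

# Path taken from this thread; it only exists when debugfs is mounted there.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")


def save_error_state(dest, source=ERROR_STATE):
    """Copy the error state dump to dest and return the number of bytes written."""
    # Read the whole file in one go: debugfs files do not report a useful size,
    # so read until end-of-file instead of relying on file metadata.
    data = Path(source).read_bytes()
    Path(dest).write_bytes(data)
    return len(data)
```

For example, `save_error_state("/root/i915_error_state.txt")` would keep a copy that survives the reboot.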
_________________
Good luck :wink:

Since installing gentoo, my life has become one long emerge :)
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Fri Feb 01, 2013 6:41 am

It happened again.... This time I copied the i915_error_state. It does not help much: it is a very long list of register contents in hexadecimal.

Just today the Intel driver (xf86-video-intel) was updated, but apparently that does not help.....
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Fri Feb 01, 2013 7:35 am

I found this thread

http://www.gossamer-threads.com/lists/linux/kernel/1617936

It seems that I am not the only one. I guess I will downgrade to 3.6.11 on the laptop (there is no real reason to use the ~amd64 kernel anyway)
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Sat Feb 02, 2013 11:24 pm

Downgrading the kernel to 3.6.11 did not help. Yesterday evening I had two "crashes" in 10 minutes. The most irritating part is that you have to restart the computer to recover. Simply restarting X does not help because somehow the GPU cannot be "reset". Next try: downgrade the Intel driver from 2.20.19-r1 to 2.20.13. Wish me luck...
BillWho
Veteran

Joined: 03 Mar 2012
Posts: 1576
Location: US

Posted: Sun Feb 03, 2013 12:59 am

WvR,

Did you add or change any settings in /etc/X11/xorg.conf.d/20-intel.conf?

Is DRM_I915 built-in or a module?

Have a look at x11-apps/intel-gpu-tools. Maybe some tests can provide a clue.
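
If you are not sure about the DRM_I915 question, the kernel configuration answers it. Below is a small Python sketch that checks the text of a kernel config for the `CONFIG_DRM_I915` symbol; the function name `drm_i915_mode` is made up for illustration, and where the config lives (often /usr/src/linux/.config on Gentoo) is an assumption about your setup.

```python
def drm_i915_mode(config_text):
    """Report how CONFIG_DRM_I915 is set in the given kernel config text."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == "CONFIG_DRM_I915=y":
            return "built-in"
        if line == "CONFIG_DRM_I915=m":
            return "module"
    # Covers both a missing symbol and "# CONFIG_DRM_I915 is not set".
    return "not set"


# Tiny made-up config fragment for demonstration only.
example = "CONFIG_DRM=y\nCONFIG_DRM_I915=m\nCONFIG_DRM_I915_KMS=y\n"
print(drm_i915_mode(example))
```

On a real system you would pass in the contents of your own config file rather than the example string.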
_________________
Good luck :wink:

Since installing gentoo, my life has become one long emerge :)
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Mon Feb 04, 2013 9:28 am

No changes to anything. These problems seem to have started without a clearly identifiable cause. That is one of the reasons why I suspect hardware problems.

I downgraded xf86-video-intel to the stable version. Let's see if this brings any improvement.
toralf
Advocate

Joined: 01 Feb 2004
Posts: 2411
Location: Hamburg/Germany

Posted: Mon Feb 04, 2013 10:56 am

WvR wrote:
It happened again.... This time I copied the i915_error_state. It does not give much help.
Well, that content is not intended to be readable by an ordinary user. Just file a bug at https://bugzilla.kernel.org and attach the content of that file.
WvR
Tux's lil' helper

Joined: 03 Mar 2011
Posts: 111
Location: Tsuruga, Japan

Posted: Thu Feb 07, 2013 5:40 am    Post subject: [solved] i915: GPU hung, declared wedged.... tips?

Since I downgraded to x11-drivers/xf86-video-intel v2.20.13 the problem has not returned, so I am declaring it "solved" for the time being.