Gentoo Forums
Kernel panic. Please help interpreting messages. How fix?
hoacker
Guru


Joined: 04 Aug 2007
Posts: 505
Location: Bürstadt, Germany

Posted: Thu Jan 09, 2014 6:47 pm    Post subject: Kernel panic. Please help interpreting messages. How fix?

Hi. I just had a kernel panic; the messages are in the linked image. Can anyone tell me what they mean and what I could try to fix the problem? Thank you in advance!

Machine: Laptop Fujitsu Lifebook T901
Kernel: 3.10.7-gentoo-r1
Messages: http://www.schwedenstuhl.de/tmp/E7D_8365.png

N8Fear
Tux's lil' helper


Joined: 15 Apr 2013
Posts: 140
Location: Berlin (Germany)

Posted: Thu Jan 09, 2014 8:00 pm

An MCE (machine check exception) hints at broken hardware. First try booting another kernel (possibly from a live CD/DVD/stick). If that works, your current installation or kernel image is likely broken.
Most likely, though, you'll hit the same (or a similar) issue.
In that case you'll have to take a look at your hardware. Likely candidates for such errors are the CPU, RAM, mainboard...
I recently encountered a similar issue caused by an empty CMOS battery (strange, IMHO, but replacing the battery fixed it for me). Hardware diagnosis in your case will likely be tough, since there isn't much you can do to a laptop without professional help (e.g. replacement parts to cross-check components).
If the issue can be resolved by changing the kernel, you may have luck with creating a new configuration...

NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 55212
Location: 56N 3W

Posted: Thu Jan 09, 2014 9:11 pm

hoacker,

At face value, the message means that your CPU has a dead core.
By all means try other kernels.

Also go into your BIOS and turn off all cores except core 0, which cannot be disabled.
Reboot. Enable cores one at a time until it fails again. Turn off the failing core permanently.
If the CPU is still under warranty, get it replaced.
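
For anyone who would rather try this from a running system than from the BIOS, here is a minimal Python sketch of a software-side analogue. It assumes a kernel built with CPU hotplug support, which exposes an `online` file per logical CPU under `/sys/devices/system/cpu`, and it needs root to write there. The function names and the `base` parameter are made up for illustration. Note that this toggles logical CPUs, which on a hyper-threaded part are not the same thing as physical cores, so it is not a substitute for the BIOS setting.

```python
from pathlib import Path

# Standard sysfs location for per-CPU controls (assumes CPU hotplug support).
SYSFS_CPU = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu, online, base=SYSFS_CPU):
    """Bring logical CPU `cpu` online (True) or take it offline (False).

    `base` is only a parameter so the function can be tried against a
    scratch directory; on a real system it must run as root.
    """
    control = Path(base) / f"cpu{cpu}" / "online"
    if not control.exists():
        # On many x86 systems cpu0 has no 'online' file and cannot be disabled.
        raise FileNotFoundError(f"no hotplug control for cpu{cpu}")
    control.write_text("1" if online else "0")


def online_state(base=SYSFS_CPU):
    """Return {cpu_number: is_online} for every CPU that has a hotplug control."""
    states = {}
    for control in Path(base).glob("cpu[0-9]*/online"):
        cpu = int(control.parent.name[len("cpu"):])
        states[cpu] = control.read_text().strip() == "1"
    return states
```

With this, `set_cpu_online(3, False)` would take logical CPU 3 offline until the next reboot, and `online_state()` shows what is currently enabled.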
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

hoacker
Guru


Joined: 04 Aug 2007
Posts: 505
Location: Bürstadt, Germany

Posted: Thu Jan 09, 2014 9:45 pm

NeddySeagoon wrote:
At face value, the message means that your CPU has a dead core.

8O dead core?

During the day I use the laptop in my office for engineering applications under Windows that demand a lot of CPU time (multiple cores) and RAM. I cannot remember the machine having problems then. But I have had hard locks with various Linux kernels far too often since I got the laptop (3 years or so), though I have never seen a message like that.

Well, business and portage made the CPU produce a lot of hot air. Could be that it's burned... :cry: Well, I'll see if there are more hiccups in the near future.

Thanks for the suggestions so far (N8Fear & Neddy). I'll switch back to my previous kernel and see what happens.

N8Fear
Tux's lil' helper


Joined: 15 Apr 2013
Posts: 140
Location: Berlin (Germany)

Posted: Thu Jan 09, 2014 10:16 pm

A "known good" kernel is definitely worth a try. If that works you could also try to configure a new one from scratch.
An MCE (the error you encountered) occurs when the CPU detects a hardware problem. Hence the "does not look good" outlook we gave you. Things that may cause such errors are overheating, problems with the power supply, bad contacts (e.g. a loosely seated CPU) or dead hardware - just in case you want to investigate further...
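
To follow up on the overheating angle, here is a minimal Python sketch that reads whatever temperature sensors the kernel exposes as thermal zones. It assumes the standard sysfs layout under `/sys/class/thermal`, where each `thermal_zone*/temp` file holds millidegrees Celsius. Whether this particular laptop exposes any zones there is not known, and the function name and `base` parameter are made up for illustration.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone name (type)": degrees Celsius} for all readable zones."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Some zones refuse to report a value; skip them.
            continue
        readings[f"{zone.name} ({kind})"] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```

Running it while a large emerge is going and again when idle would give a rough idea of how hot the machine gets under load.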

aCOSwt
Bodhisattva


Joined: 19 Oct 2007
Posts: 2537
Location: Hilbert space

Posted: Thu Jan 09, 2014 11:25 pm

Over-overclocking ?

hoacker
Guru


Joined: 04 Aug 2007
Posts: 505
Location: Bürstadt, Germany

Posted: Thu Jan 09, 2014 11:32 pm

aCOSwt wrote:
Over-overclocking ?


Naaa! Not a single Hz... :)

Goverp
Advocate


Joined: 07 Mar 2007
Posts: 2245

Posted: Fri Jan 10, 2014 12:42 pm

I tried running it through mcelog --ascii as the messages suggested, as follows:
Code:
acer ~#mcelog --ascii
CPU 3: Machine Check Exception: 5 Bank 4: b200000011000402
RIP !INEXACT! 10:<ffffffff813b44ed> {intel_idle+0x9d/0x100}
TSC 90b1a236131
PROCESSOR 0:206a7 TIME 1389290654 SOCKET 0 APIC 3 microcode 1a

and got the following output:
Code:

Hardware event. This is not a software error.
CPU 3 BANK 4 TSC 90b1a236131
RIP !INEXACT! 10:ffffffff813b44ed
TIME 1389290654 Thu Jan  9 18:04:14 2014
MCG status:RIPV MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal unclassified error: 402
PCU: No error <24:11>

STATUS b200000011000402 MCGSTATUS 5
CPUID Vendor Intel Family 6 Model 42
RIP: intel_idle+0x9d/0x100}
SOCKET 0 APIC 3 microcode 1a

Doesn't seem to have added much value, apart from confirming that it's really a hardware issue.
The "Processor context corrupt" line appears to be decoded from the status value itself (the PCC bit), so it should not depend on which machine runs mcelog. Still, you might get better results running it on your own box, if you can get a Gentoo or other Linux to boot.
Check that I've typed the diagnostic output in correctly too!
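
As a cross-check on the typing, the raw numbers can also be decoded by hand. The following is a minimal Python sketch, not a replacement for mcelog: it only splits out the architectural flag bits of the status registers as laid out in the Intel SDM, plus the CPUID signature and the timestamp from the PROCESSOR line. The table and function names are made up for illustration.

```python
from datetime import datetime, timezone

# Flag bits of an IA32_MCi_STATUS register (architectural layout).
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds extra data
    58: "ADDRV",  # ADDR register holds an address
    57: "PCC",    # processor context corrupt
}

# Flag bits of IA32_MCG_STATUS.
MCG_STATUS_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}


def set_flags(value, table):
    """Return the names of all bits from `table` set in `value`, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if (value >> bit) & 1]


def decode_cpuid(eax):
    """Split a CPUID signature into (family, model, stepping)."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    if family in (6, 15):
        model |= ((eax >> 16) & 0xF) << 4  # fold in the extended model
    if family == 15:
        family += (eax >> 20) & 0xFF       # add the extended family
    return family, model, stepping


status = 0xB200000011000402  # from "Bank 4: b200000011000402"
mcg_status = 0x5             # from "Machine Check Exception: 5"

print("MCi status flags:", set_flags(status, MCI_STATUS_FLAGS))
print("MCG status flags:", set_flags(mcg_status, MCG_STATUS_FLAGS))
print("MCA error code:   %#06x" % (status & 0xFFFF))
print("Model-specific:   %#06x" % ((status >> 16) & 0xFFFF))
print("Family/model/stepping:", decode_cpuid(0x206A7))
print("Time (UTC):", datetime.fromtimestamp(1389290654, tz=timezone.utc))
```

The flags that come out (VAL, UC, EN and PCC for the bank status, RIPV and MCIP for the global status), the error code 402 and family 6, model 42 all agree with what mcelog printed above.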
_________________
Greybeard

hoacker
Guru


Joined: 04 Aug 2007
Posts: 505
Location: Bürstadt, Germany

Posted: Fri Jan 10, 2014 1:17 pm

Thank you. Gentoo boots fine on that machine; the crash occurred after 2 hours (or so) of running. I'll try mcelog later at home; I guess I need to install it first.

5 minutes ago the machine beeped once, which it had never done before (apart from during the BIOS test when switched on). Could be another sign that I'll have to find a replacement soon...

hoacker
Guru


Joined: 04 Aug 2007
Posts: 505
Location: Bürstadt, Germany

Posted: Sat Jan 11, 2014 10:22 am

Goverp wrote:
you might get better results running it on your own box

mcelog gives me the same output.
All times are GMT