hoacker Guru


Joined: 04 Aug 2007 Posts: 505 Location: Bürstadt, Germany
Posted: Thu Jan 09, 2014 6:47 pm Post subject: Kernel panic. Please help interpreting messages. How fix?
Hi. I just had a kernel panic; the messages are in the linked image. Can anyone tell me what they mean and what I could try to fix the problem? Thank you in advance!
Machine: Laptop Fujitsu Lifebook T901
Kernel: 3.10.7-gentoo-r1
Messages: http://www.schwedenstuhl.de/tmp/E7D_8365.png
 |
N8Fear Tux's lil' helper


Joined: 15 Apr 2013 Posts: 140 Location: Berlin (Germany)
Posted: Thu Jan 09, 2014 8:00 pm
An MCE (machine check exception) points towards broken hardware. First try booting another kernel (possibly from a live CD/DVD/USB stick). If that works, your current installation or kernel image is likely broken.
More likely, though, you'll hit the same (or a similar) issue.
In that case you'll have to look at your hardware. Likely culprits for such errors are the CPU, RAM, mainboard...
I recently ran into a similar issue caused by a dead CMOS battery (strange, IMHO, but replacing the battery fixed it for me). Hardware diagnosis in your case will likely be tough, since there isn't much you can do to a laptop without professional help (e.g. swapping in replacement parts to cross-check components).
If changing the kernel does resolve the issue, you may have luck creating a new configuration...
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55212 Location: 56N 3W
Posted: Thu Jan 09, 2014 9:11 pm
hoacker,
At face value, the message means that your CPU has a dead core.
By all means try other kernels.
Also go into your BIOS and turn off all cores except core 0, which cannot be disabled.
Reboot. Enable cores one at a time until it fails again. Turn off the failing core permanently.
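For anyone who wants to narrow this down without a trip to the BIOS for every step, here is a minimal sketch of a runtime approximation. It assumes a Linux kernel with CPU hotplug support, which exposes an `online` file per logical CPU under /sys/devices/system/cpu. This is not the BIOS procedure described above. The helper names `set_core_online` and `is_core_online` are made up for illustration, and the numbers refer to logical CPUs, which on a Hyper-Threading CPU do not map one-to-one to physical cores.

```python
from pathlib import Path

# Standard Linux sysfs location for per-CPU controls.
CPU_SYSFS = "/sys/devices/system/cpu"


def set_core_online(cpu, online, sysfs_root=CPU_SYSFS):
    """Bring logical CPU `cpu` online (True) or take it offline (False).

    Needs root on a real system. CPU 0 is refused, in line with the
    remark that core 0 cannot be disabled.
    """
    if cpu == 0:
        raise ValueError("CPU 0 cannot be taken offline here")
    control = Path(sysfs_root) / f"cpu{cpu}" / "online"
    if not control.exists():
        raise FileNotFoundError(f"{control} is missing (no hotplug support?)")
    control.write_text("1" if online else "0")


def is_core_online(cpu, sysfs_root=CPU_SYSFS):
    """Return True if the kernel reports logical CPU `cpu` as online."""
    control = Path(sysfs_root) / f"cpu{cpu}" / "online"
    return control.read_text().strip() == "1"
```

Run as root, for example `set_core_online(3, False)` to take logical CPU 3 offline. The setting does not survive a reboot, so a permanent fix still belongs in the BIOS.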
If the CPU is still under warranty, get it replaced.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
 |
hoacker Guru


Joined: 04 Aug 2007 Posts: 505 Location: Bürstadt, Germany
Posted: Thu Jan 09, 2014 9:45 pm
NeddySeagoon wrote: | At face value, the message means that your CPU has a dead core. |
Dead core?
During the day I use the laptop in my office for engineering applications under Windows that demand a lot of CPU time (multiple cores) and RAM. I cannot remember the machine having problems then. I have had hard lockups with various Linux kernels far too often since I got the laptop (3 years or so ago), but I have never seen a message like that.
Well, business and Portage made the CPU produce a lot of hot air. Could be that it's burned out... I'll see if there are more hiccups in the near future.
Thanks for the suggestions so far (N8Fear & Neddy). I'll switch back to my previous kernel and see what happens.
 |
N8Fear Tux's lil' helper


Joined: 15 Apr 2013 Posts: 140 Location: Berlin (Germany)
Posted: Thu Jan 09, 2014 10:16 pm
A "known good" kernel is definitely worth a try. If that works you could also try to configure a new one from scratch.
An MCE (the error you encountered) occurs when the CPU detects a hardware problem. Hence the "does not look good" outlook we gave you. Possible causes of such errors include overheating, power supply problems, bad contacts (e.g. a loosely seated CPU) or dead hardware, in case you want to investigate further...
 |
aCOSwt Bodhisattva

Joined: 19 Oct 2007 Posts: 2537 Location: Hilbert space
Posted: Thu Jan 09, 2014 11:25 pm
Over-overclocking?
 |
hoacker Guru


Joined: 04 Aug 2007 Posts: 505 Location: Bürstadt, Germany
Posted: Thu Jan 09, 2014 11:32 pm
aCOSwt wrote: | Over-overclocking? |
Naaa! Not a single Hz...
 |
Goverp Advocate


Joined: 07 Mar 2007 Posts: 2245
Posted: Fri Jan 10, 2014 12:42 pm
I tried running it through mcelog --ascii as the messages suggested, as follows:
Code: |
acer ~# mcelog --ascii
CPU 3: Machine Check Exception: 5 Bank 4: b200000011000402
RIP !INEXACT! 10:<ffffffff813b44ed> {intel_idle+0x9d/0x100}
TSC 90b1a236131
PROCESSOR 0:206a7 TIME 1389290654 SOCKET 0 APIC 3 microcode 1a |
and got the following output:
Code: |
Hardware event. This is not a software error.
CPU 3 BANK 4 TSC 90b1a236131
RIP !INEXACT! 10:ffffffff813b44ed
TIME 1389290654 Thu Jan 9 18:04:14 2014
MCG status:RIPV MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Internal unclassified error: 402
PCU: No error <24:11>
STATUS b200000011000402 MCGSTATUS 5
CPUID Vendor Intel Family 6 Model 42
RIP: intel_idle+0x9d/0x100}
SOCKET 0 APIC 3 microcode 1a |
It doesn't seem to have added much value, apart from confirming that it's really a hardware issue.
I guess the "Processor context corrupt" piece means that I'm running it on a different machine, so you might get better results running it on your own box, if you can get Gentoo or another Linux to boot.
Check that I've typed the diagnostic output in correctly too!
_________________
Greybeard
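The decoded lines can be cross-checked by hand against the raw values. The sketch below assumes the flag layout of the IA32_MCi_STATUS and IA32_MCG_STATUS registers as documented in the Intel SDM (bits 63 to 57 and bits 0 to 2 respectively) and the usual CPUID signature layout. The helper names `set_flags` and `decode_cpuid` are made up for illustration. Note that "Processor context corrupt" is simply bit 57 (PCC) of the recorded STATUS value, so it is part of the logged error itself and does not depend on which machine runs the decoder.

```python
from datetime import datetime, timezone

# (bit, name) pairs for the architectural flags of IA32_MCi_STATUS.
MCI_STATUS_FLAGS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting enabled
    (59, "MISCV"),  # MISC register valid
    (58, "ADDRV"),  # ADDR register valid
    (57, "PCC"),    # processor context corrupt
]

# (bit, name) pairs for IA32_MCG_STATUS.
MCG_STATUS_FLAGS = [(0, "RIPV"), (1, "EIPV"), (2, "MCIP")]


def set_flags(value, table):
    """Return the names of all bits listed in `table` that are set in `value`."""
    return [name for bit, name in table if (value >> bit) & 1]


def decode_cpuid(signature):
    """Split a CPUID signature into (family, model, stepping)."""
    stepping = signature & 0xF
    model = (signature >> 4) & 0xF
    family = (signature >> 8) & 0xF
    ext_model = (signature >> 16) & 0xF
    ext_family = (signature >> 20) & 0xFF
    if family in (6, 15):
        model |= ext_model << 4   # extended model applies to families 6 and 15
    if family == 15:
        family += ext_family      # extended family applies to family 15 only
    return family, model, stepping


# Values copied from the panic message quoted in this thread.
status = 0xB200000011000402
mcgstatus = 0x5

print("STATUS flags:   ", set_flags(status, MCI_STATUS_FLAGS))
print("MCGSTATUS flags:", set_flags(mcgstatus, MCG_STATUS_FLAGS))
print("MCA error code: ", hex(status & 0xFFFF))
print("CPUID 206a7:    ", decode_cpuid(0x206A7))
print("TIME (UTC):     ", datetime.fromtimestamp(1389290654, tz=timezone.utc).isoformat())
```

For the values posted here this gives VAL, UC, EN and PCC for STATUS, RIPV and MCIP for MCGSTATUS, MCA error code 0x402, family 6 model 42 stepping 7, and 18:04:14 UTC on 9 Jan 2014, which agrees with the mcelog output.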
 |
hoacker Guru


Joined: 04 Aug 2007 Posts: 505 Location: Bürstadt, Germany
Posted: Fri Jan 10, 2014 1:17 pm
Thank you. Gentoo boots fine on that machine; the crash occurred after about 2 hours of running. I'll try mcelog later at home. I guess I need to install it first.
Five minutes ago the machine beeped once, which it has never done before (apart from during the BIOS self-test at power-on). Could be another sign that I'll have to find a replacement soon...
 |
hoacker Guru


Joined: 04 Aug 2007 Posts: 505 Location: Bürstadt, Germany
Posted: Sat Jan 11, 2014 10:22 am
Goverp wrote: | you might get better results running it on your own box |
mcelog gives me the same output.