Gentoo Forums
Computer freezing, no messages in syslog
moult
Retired Dev


Joined: 31 Mar 2008
Posts: 146
Location: Australia

Posted: Thu Sep 24, 2015 1:31 pm    Post subject: Computer freezing, no messages in syslog

This has started happening rather regularly, and it seems to be triggered by intense CPU and/or disk-write activity. The machine is a ThinkPad T420i, about 3.5-4 years old. When it freezes, the caps lock light starts blinking and, if sound was playing, the sound loops. There are no messages in syslog and no mouse or keyboard response. SysRq+REISUB doesn't do anything. I checked /dev/sda with smartmontools and it reports no issues.

What else can I do to debug/assess the issue?
_________________
thinkMoult - I write articles online. You might like some of them.
Planet Larry - do you write a blog and use Gentoo? Get your blog added to the Planet Larry Gentoo user blog aggregator!
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9679
Location: almost Mile High in the USA

Posted: Thu Sep 24, 2015 2:23 pm    Post subject:

The blinking caps lock LED means that the kernel panicked. It looks like magic SysRq doesn't work once it has panicked, so that's at least consistent... I'd make sure the RAM is good; otherwise you'd probably have to get the oops info somehow, perhaps via a serial console or something similar. Newer hardware like this probably requires special debug hardware...
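
Since SysRq+REISUB did nothing, it may also be worth ruling out that SysRq is simply restricted on this kernel, independent of the panic. The sketch below decodes a `kernel.sysrq` value into the REISUB keys it permits. The bit values are taken from the kernel's sysrq documentation as I understand it; `allowed_keys` and `read_sysrq` are made-up helper names, not part of any tool.

```python
# Sketch: decode a kernel.sysrq value to see which REISUB keys it allows.
# Helper names are hypothetical; bit meanings follow the kernel sysrq docs.
from pathlib import Path

# Bit that must be set for each key in the R-E-I-S-U-B sequence.
REISUB_BITS = {
    "R (unraw keyboard)": 4,
    "E (SIGTERM to all)": 64,
    "I (SIGKILL to all)": 64,
    "S (sync)": 16,
    "U (remount read-only)": 32,
    "B (reboot)": 128,
}

def allowed_keys(value):
    """Return {key: bool} for a kernel.sysrq value (0 = all off, 1 = all on)."""
    if value == 1:
        return {key: True for key in REISUB_BITS}
    return {key: bool(value & bit) for key, bit in REISUB_BITS.items()}

def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting; the path is the usual procfs location."""
    return int(Path(path).read_text().strip())
```

Calling `allowed_keys(read_sysrq())` on the running system shows which keys are permitted. If everything is allowed and REISUB still does nothing, that fits the panic explanation.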
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Thu Sep 24, 2015 3:22 pm    Post subject:

Check your cooling and clean out the insides, and run memtest as already suggested.
moult
Retired Dev


Joined: 31 Mar 2008
Posts: 146
Location: Australia

Posted: Fri Sep 25, 2015 9:45 pm    Post subject:

Cooling checked (if it overheats, it'll shut down, not kernel panic). I also cleaned out the insides just to be sure. memtest86+ found nothing, although I only had time for one pass; I'll leave it running for more passes to be sure.

I'm rethinking my original suspicion of CPU activity: I've just compiled 330 packages (now at 75degC) without issue. Maybe cleaning out the insides helped (it didn't seem too dirty, though; a bit of fluff but not much).
krinn
Watchman


Joined: 02 May 2003
Posts: 7470

Posted: Sat Sep 26, 2015 12:58 am    Post subject:

Moult wrote:
Cooling checked (if it overheats, it'll shut down, not kernel panic).

Not "will" shut down: it "may" or "should".

Intel CPUs (since the Pentium III era) are made to lower their clock speed in order to protect themselves from heat, and never issue a shutdown themselves.
The shutdown is a BIOS feature the motherboard may have, or a software feature (I don't know which part of the kernel handles that). It was primarily made for AMD CPUs, which lacked the downclocking feature (but I suppose they have one now too).
I once owned an ASUS motherboard for Intel with the feature, but in that case it's useless except as a selling point (it's even worse, as it is better to save your work at a shitty clock speed than to get an emergency shutdown without saving it).

Most AMD CPUs go mad from heat, and even with the shutdown-when-too-hot feature they produce errors while overheating, so most of the time the OS freezes, crashes, or reboots before the shutdown can actually happen. Heat indirectly affects Intel systems too: the temperature goes up, the case gets hot, and memory, video, or SCSI cards crash the OS...

And even when it is configured, if your shutdown trigger is set too high, it is never reached: the CPU downclocks itself -> heat goes back to normal -> the CPU clock goes back to normal -> too hot again -> downclock... and your software/BIOS side is waiting for a temperature that never arrives.

As you can see, it's really optimistic to assume a shutdown will happen on heat trouble; even when the machine is suffering from heat, you will most likely never see one.

Intel CPUs report errors through MCE events (I think AMD CPUs do too, since I remember an AMD MCE option in the kernel config, but I'm less sure whether they report heat errors through MCE).
If you want to be sure, check your MCE status (enable the kernel feature and install app-admin/mcelog).
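
Alongside mcelog, a quick way to see whether the CPU has been throttling itself is to read the per-CPU throttle counters. This is only a sketch, and it assumes an Intel x86 kernel that exposes a `thermal_throttle` directory under `/sys/devices/system/cpu/cpuN/`; on other hardware or kernel configs the files may not exist. `throttle_counts` is a made-up helper name.

```python
# Sketch: collect thermal throttle counters from sysfs.
# Assumes files like cpu0/thermal_throttle/core_throttle_count exist.
from pathlib import Path

def throttle_counts(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {counter_name: int}} for CPUs exposing counters."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/*_throttle_count"
    for counter in sorted(Path(cpu_root).glob(pattern)):
        try:
            value = int(counter.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed counter: skip it
        cpu = counter.parent.parent.name  # e.g. "cpu0"
        counts.setdefault(cpu, {})[counter.name] = value
    return counts
```

If the counters climb while the laptop is working hard, the CPU is hitting its thermal limit, whether or not a shutdown ever follows.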

But honestly, you can just guess it:
- working hard -> crash/reboot/freeze = CPU too hot
- laptop -> small space = heat trouble
So when a user reports any trouble with a laptop working hard, you can bet on heat 99.99% of the time.
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sat Sep 26, 2015 7:51 am    Post subject:

krinn wrote:
So when a user reports any trouble with a laptop working hard, you can bet on heat 99.99% of the time.

Yup; especially as people don't usually clean them out, for fear of voiding a warranty, or that they won't be able to put it back together.
moult
Retired Dev


Joined: 31 Mar 2008
Posts: 146
Location: Australia

Posted: Sat Sep 26, 2015 10:49 pm    Post subject:

You're right, it is a temperature issue. But the plot thickens!

By default (BIOS feature or CPU feature, not sure) it'll auto-shutdown at 83degC. This happened way too often for comfort, so I wrote my own script to adjust the CPU governor and fan speeds depending on the temperature. Recently I noticed that my script had hung: the command `cpupower frequency-set -g powersave` (or whatever governor) would turn into a zombie process (kill -9 has no effect) instead of properly setting the governor. I suspect this means the CPU wasn't responding to the governor request, so it worked itself too hard and the kernel panicked.
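
For anyone wondering what such a script might look like, here is a rough sketch of the governor half, not the actual script from this post. It assumes the temperature is readable from `/sys/class/thermal/thermal_zone0/temp` (in millidegrees) and that the governor can be set by writing to each CPU's `cpufreq/scaling_governor` file in sysfs, which needs root. The thresholds and the `ondemand`/`powersave` names are placeholders; the governors actually available depend on the cpufreq driver.

```python
# Sketch: temperature-driven governor choice with hysteresis.
# Thresholds, governor names, and helper names are all assumptions.
from pathlib import Path

HOT_C = 75.0   # hypothetical: switch to powersave at or above this
COOL_C = 65.0  # hypothetical: switch back only at or below this

def pick_governor(temp_c, current):
    """Hysteresis keeps the governor from flapping between the thresholds."""
    if temp_c >= HOT_C:
        return "powersave"
    if temp_c <= COOL_C:
        return "ondemand"
    return current

def read_temp_c(zone="/sys/class/thermal/thermal_zone0/temp"):
    """sysfs reports the temperature in millidegrees Celsius."""
    return int(Path(zone).read_text().strip()) / 1000.0

def set_governor(name, cpu_root="/sys/devices/system/cpu"):
    """Write the governor for every CPU; return the CPUs that were updated."""
    updated = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for node in sorted(Path(cpu_root).glob(pattern)):
        node.write_text(name)
        updated.append(node.parent.parent.name)
    return updated
```

The gap between the two thresholds is there so the governor doesn't flip back and forth around a single temperature. A loop would call `read_temp_c()`, pass the result through `pick_governor()`, and call `set_governor()` only when the choice changes.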
moult
Retired Dev


Joined: 31 Mar 2008
Posts: 146
Location: Australia

Posted: Mon Sep 28, 2015 9:35 pm    Post subject:

Hmm, perhaps it's not just a temperature issue. I can trigger it without any temperature problem if I write a lot of data (>500 MB) to an externally mounted SD card (vfat) or external hard drive (ntfs). What does that imply?
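
To narrow this down, the trigger could be turned into a repeatable test. The sketch below writes a given amount of data in chunks, syncing and reporting after each one. Since nothing reaches syslog, the last progress line seen on the terminal (or over ssh from another machine) gives a rough idea of how far the write got before the freeze. `write_stress` is a hypothetical helper, and the chunk size is an arbitrary choice.

```python
# Sketch: chunked, synced write to reproduce a freeze under heavy writes.
import os

def write_stress(path, total_mb, chunk_mb=16, log=print):
    """Write total_mb of zero bytes to path, syncing and logging each chunk."""
    mb = 1024 * 1024
    chunk = bytes(chunk_mb * mb)
    written = 0
    with open(path, "wb") as fh:
        while written < total_mb:
            step = min(chunk_mb, total_mb - written)
            fh.write(chunk[: step * mb])
            fh.flush()
            os.fsync(fh.fileno())  # push data to the device before logging
            written += step
            log(f"{written} MB written")
    return written
```

Running it with the same size against the SD card, the external drive, and the internal disk would show whether the freeze follows the external devices and their filesystems or happens on any large write.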
All times are GMT