Gentoo Forums
Mysterious random system freezes

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Thu Oct 22, 2015 7:23 pm    Post subject: Mysterious random system freezes

Hi there,

For a while now I have been dealing with occasional and inexplicable freezes. By inexplicable I mean they occur during situations where I am not using a lot of resources, such as watching an .avi, which has never caused a problem before. When these freezes occur the only solution is a hard reset. The mouse doesn't work, I cannot get into a console, etc.

I recently upgraded my HDD, but I had these problems before that. I am running on an old x86 system with 4 gigs of RAM.

The results of swapon and cat /proc/meminfo are as follows:

Code:

# swapon
NAME      TYPE      SIZE USED PRIO
/dev/sda3 partition 512M   0B   -1
/swapfile file      3.7G   0B   -2

# cat /proc/meminfo
MemTotal:        3764316 kB
MemFree:         2672640 kB
MemAvailable:    3462860 kB
Buffers:          136404 kB
Cached:           638752 kB
SwapCached:            0 kB
Active:           501520 kB
Inactive:         493800 kB
Active(anon):     225364 kB
Inactive(anon):      716 kB
Active(file):     276156 kB
Inactive(file):   493084 kB
Unevictable:       13820 kB
Mlocked:           13820 kB
HighTotal:       2891224 kB
HighFree:        2176612 kB
LowTotal:         873092 kB
LowFree:          496028 kB
SwapTotal:       4430528 kB
SwapFree:        4430528 kB
Dirty:               132 kB
Writeback:             0 kB
AnonPages:        234012 kB
Mapped:           122420 kB
Shmem:              1256 kB
Slab:              54860 kB
SReclaimable:      46168 kB
SUnreclaim:         8692 kB
KernelStack:        1600 kB
PageTables:         1528 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6312684 kB
Committed_AS:     808416 kB
VmallocTotal:     122880 kB
VmallocUsed:       43800 kB
VmallocChunk:      69492 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       4096 kB
DirectMap4k:       49144 kB
DirectMap4M:      860160 kB


I'm happy to post anything else that might help, e.g., .config, dmesg output, etc.

Alex

Polyatomic
n00b

Joined: 18 May 2014
Posts: 36

Posted: Thu Oct 22, 2015 8:51 pm

Definitely not getting swap storms, as I see you have
Code:
MemFree:         2672640 kB


Is there a reason you're thinking swap is related? I think there
are bugs in various 4.1 kernels, if that _is_ what you're running.

I can see you have been around these forums a long time, so I
imagine you have lots of knowledge of Linux. All I can recommend
is a careful review of your .config and a rebuild.

Lockups sure are a pain in the ass.
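
The reasoning above (plenty of free memory and no swap in use, so no swap storm) can be checked mechanically. The following is a minimal sketch, not a diagnostic tool: the function names and the 10% threshold are assumptions chosen for illustration, and "swap in use" is only a rough hint of memory pressure.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (kB or plain counts)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip shell prompts, blank lines, anything without "key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(info):
    """Swap currently in use, in kB."""
    return info["SwapTotal"] - info["SwapFree"]


def looks_like_memory_pressure(info, min_available_fraction=0.10):
    """Rough, hypothetical heuristic: swap is in use, or little memory is available."""
    available = info.get("MemAvailable", info["MemFree"])
    low_memory = available < min_available_fraction * info["MemTotal"]
    return swap_used_kb(info) > 0 or low_memory
```

On a live system the input would be the contents of /proc/meminfo, read as text. With the figures posted above, swap used is 0 kB and the heuristic does not flag memory pressure.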

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Thu Oct 22, 2015 8:58 pm

Hi there,

I just posted everything memory-related; I have no idea what's causing the freezes. Is there a more recent kernel that doesn't suffer from these bugs? Also, I am running the 4.0.5 kernel.

Best,

Alex


davidm
Guru

Joined: 26 Apr 2009
Posts: 557
Location: US

Posted: Thu Oct 22, 2015 9:37 pm

Without some kind of log entry or a common pattern, it is very hard to track down this sort of thing. I take it REISUB/Magic SysRq (assuming you have it enabled) isn't working either? If it is working, there is a way to dump the logs, which might help. https://en.wikipedia.org/wiki/Magic_SysRq_key

It's always a good idea to rule out memory and heat problems with this sort of thing, so check your temperatures to see if anything is amiss and stress test your memory with something like memtest. Power supply problems are also common with this sort of thing, especially since I am assuming this is an older system. It also isn't unheard of for a GPU to fail and cause problems like this. I had this happen in the past with old hardware, where some bad capacitors on my old Nvidia GPU were causing freezes.
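
Whether the REISUB keys can do anything depends on the value in /proc/sys/kernel/sysrq: 0 disables SysRq, 1 enables every function, and larger values are a bitmask of allowed functions. The sketch below decodes that value using the bit assignments from the kernel's SysRq documentation. The function name is made up for illustration, and it only says what the kernel would permit, not whether a frozen machine will still respond.

```python
# Bit each REISUB key needs, per the kernel's SysRq documentation:
#   4 = keyboard control (unraw), 16 = sync, 32 = remount read-only,
#   64 = signalling processes (term, kill), 128 = reboot/poweroff.
REISUB_BITS = {"R": 4, "E": 64, "I": 64, "S": 16, "U": 32, "B": 128}


def allowed_reisub_keys(value):
    """Return the REISUB letters permitted by a given kernel.sysrq value."""
    if value == 0:
        return ""  # SysRq disabled completely
    if value == 1:
        return "REISUB"  # all functions enabled
    # Any other value is a bitmask; keep the keys whose bit is set.
    return "".join(key for key, bit in REISUB_BITS.items() if value & bit)
```

For example, a value of 176 (128 + 32 + 16) permits only S, U and B, so the E and I steps would be ignored. The current value can be read as an integer from /proc/sys/kernel/sysrq.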

mir3x
Guru

Joined: 02 Jun 2012
Posts: 455

Posted: Fri Oct 23, 2015 2:02 pm

If you are running some unusual programs in the background all the time (e.g. torrents, some brightness control, etc.), try disabling them for a week or so to narrow down the problem.

krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Fri Oct 23, 2015 2:36 pm

Things like that are hard to find if you look for them without a clue where to search.

But you can easily get a general idea of where you should search.

Get a LiveDVD (in this case an older one is even better; by old I don't mean as old as the dinosaurs, just not brand new).
Boot it, play your .avi on it in a loop....
And.... wait (until you get bored or get your answer)....

And your answer will be:
If you get the freeze -> your hardware is in trouble
If you get none -> look for software trouble.

It's of course not the solution to your problem, but you'll get an answer to "what should I look at".
It's as easy as that (well, not quite as easy as that, as you may have a special case, but even the famous Mr. Holmes quote doesn't mean Holmes didn't look at the possible first before trying to eliminate the impossible).

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Sun Oct 25, 2015 10:18 am

Hi all,

I doubt the graphics card is the problem as I recently bought and installed a GT610.

It doesn't only happen when playing .avis; I've had the problem emerge when I was away and not doing anything save ssh'ing into the box. It's happened during web browsing, too, so it's not anything too stressful for the system.

Is there a good way to rule out software?

Best,

Alex

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Sat Nov 28, 2015 4:07 pm

Hi all,

Many thanks for following up on this. Sorry it's taken me a while to get back. Work has kept me too busy to attend to the problem, which, incidentally, is still occurring.

One thing I did recall, and this gives you some idea of how old my computer is, was that there are some indicator lights on the back. Thus, when it froze up yesterday, I looked at the lights and something was wrong. According to a manual that I found online the pattern means:

Quote:

A possible expansion card failure has occurred.


Might this mean it's as simple as a card not being seated correctly or something not making proper contact?

Best,

Alex

Aquous
l33t

Joined: 08 Jan 2011
Posts: 700

Posted: Sun Nov 29, 2015 1:32 pm

Are you using nouveau? I have had random system freezes using the nouveau driver on my pc (running an nvidia 8600 GT)

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Sun Nov 29, 2015 2:09 pm

Aquous wrote:
Are you using nouveau? I have had random system freezes using the nouveau driver on my pc (running an nvidia 8600 GT)


I was using it briefly when I still had my old nvidia card. Since switching to a more recent card, I have gone back to the nvidia driver. Nouveau is not built into my kernel or built as a module. Anything else I should look for?

Best,

Alex

stayka
n00b

Joined: 19 Jun 2004
Posts: 16
Location: Oberhausen

Posted: Tue Dec 01, 2015 12:12 am

I experienced mysterious system hangups a while ago, too. At that time I was able to trace the problem back to a version of syslog-ng, and it disappeared when I went back to an older version of syslog-ng.
Update 3a: The current version 3.7.2 of syslog-ng doesn't give me any problems anymore.

Since yesterday, the problems have reappeared on my machine, though. When I emerged libreoffice, the load average virtually exploded to up to 35 (!!), which made the system practically unresponsive. Killing the make and emerge processes let the system recover again (you don't want to know how long it took me to get the kill command through XD)

Right now I'm doing a fresh world update in the hope that there was some problematic package in the one I did two days ago, while I'm monitoring the system load closely to see what program or package might suck up all the memory/CPU. (I shut down everything on the machine but X and the emerge, and even dug out my good old notebook to write here).

So far things still look okay after 33 of 60 packages (I was a bit surprised to see so many new packages after just two days, which is why I hope the problem was located in one of them).

Update: Okay. 59 of the packages have safely emerged so far. Now libreoffice is due as the last item. I'm curious whether it will go through now or whether I will run into problems again.

Update 2: This time the whole emerge went through without any problems or system freeze. I will continue to monitor my system closely to see whether I have any new instances of system overload/freeze and keep you posted.
Update 3b: So far everything looks fine now.

evoweiss
Veteran

Joined: 07 Sep 2003
Posts: 1678
Location: Edinburgh, UK

Posted: Tue Dec 01, 2015 8:37 am

The freezes I've experienced occur with no more than firefox running a single instance and without a lot of heavy video, etc. I've also experienced them when I was doing nothing more than being logged in via ssh and checking email via pine. So I suspect it's a different situation than what you're experiencing.

Best,

Alex


stayka
n00b

Joined: 19 Jun 2004
Posts: 16
Location: Oberhausen

Posted: Tue Dec 01, 2015 9:27 am

I'm not sure about this. I thought a bit about what I had running when the freeze appeared during the libreoffice compile - there I had a firefox running, too.

I'm currently running top to continuously monitor the system load, and it appears that the load goes up (though currently still at harmless levels) as soon as I use firefox to visit pages with a large number of scripts on them. I have the NoScript add-on active, but sometimes it is still necessary to allow scripts so that a site is usable, and it seems I get spikes when a site is pretty script-heavy and I have allowed some of the script stuff.

So I wonder right now if this freeze might actually be a problem with firefox in the first place (ATM I'm running Mozilla Firefox 38.4.0).

Could you mayhap take a look at whether something like this occurs on your system, too?

As for the ssh and pine thing... Hm. Dunno. Maybe it is the same problem, but it might also be something different. In medicine one says "you can always have lice *and* fleas"...

Update: I just noticed it again - when I visited a website with firefox where I had given some of the connected domains access via NoScript, the CPU load spiked and the load average shot up from the usual 0.20 or so to almost 2, with the firefox process hogging practically all of the CPU. After closing the tab with that site, the load immediately went down again. Unfortunately, when I visited the site once more, the problem didn't recur. But it seems this might be something one should monitor.
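
Watching top by hand leaves no record if the machine locks up. As a minimal sketch of the monitoring idea, the script below appends the load averages to a file and forces each line to disk, so the last samples before a hard reset should still be readable afterwards. The function names, the log format and the 5-second interval are assumptions for illustration. It records only the load averages, not which process is responsible for them.

```python
import os
import time


def format_sample(timestamp, load1, load5, load15):
    """Format one log line: local time followed by the 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return f"{stamp} load {load1:.2f} {load5:.2f} {load15:.2f}\n"


def log_load(path, interval=5.0, samples=None):
    """Append load averages to `path`, syncing each line to disk.

    samples=None loops until interrupted; a number limits the run.
    """
    count = 0
    with open(path, "a") as log:
        while samples is None or count < samples:
            log.write(format_sample(time.time(), *os.getloadavg()))
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before the next sleep
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Calling log_load with a path on a local disk and leaving it running in the background gives a timeline to compare against the moment of the next freeze.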

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3146

Posted: Tue Dec 01, 2015 9:16 pm

I have noticed similar freezes after I switched to kernel 4. I can't recall it happening with kernel 3, but some other reasons make me a bit unhappy about going back.

At the beginning I blamed it on the graphics drivers; however, it happens with radeon as well as with fglrx. It froze while building a kernel, and it froze a few hours after building libreoffice when I was away. I can't see any pattern regarding the trigger. There is no warning. It always looks exactly the same after it happens, though:
It's not a kernel panic (the screen just freezes and the sound keeps looping its ring buffer for a few seconds)
It does not react to SysRq (and yes, I have it enabled in the kernel)
It does not react to USB events like plugging in devices (basically the power line is up and that's it)
It does not reply to network traffic
Of course nothing is logged in any way, so there is nothing to grep through looking for hints.
And the next freeze can happen in 5 minutes as well as in 5 weeks.

I've been thinking about migrating to a virtual machine: removing everything but the SATA driver, DM and KVM from the host and providing the other devices to the guest via PCI passthrough, to (hopefully) contain the bug in a box I could look at from outside after it breaks. What discourages me is the necessity of doing the migration in a single step, and the fact that I have no idea what to look for should the VM hang with the host remaining operational. :roll:

A few more people and we could start looking for common factors.

Quote:
if this freeze might actually be a problem with firefox in the first place (ATM I'm running Mozilla Firefox 38.4.0).
A userspace application killing more than just itself (and perhaps X) would be a serious design flaw, and it would be quite exploitable. Not very likely, since we haven't heard of a proof-of-concept virus from some_well_known_antivirus_Ka...mpany (you know, those guys who patched the kernel in their lab to introduce a bug they later used to escalate permissions). Userland is far enough away that it should simply not be able to take down the kernel itself.
However, since you mentioned it: I'm using SeaMonkey 2.38.
Back to top
View user's profile Send private message
stayka
n00b
n00b


Joined: 19 Jun 2004
Posts: 16
Location: Oberhausen

PostPosted: Tue Dec 01, 2015 10:16 pm    Post subject: Reply with quote

Well, the freeze stuff is definitely not kernel 4 related as I'm still using kernel version 3.12.13 (yeah, I'm lazy). In my case it first appeared when I installed syslog-ng 3.6.2. After I recovered the system from the freeze, I went back to syslog-ng version 3.4.8 and everything was fine until 2 days ago. That was when my system froze during the libreoffice compile despite me having excluded the newer syslog-ng versions, so right now it seems as if it has to be something that was already in kernel 3 and it doesn't really seem that syslog-ng is the culprit.

As for the firefox thing... Well, I always thought that the system was safe from user programs, so I'm currently very puzzled. It just happens that firefox once in a while (unfortunately not reliably reproducible) seems to suck up quite a lot of CPU capacity on my machine, but this seems to be a rather recent occurrence and I hope that the next version will behave better again.

All times are GMT