Machine is freezing

ScR4tCh
n00b
Posts: 10
Joined: Tue Jul 14, 2009 2:47 am

Post by ScR4tCh » Tue Jul 14, 2009 3:13 am

Hi Folks.
I have not been able to find a solution for my current problem.
My new desktop machine (AMD Phenom II 955 4x, MSI 790GX-G58,GB RAM, ...) freezes at irregular intervals, and sometimes it resets on its own.
I am using gentoo-sources 2.6.30-r2; I haven't tried vanilla yet.
By freezing I mean complete lock-ups: no console, no X, not even SysRq. Also, /var/log/messages and dmesg don't show kernel faults or similarly severe entries.
I am relatively sure that it is not a temperature issue. I checked the temperature after each freeze (after resetting). Normally the CPU and system temperatures need some time to cool down, but everything seems to be within the normal range (~45°C).
I've tried several kernel configurations as well as genkernel to rule out configuration mistakes of mine, but nothing seems to work.
Even a BIOS update didn't change anything. All components are completely new. I don't think it is a PSU issue (because these crashes seem to happen at irregular intervals). Broken memory could be a reason too; I'm running memtest right now.
My question is whether any of you have had similar problems with multicore AMD systems (this is my first multicore system),
or whether there are any hints beyond the Gentoo docs and the well-known settings (such as processor type, CFLAGS, kernel options).

Thanks in advance!
judepereira
Apprentice
Posts: 179
Joined: Sat Jan 19, 2008 4:25 am
Location: Portage, yes, somewhere out there

Post by judepereira » Tue Jul 14, 2009 4:11 am

If it freezes X, just try to use the vesa driver and see if it happens again. If not, then the problem is probably your video driver.
Jude Pereira
(http://judepereira.com)

Post by ScR4tCh » Tue Jul 14, 2009 9:02 am

Hi,
I already considered it, because I do have an nvidia card installed. But wouldn't just X stop working, or wouldn't my kernel log some fault, if only the video card were causing the crashes? As I said, even SysRq has no effect, nor does anything else (no console, no ssh, nothing).
In the meantime I kept memtest running (4 passes so far) and there have been no errors.
But thanks anyway, I'll give it a try.
pappy_mcfae
Watchman
Posts: 5999
Joined: Thu Dec 27, 2007 10:51 pm
Location: Pomona, California.

Post by pappy_mcfae » Tue Jul 14, 2009 9:35 am

Is it a laptop or desktop? Do you have another OS installed? Does it work with that OS? Does your machine work with a different kernel version? Have you opened it up to see if the CPU heatsink might be clogged with dust, or whether or not the fan is running? If not, that would be a really good place to start.

Blessed be!
Pappy
This space left intentionally blank, except for these ASCII symbols.

Post by ScR4tCh » Tue Jul 14, 2009 10:01 am

It's a desktop machine, no other OS, and I haven't tried another kernel version yet (but the latest vanilla will be my next choice to test with). I have opened it up; as I said, all components are new and I built it myself: no dust, fan operational (I already eliminated temperature as the cause).
Could it be a chipset issue? I didn't find much about running the AMD 790GX under Linux.

thx
krinn
Watchman
Posts: 7476
Joined: Fri May 02, 2003 6:14 am

Post by krinn » Tue Jul 14, 2009 10:08 am

ScR4tCh wrote: (I already eliminated temperature as cause).
How did you do that? Just because some BIOS reports 45°?
You should underclock your CPU (if you can; on today's computers that should be possible).

Post by ScR4tCh » Tue Jul 14, 2009 10:16 am

Well, yes and no ;).
I also already clocked the speed down to 800 MHz (using the powersave governor), with the same effect. But I'll underclock it and test (after my other tests).
But you are right, the BIOS might be lying about the temperature ... sadly lm_sensors does not yet support K10, so I can't validate it.
energyman76b
Advocate
Posts: 2048
Joined: Wed Mar 26, 2003 11:31 am
Location: Germany

Post by energyman76b » Tue Jul 14, 2009 2:38 pm

The internal diode is broken anyway; that is why all mainboards have a standard temperature sensor. Install lm_sensors and run sensors-detect.

But seriously, a sudden reboot sounds like a triple fault. Freezing is either a dead IRQ handler or a kernel panic of the worst kind.

These are usually symptoms of:
bad RAM
bad PSU

Sometimes increasing the RAM voltage a bit can help (I have a RAM stick that is unstable at 1.80V and rock solid at 1.85V, for example).

Sometimes simply removing the RAM sticks and putting them back helps. Increase the RAM voltage a bit. Problem persists? Try different RAM, then a different PSU.
Study finds stunning lack of racial, gender, and economic diversity among middle-class white males

I identify as a dirty penismensch.

Post by ScR4tCh » Wed Jul 15, 2009 1:20 am

Hi, here's the current "state". I switched to kernel 2.6.29-r5 (genkernel) and the (Gentoo) stable nvidia drivers. No change, the system froze anyway.
I re-checked the memory manufacturer's recommendation for the voltage (between 1.7 and 1.8 volts), and changed it from auto to 1.71V.
The system has been up for 7.5 hours now at nearly the same workload as before, with no faulty behaviour so far ... hope dies last ;).
Funnily enough, "free" shows much less caching than before. (I forgot to mention that I had heavy memory consumption and a quickly rising "cached" value. I have no idea whether this was a kernel issue or memory related; my other systems, 2.6.29x Gentoo and 2.6.2x Kubuntu, both amd64 as well, never showed such behaviour.)
As for temperature sensors, lm_sensors (sensors-detect) marks the CPU temperature sensor as "to be implemented", so there is no way to totally rule out overheating. I found a kernel patch that activates the support, but at this time I'm not willing to risk more instability.

I'm curious whether the machine will be up tomorrow morning or frozen ... Thanks for all replies so far, I hope I can stop bugging you with this problem soon ;)
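
For anyone who wants to track the "cached" value over time instead of eyeballing "free", here is a minimal Python sketch. It assumes a Linux /proc/meminfo with the usual "Key: value kB" lines; the function names parse_meminfo and cached_fraction are made up for this example and are not part of any tool mentioned in the thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def cached_fraction(info):
    """Return Cached / MemTotal, or 0.0 if MemTotal is missing."""
    total = info.get("MemTotal", 0)
    return info.get("Cached", 0) / total if total else 0.0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        info = {}  # not on Linux, or /proc not mounted
    if info:
        print(f"Cached: {info.get('Cached', 0)} kB "
              f"({cached_fraction(info):.1%} of MemTotal)")
```

Run from cron or a loop, this gives a record of how fast the page cache grows before a freeze, which may or may not turn out to be related.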

Post by ScR4tCh » Wed Jul 15, 2009 6:59 am

Shhhh.... It froze again, ~2.5 h after my post, it seems. I have now increased the voltage to 1.76V (the maximum) and will give it another try. Next I'm going to check the PSU. Maybe I'll just swap it and have a look ...

Post by energyman76b » Wed Jul 15, 2009 7:09 pm

which Voltage? CPU?

Don't touch that! Only raise memory voltage - and only a small amount!
aricart
n00b
Posts: 16
Joined: Mon Jun 15, 2009 12:12 pm

Post by aricart » Wed Jul 15, 2009 7:39 pm

The problem you are experiencing may indeed be a thermal issue. This goes double if the fans aren't being activated to cool things down for some reason. You may want to use those patches to gain thermal support, and then work from there.

Post by ScR4tCh » Wed Jul 15, 2009 7:41 pm

Hell no, I'd never touch the CPU voltage. I'm not a total hardware noob ;). What I did was change the memory voltage from "Auto" to 1.71V first; after the next crash (approx. 10 hours later) I raised it to the next possible level, 1.76V (according to the manufacturer, these modules work between 1.7V and 1.8V, so I exhausted all possibilities). 1.76V, however, led to instability and I had a freeze after several minutes.
Now with 1.71V it is running again.
I also rechecked that all power connectors are seated tightly, pulled out the modules, and put them back in a different order.

The system has been running for ~9 hours so far ... I also found a k10temp module and am now able to see my CPU temperature. It's OK, nearly constant at 42°C, so I finally guess that there is no severe temperature problem (the box itself is relatively cool inside).
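
Since a hard freeze leaves nothing in the logs, it can help to write the temperature to a file at regular intervals and look at the last entry after the reset. Below is a rough Python sketch of reading the sensors directly. It assumes the hwmon sysfs layout of current kernels (/sys/class/hwmon/hwmon*/name and temp*_input in millidegrees Celsius); on a 2.6.x kernel the attribute files may sit in a different place, so treat the paths as an assumption. The function name read_temps is invented for this example.

```python
import glob
import os
import time


def read_temps(root="/sys/class/hwmon"):
    """Return {"chip/tempN": degrees_celsius} for every readable temp*_input file."""
    readings = {}
    for chip_dir in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        try:
            with open(os.path.join(chip_dir, "name")) as f:
                chip = f.read().strip()  # e.g. "k10temp"
        except OSError:
            continue  # no name file here, skip this entry
        for temp_path in sorted(glob.glob(os.path.join(chip_dir, "temp*_input"))):
            try:
                with open(temp_path) as f:
                    millideg = int(f.read().strip())
            except (OSError, ValueError):
                continue  # sensor unreadable or reporting garbage
            label = os.path.basename(temp_path)[: -len("_input")]
            readings[f"{chip}/{label}"] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    for sensor, celsius in sorted(read_temps().items()):
        print(f"{stamp} {sensor}: {celsius:.1f} C")
```

Appending the output to a file once a minute (and syncing it to disk) would show whether the temperature climbs shortly before a lock-up.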
cheater512
Tux's lil' helper
Posts: 145
Joined: Mon Nov 03, 2003 8:37 am
Location: Australia

Post by cheater512 » Mon Jul 20, 2009 2:43 am

I'll throw my 2c in. :)

My brand new Phenom X3 also freezes, but very rarely. Perhaps once a week?
Same symptoms. Nothing in the logs. I did run memtest and it came back clean.

With a Kubuntu install disc I was lucky to have a couple of hours. Not sure what was with that.

Temperatures are all around the 40 - 50C mark when idle.
They can start going to just over 50C when under load.
I also have a heater in the room sometimes (it's winter in Aus), but at ground level the air is still cool.

It's not a big deal for me. Just a quirk.

Post by ScR4tCh » Mon Jul 20, 2009 10:50 am

OK, the machine keeps freezing, but I think that I've probably found the reason (after reading and reading).
It may be an nvidia problem. I got several Xid errors in /var/log/messages and found articles about an "old" bug and possible problems using the nvidia drivers on multicore machines.
The first thing I did was to turn composite off, and the machine kept running for 24 hours. After playing nexuiz for a while, the system crashed again (after leaving the game).
So I'll try to stay away from OpenGL applications for a while to verify it.

@cheater512:
Could it be the same problem with your machine? Are you running an nvidia card with the closed-source drivers? Did you encounter any "Xid" messages after or while running OpenGL-capable apps?
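
To check a log for such messages without scrolling through it by hand, something like the following Python sketch could be used. It assumes the driver prefixes these kernel messages with "NVRM: Xid" and that the syslog lives in /var/log/messages; both may differ per driver version and logger setup, and find_xid_lines is just a name chosen for this example.

```python
import re

# Assumed marker for the nvidia driver's Xid kernel messages.
XID_PATTERN = re.compile(r"NVRM: Xid")


def find_xid_lines(log_text):
    """Return every line of log_text that contains an Xid message."""
    return [line for line in log_text.splitlines() if XID_PATTERN.search(line)]


if __name__ == "__main__":
    path = "/var/log/messages"  # adjust for your syslog configuration
    try:
        with open(path, errors="replace") as f:
            hits = find_xid_lines(f.read())
    except OSError:
        hits = []  # file missing or not readable without root
    for line in hits:
        print(line)
```

Comparing the timestamps of the matching lines with the times of the freezes should show whether the Xid errors actually precede the lock-ups or just happen independently.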

Post by cheater512 » Mon Jul 20, 2009 1:25 pm

Nope, that can't be the same for mine.
My box has an AMD chipset (ATI is everywhere in my lspci) with an ATI Radeon HD4830.
OpenGL works fine (a bit slowly, with the open source drivers) with no errors.

It could be that your nVidia stuff is a symptom rather than the cause, or that your problem is completely unrelated to mine.

Post by ScR4tCh » Mon Jul 20, 2009 7:24 pm

Okay ... so the search continues. So far it's running quite well. The next time it freezes I'm going to pull out the nvidia board and try my luck with the onboard ATI card. I'm really on my last nerve with this problem ...

Post by pappy_mcfae » Tue Jul 21, 2009 7:02 am

Did you turn off the internal ATI card in the BIOS?

Blessed be!
Pappy
360soso
n00b
Posts: 1
Joined: Tue Jul 21, 2009 7:06 am

Post by 360soso » Tue Jul 21, 2009 7:12 am

The limitation of this method:
1. It is impossible for the manufacturers to add all the controllers you need to the database since there are too many of them, not to mention the new controllers keep coming out; users will find there are just limited controllers supported; and once the target controller is not included in the database, users can’t do nothing but give it up.
2. The controller emulator matches the controller by mainly the controller model, which may lead to a false match because even controller chips with the same model number may contain different contents, especially when they are not manufactured by the same factory at the same time.

from:http://www.xlycn.com

Post by ScR4tCh » Tue Jul 21, 2009 9:31 am

@pappy_mcfae
Well, that is the problem: there is no way to turn the card off, I can only set the "first" graphics adapter.

@360soso
I'm not really sure what you are talking about, sorry ;). Which database?

In the meantime I tried my luck with manually setting the memory timings to the values given by the manufacturer. It seems to run stable ... again.
I did find some posts in other forums describing similar problems with 790GX chipsets and Phenom II multicore processors. One solution, which worked for that user, was to use only RAM banks 3 and 4, but this makes no sense to me at all.
There are also some older news items describing a lockup bug in AMD processors (but for older Phenom cores).
The Xid errors, however, seem to be an nvidia-only problem (I also found and verified several bugs, for instance running SWT applications with a tray icon, which "destroy" plasma and lead to Xid output in messages).

Uptime ~12h ... Again, I'm very curious how long it'll stay alive today.


Have a nice day

Post by pappy_mcfae » Tue Jul 21, 2009 10:48 pm

ScR4tCh wrote:@pappy_mcfae
Well, that is the problem, there is no possibility to turn the card off, i just can set the "first" Graphics Adapter.
Yeah, that could well be the issue. I'm glad my BIOS allowed me to turn off the Intel GPU when I installed my nvidia. That eliminated any possibility of such issues. If the issues remain, try using the onboard video and see if that fixes things. If not, I'll go out on a limb and say you probably have mobo issues.

Blessed be!
Pappy