Gentoo Forums
[SOLVED] Hard freeze upon KDE boot
cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Sun Aug 17, 2008 10:21 pm    Post subject: [SOLVED] Hard freeze upon KDE boot

During the last 48 hours my AMD64 machine (kernel 2.6.25-gentoo-r7) froze twice, the last time as soon as KDE had finished starting. The mouse and keyboard froze and I couldn't even ssh to it from another box. It acts as the gateway for the other boxes at home, and their Internet connection failed too, so the box's whole networking subsystem was down. When KDE starts, emacs 23.0.60.1, firefox 2.0.0.16, and thunderbird 2.0.0.16 start automatically. There have been no hardware changes. This is the first time this has happened since the box was built three years ago.

I couldn't find anything in /var/log/messages which would give me a hint as to what happened. How can I debug this and, better yet, prevent it from happening again?

[EDIT] After 8 hours, memtest86 produced no errors so I finally opened the box. The BIOS reported the Northbridge fan was at 0 RPM and, sure enough, it wasn't moving. The MSI Forum reported that was enough for random crashes so I assume that's the culprit and ordered a replacement fan suggested on that forum. They claim that I can only expect about two years from any such fan. Many people order extras so they can swap them in on the spot.

Thanks for all the suggestions.[/EDIT]
[EDIT2] I installed a new Northbridge fan and it appears to be spinning. (Unfortunately it doesn't have a third wire, so the RPM isn't reported by the BIOS.) But when I booted Gentoo Linux AMD64, it froze within minutes. The box is dual boot, so I'm now running Windows XP continuously. If it runs without a crash for three or so days, I'll assume that the hardware is OK.

Is the problem then a bad driver under Linux? If so, what's a good way of finding it so I can replace it or, perhaps, not use that piece of hardware?

I'd love to get my box back again, so any pointers are gratefully received.[/EDIT2]

[SOLVED] The frequency of freezing increased both under Linux and Windows. I built a new, spiffy Intel Core 2 Quad box with a new SATA hard drive. I copied over world, /home, and a bunch of config files from my old IDE hard drive, which still functions in the new box. After testing RAM, etc., in the old box, I suspect I had a failed Northbridge. Thanks for the hardware debugging pointers. [/SOLVED]


Last edited by cfgauss on Mon Oct 13, 2008 9:15 pm; edited 5 times in total

bunder
Bodhisattva

Joined: 10 Apr 2004
Posts: 5150
Location: Hamilton, Ontario

Posted: Mon Aug 18, 2008 1:06 am

while being logged out, rm'ing /home/youruser/.kde/share/config/session/* should prevent kde from loading your apps when you log in. that won't solve the problem, but should tell you whether it's firefox/other that's causing the lockup.
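
A slightly safer variant of the rm above is to move the saved-session files aside so they can be put back later. This is only a sketch: the function name stash_kde_sessions and the backup-directory naming are made up for illustration, and the path in the example is simply the one quoted above. Run it while logged out of KDE, for example from a text console.

```shell
#!/bin/sh
# Hypothetical helper: move every file in a KDE session directory into a
# timestamped backup directory next to it, and print the backup path.
stash_kde_sessions() {
    session_dir=$1
    if [ ! -d "$session_dir" ]; then
        echo "no such directory: $session_dir" >&2
        return 1
    fi
    backup_dir="${session_dir}.bak.$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    for f in "$session_dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$f" ] || continue
        mv -- "$f" "$backup_dir"/ || return 1
    done
    echo "$backup_dir"
}

# Example (path as quoted in the post, with your own user name):
# stash_kde_sessions "$HOME/.kde/share/config/session"
```

To undo it, move the files back from the directory the function prints.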

some preliminary questions:
recent upgrade (kernel and software)?
ati or nvidia?
any services failing on boot?
maybe an emerge --info, lspci, dmesg?
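
The requested output can be gathered into a single file for posting with something like the following. It is a sketch only: collect_report and the /tmp/freeze-report.txt path are names invented here, and the commands in the example are just the ones asked for in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run each command string given after the report
# file name and write its output, under a header line, to that file.
collect_report() {
    report=$1
    shift
    : > "$report"                     # start with an empty report
    for cmd in "$@"; do
        printf '### %s\n' "$cmd" >> "$report"
        # A missing or failing tool should not abort the whole report.
        sh -c "$cmd" >> "$report" 2>&1 ||
            printf '(command failed: %s)\n' "$cmd" >> "$report"
        printf '\n' >> "$report"
    done
}

# Example (run as root so lspci and dmesg give complete output):
# collect_report /tmp/freeze-report.txt 'uname -a' 'emerge --info' lspci dmesg lsmod
```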

cheers

Hu
Veteran

Joined: 06 Mar 2007
Posts: 2595

Posted: Mon Aug 18, 2008 8:10 pm

Are you using the open source drivers for your video card or the proprietary ones from your card vendor? While gathering the information bunder wanted, please also post the output of nl /etc/X11/xorg.conf.

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Wed Aug 20, 2008 12:06 am

bunder wrote:
some preliminary questions:
recent upgrade (kernel and software)?
ati or nvidia?
any services failing on boot?
maybe an emerge --info, lspci, dmesg?

I "emerge world" on weekends and it crashed both before and after last weekend's emerge. I'm using nvidia-drivers 173.14.09. dmesg doesn't list any boot failures. Is there another way to reconstruct them?

Code:
# uname -a
Linux gauss 2.6.25-gentoo-r7 #1 PREEMPT Sat Jul 26 11:11:20 CDT 2008 x86_64 AMD Athlon(tm) 64 Processor 3800+ AuthenticAMD GNU/Linux

# emerge --info
Portage 2.1.4.4 (default-linux/amd64/2006.1/desktop, gcc-4.1.2, glibc-2.6.1-r0, 2.6.25-gentoo-r7 x86_64)
=================================================================
System uname: 2.6.25-gentoo-r7 x86_64 AMD Athlon(tm) 64 Processor 3800+
Timestamp of tree: Sat, 16 Aug 2008 18:00:02 +0000
app-shells/bash:     3.2_p33
dev-java/java-config: 1.3.7, 2.1.6
dev-lang/python:     2.5.2-r6
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.61-r2
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.23-r3
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=k8 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-O2 -march=k8 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks fixpackages metadata-transfer sandbox sfperms strict unmerge-orphans"
GENTOO_MIRRORS="ftp://mirror.datapipe.net/gentoo"
LINGUAS="en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X alsa also amd64 apache2 arts berkdb bitmap-fonts cairo cdr cli cracklib crypt cups dbus dri dvd dvdr eds emboss encode esd fam firefox foomaticdb fortran gdbm gif gpm gstreamer gtk gtk2 hal iconv imap ipv6 isdnlog jpeg kde kdeenablefinal ldap mad midi mikmod mp3 mpeg mudflap ncurses nls nptl nptlonly ogg opengl openmp oss pam pcre perl png ppds pppd python qt qt3 qt4 quicktime readline reflection sdl session spell spl ssl symlink tcpd tk truetype truetype-fonts type1-fonts unicode vorbis xml xorg xv zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" CAMERAS="canon" ELIBC="glibc" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en" USERLAND="GNU" VIDEO_CARDS="nvidia nv vesa fbdev vga"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY

# lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
01:0c.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 15)
05:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce 6600] (rev a2)

It crashed twice today but not right at KDE boot. The first time I noted that xload was normal (virtually zero) and ksensors indicated a normal CPU temp (36C). I reverted to an older kernel, 2.6.25-r6, and then it crashed when I loaded a Firefox page with flash. I'm back to the newer 2.6.25-r7 kernel and using mozilla-firefox-bin, the 32 bit Firefox, to see if that slows down the crash rate.
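
Checking ksensors by eye only shows the temperature while someone is watching. One way to keep a record right up to the moment of a freeze is to append readings to a file at short intervals and flush them to disk each time. The following is a rough sketch, assuming the sensors command from lm_sensors is installed and configured; log_snapshot, the log path, and the 30-second interval are arbitrary choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamp and the output of a command to
# a log file, then flush it to disk so the entry survives a hard freeze.
log_snapshot() {
    logfile=$1
    shift
    {
        date '+%Y-%m-%d %H:%M:%S'
        "$@" 2>&1 || echo "(command failed)"
        echo
    } >> "$logfile"
    sync
}

# Example: record lm_sensors readings every 30 seconds (assumed setup).
# while :; do log_snapshot /var/log/freeze-sensors.log sensors; sleep 30; done
```

After the next lockup, the last entry in the log shows what the sensors reported shortly before the box stopped responding.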

Thanks for your help.

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Wed Aug 20, 2008 12:12 am

Hu wrote:
Are you using the open source drivers for your video card or the proprietary ones from your card vendor? While gathering the information bunder wanted, please also post the output of nl /etc/X11/xorg.conf.

Proprietary nvidia-drivers 173.14.09.

Code:
$ confcat /etc/X11/xorg.conf | nl
     1  Section "Module"
     2      Load        "dbe"
     3      SubSection  "extmod"
     4        Option    "omit xfree86-dga"
     5      EndSubSection
     6      Load        "type1"
     7      Load        "freetype"
     8      Load       "glx"
     9  EndSection
    10  Section "Files"
    11      FontPath   "/usr/share/fonts/misc:unscaled"
    12      FontPath   "/usr/share/fonts/Type1"
    13      FontPath   "/usr/share/fonts/TTF"
    14      FontPath   "/usr/share/fonts/corefonts"
    15      FontPath   "/usr/share/fonts/freefont"
    16      FontPath   "/usr/share/fonts/sharefonts"
    17      FontPath   "/usr/share/fonts/terminus"
    18      FontPath   "/usr/share/fonts/ttf-bitstream-vera"
    19      FontPath   "/usr/share/fonts/unifont"
    20      FontPath   "/usr/share/fonts/100dpi/"
    21      FontPath   "/usr/share/fonts/75dpi/"
    22      FontPath   "/usr/share/fonts/artwiz/"
    23  EndSection
    24  Section "ServerFlags"
    25  EndSection
    26  Section "InputDevice"
    27      Identifier  "Keyboard1"
    28      Driver      "kbd"
    29      Option "AutoRepeat" "500 30"
    30      Option "XkbModel"   "pc104"
    31      Option "XkbLayout"  "us"
    32  EndSection
    33  Section "InputDevice"
    34      Identifier  "Mouse1"
    35      Driver      "mouse"
    36      Option "Protocol"    "ExplorerPS/2"
    37      Option "Device"      "/dev/input/mice"
    38      Option "Buttons"     "7"
    39      Option "ZAxisMapping" "4 5"
    40  EndSection
    41  Section "Monitor"
    42      Identifier  "ViewSonic VP201B"
    43      HorizSync   30-92
    44      VertRefresh 50-85
    45  EndSection
    46  Section "Device"
    47      Identifier  "Standard VGA"
    48      VendorName  "Unknown"
    49      BoardName   "Unknown"
    50      Driver     "vga"
    51  EndSection
    52  Section "Device"
    53      Identifier  "Gigabyte Geforce 6600"
    54      Driver      "nvidia"
    55      VideoRam    131072
    56      Option      "NvAGP" "0"
    57      Option      "NoLogo" "true"
    58      Option      "NoRenderExtension" "false"
    59      Option      "RenderAccel" "true"
    60      Option      "AllowGLXWithComposit" "true"
    61      Option      "CursorShadow" "true"
    62      Option      "CursorShadowAlpha" "32"
    63      Option      "BackingStore" "true"
    64      Option      "UseEdidDpi" "FALSE"
    65      Option      "DPI" "96 x 96"
    66  EndSection
    67  Section "Screen"
    68      Identifier  "Screen 1"
    69      Device      "Gigabyte Geforce 6600"
    70      Monitor     "ViewSonic VP201B"
    71      DefaultDepth 24
    72      Subsection "Display"
    73          Depth       8
    74          Modes       "1600x1200" "1280x1024" "1024x768" "800x600" "640x480"
    75          ViewPort    0 0
    76      EndSubsection
    77      Subsection "Display"
    78          Depth       16
    79          Modes       "1600x1200" "1280x1024" "1024x768" "800x600" "640x480"
    80          ViewPort    0 0
    81      EndSubsection
    82      Subsection "Display"
    83          Depth       24
    84          Modes       "1600x1200" "1280x1024" "1024x768" "800x600" "640x480"
    85          ViewPort    0 0
    86      EndSubsection
    87  EndSection
    88  Section "ServerLayout"
    89      Identifier  "Simple Layout"
    90      Screen "Screen 1"
    91      InputDevice "Mouse1" "CorePointer"
    92      InputDevice "Keyboard1" "CoreKeyboard"
    93  EndSection

Thanks for your help.

DirtyHairy
Guru

Joined: 03 Jul 2006
Posts: 340
Location: Würzburg, Deutschland

Posted: Wed Aug 20, 2008 3:46 am

I wouldn't start searching for the cause in userland (à la firefox-bin). Userland code that doesn't access hardware can't cause deadlocks without kernelspace bugs, and unless you also upgraded your kernel, I wouldn't suspect that to be the cause either. I rather suspect the cause is related to X and the nvidia drivers. Try the open source nv or vesa drivers and see if the crashes disappear. Do you use any other external (not in-tree) kernel modules? What's the output of lsmod?
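
Switching the driver for a test amounts to changing the Driver line in the Device section of xorg.conf. A minimal sketch of doing that without touching the original file follows; swap_x_driver and the .nvidia backup name are invented for this example, and it assumes the Driver line is written with double quotes as in the usual xorg.conf layout.

```shell
#!/bin/sh
# Hypothetical helper: print an xorg.conf with one Driver value swapped
# for another. It reads the file named in $1 and writes to stdout, so the
# file it reads is never modified.
swap_x_driver() {
    conf=$1
    from=$2
    to=$3
    sed "s/^\([[:space:]]*Driver[[:space:]]\{1,\}\)\"$from\"/\1\"$to\"/" "$conf"
}

# Example (as root, keeping the original for switching back later):
# cp /etc/X11/xorg.conf /etc/X11/xorg.conf.nvidia
# swap_x_driver /etc/X11/xorg.conf.nvidia nvidia nv > /etc/X11/xorg.conf
```

Options that only the nvidia driver understands will probably just be reported as unused in the X log, though I have not checked that for every option. On Gentoo the GLX implementation is selected separately with eselect opengl, so that may need switching as well.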

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Thu Aug 21, 2008 9:36 am

DirtyHairy wrote:
I wouldn't start searching for the cause in userland (à la firefox-bin). Userland code that doesn't access hardware can't cause deadlocks without kernelspace bugs, and unless you also upgraded your kernel, I wouldn't suspect that to be the cause either. I rather suspect the cause is related to X and the nvidia drivers. Try the open source nv or vesa drivers and see if the crashes disappear. Do you use any other external (not in-tree) kernel modules? What's the output of lsmod?

Code:
# module-rebuild list
** Packages which I will emerge are:
        =media-video/gspcav1-20071224
        =app-emulation/vmware-modules-1.0.0.20
        =app-emulation/virtualbox-modules-1.5.6
        =x11-drivers/nvidia-drivers-173.14.09

# lsmod
Module                  Size  Used by
vmnet                  40448  3
vmmon                 989484  0
vmblock                13056  3
ip6table_filter         3072  1
ip6_tables             18448  1 ip6table_filter
iptable_raw             2688  0
xt_comment              2048  0
xt_owner                3136  0
xt_iprange              2816  0
xt_multiport            3584  8
ipt_ULOG                9160  0
ipt_TTL                 2304  0
ipt_ttl                 1984  0
ipt_REJECT              3776  4
ipt_recent              9056  0
ipt_MASQUERADE          3328  1
ipt_LOG                 6336  6
ipt_ECN                 2944  0
ipt_ecn                 2304  0
ipt_ah                  1984  0
ipt_addrtype            2944  0
nf_nat_sip              4480  0
nf_nat_irc              2624  0
nf_nat_h323             7488  0
nf_nat_ftp              3456  0
nf_conntrack_sip        9236  1 nf_nat_sip
nf_conntrack_proto_sctp     8012  0
nf_conntrack_netbios_ns     3072  0
nf_conntrack_irc        6432  1 nf_nat_irc
nf_conntrack_h323      54384  1 nf_nat_h323
nf_conntrack_ftp        8424  1 nf_nat_ftp
xt_tcpmss               2496  0
xt_pkttype              2176  4
xt_NFQUEUE              2176  0
xt_NFLOG                2240  0
xt_MARK                 3456  0
xt_mark                 2816  0
xt_mac                  2112  0
xt_limit                3136  0
xt_length               2176  0
xt_helper               2688  0
xt_dccp                 3464  0
xt_CLASSIFY             2048  0
xt_tcpudp               3584  22
xt_state                2496  15
iptable_nat             6096  1
nf_nat                 18256  6 ipt_MASQUERADE,nf_nat_sip,nf_nat_irc,nf_nat_h323,nf_nat_ftp,iptable_nat
nf_conntrack_ipv4      14544  18 iptable_nat,nf_nat
nf_conntrack           56256  16 ipt_MASQUERADE,nf_nat_sip,nf_nat_irc,nf_nat_h323,nf_nat_ftp,nf_conntrack_sip,nf_conntrack_proto_sctp,nf_conntrack_netbios_ns,nf_conntrack_irc,nf_conntrack_h323,nf_conntrack_ftp,xt_helper,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
iptable_mangle          3136  1
nfnetlink               4296  0
iptable_filter          3264  1
ip_tables              17104  4 iptable_raw,iptable_nat,iptable_mangle,iptable_filter
x_tables               19272  32 ip6_tables,xt_comment,xt_owner,xt_iprange,xt_multiport,ipt_ULOG,ipt_TTL,ipt_ttl,ipt_REJECT,ipt_recent,ipt_MASQUERADE,ipt_LOG,ipt_ECN,ipt_ecn,ipt_ah,ipt_addrtype,xt_tcpmss,xt_pkttype,xt_NFQUEUE,xt_NFLOG,xt_MARK,xt_mark,xt_mac,xt_limit,xt_length,xt_helper,xt_dccp,xt_CLASSIFY,xt_tcpudp,xt_state,iptable_nat,ip_tables
w83627hf               30620  0
hwmon_vid               3520  1 w83627hf
snd_pcm_oss            39584  0
snd_mixer_oss          16576  1 snd_pcm_oss
snd_seq_dummy           3332  0
snd_seq_oss            32640  0
snd_seq_midi_event      7680  1 snd_seq_oss
snd_seq                54464  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device          7252  3 snd_seq_dummy,snd_seq_oss,snd_seq
nvidia               8104592  0
snd_intel8x0           33704  0
snd_ac97_codec        115416  1 snd_intel8x0
ac97_bus                2048  1 snd_ac97_codec
snd_pcm                74188  3 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer              22664  2 snd_seq,snd_pcm
k8temp                  5120  0
hwmon                   2976  2 w83627hf,k8temp
8139too                24128  0
forcedeth              52492  0
snd                    57224  9 snd_pcm_oss,snd_mixer_oss,snd_seq_oss,snd_seq,snd_seq_device,snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer
snd_page_alloc          8656  2 snd_intel8x0,snd_pcm
ohci1394               31156  0
i2c_nforce2             6976  0
i2c_core               21912  2 nvidia,i2c_nforce2

It froze again today (it now freezes at least once a day) so I switched to nv. (For some reason, the fonts with nv are horrendous but this should tell me whether or not nvidia-drivers is the problem.)

Thanks for your help.

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Thu Aug 21, 2008 11:24 am

Quote:
It froze again today (it now freezes at least once a day) so I switched to nv. (For some reason, the fonts with nv are horrendous but this should tell me whether or not nvidia-drivers is the problem.)

In about an hour, it froze with nv, so it's not that. Are there other things I can try to exclude as possible causes?

Thanks.

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Sun Aug 24, 2008 10:37 am

I left town for two days so I booted the box in single-user mode and just started the networking subsystem, net.eth0, net.eth1, sshd, and shorewall, so that it could act as gateway for other boxes at home. I.e. the box just had a console interface, no GUI.
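
For reference, bringing up just those services from a single-user shell can be scripted roughly as below. This is a sketch under assumptions: start_services is a made-up name, and it relies only on the Gentoo convention that each service has an init script under /etc/init.d that accepts a start argument.

```shell
#!/bin/sh
# Hypothetical helper: start the named init scripts from the given
# directory in order, stopping at the first one that fails.
start_services() {
    init_dir=$1
    shift
    for svc in "$@"; do
        "$init_dir/$svc" start || {
            echo "failed to start: $svc" >&2
            return 1
        }
    done
}

# Example (as root), using the services named above:
# start_services /etc/init.d net.eth0 net.eth1 sshd shorewall
```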

It crashed while I was gone. I'm running memtest86 now to see if it reports memory errors.

I'm at a loss how to proceed from here. Do I need to run mprime to test memory? I have Windows XP on another partition. Do I run that for a while to see if it crashes? If it does, I presume there's some hardware failure but how do I determine which hardware needs to be replaced?

I don't have very much experience with debugging this kind of failure and I rely on this box to do my work. Any pointers on how to proceed will be gratefully appreciated.

Thanks.

DirtyHairy
Guru

Joined: 03 Jul 2006
Posts: 340
Location: Würzburg, Deutschland

Posted: Sun Aug 24, 2008 11:14 am

Well, I'm no expert here either, but even if Windows runs fine, it can still be a hardware error; Windows has a reputation for being more tolerant of hardware glitches. You could try pulling out and swapping memory to see if it helps.

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Sat Aug 30, 2008 10:11 am

DirtyHairy wrote:
Well, I'm no expert here either, but even if Windows runs fine, it can still be a hardware error; Windows has a reputation for being more tolerant of hardware glitches. You could try pulling out and swapping memory to see if it helps.

memtest86 ran without errors for 8 hours. Is that a sufficient test of memory or would I need to swap memory sticks (if I can find some others) as you suggest?

cfgauss
Tux's lil' helper

Joined: 18 May 2005
Posts: 145

Posted: Mon Sep 01, 2008 11:20 am

I ran Windows XP for 12 hours. It then rebooted itself into Gentoo (grub's default) while I was away from the box for 30 mins. and immediately froze at the kdm login window. Since the rebooting seemed a fluke to me, I restarted XP and it ran continuously for 34 hours so I assumed that my hardware is OK and rebooted Gentoo in single-user mode with no services running.

Is it correct to assume that 34 lockup-free hours under XP "proves" the hardware's OK? I have a spare box with almost the same hardware and could swap hardware.

If so, can I conclude that it's a faulty Linux driver and, if so, how do I find it and get a replacement?

I'm at a loss how to proceed from here and would gratefully receive pointers.

Thanks.

DirtyHairy
Guru

Joined: 03 Jul 2006
Posts: 340
Location: Würzburg, Deutschland

Posted: Mon Sep 01, 2008 12:49 pm

Well, as I said, I wouldn't take the fact that Windows runs fine as a guarantee that the hardware is OK. There are also people who claim that memtest86 failed to find a hardware flaw for them, although I (luckily) can't share this experience.

If you are convinced that your hardware is fine, you can start diagnosing the software side by removing everything you don't REALLY need from the kernel and then trying to crash the machine. But I would still try swapping or replacing memory before that; it seems like a better shot to me (and is quicker, too).

P.S.: Did you say Windows spontaneously rebooted? If so, then this IS a hint of faulty hardware...

platojones
l33t

Joined: 23 Oct 2002
Posts: 862

Posted: Mon Sep 01, 2008 2:00 pm

It does sound like flaky hardware, to be honest with you. This whole thing started with a failed chipset fan, which implies that the chipset (Northbridge) overheated at some point. If you want to try something else (given that memtest86 is giving you no errors), boot Windows XP (which is much more tolerant of hardware faults, until it isn't...), get yourself a copy of Prime95 (http://www.mersenne.org/freesoft.htm), and run a 'stress test' instance for each core. I would run the large FFT test or the Blend test. This will stress the machine under Windows to Linux levels. If there is a hardware fault, it will either give you an error or lock up your box. Either way, you then know it's hardware and can start troubleshooting that.

If not, then obviously something fundamental changed in your Gentoo environment.

All times are GMT - 5 Hours