Author |
Message |
endemic n00b


Joined: 06 Oct 2003 Posts: 23 Location: Dayton, OH
|
Posted: Thu Apr 20, 2006 3:01 pm Post subject: System locks (Ultra 60) |
|
|
Recently I've been experiencing issues with my Ultra 60. If the machine is left running, it eventually stops responding, and even the console connection doesn't work. After rebooting and digging through /var/log/messages, this is the only pertinent entry I can find:
Code: |
Apr 20 06:52:36 sirius Unable to handle kernel NULL pointer dereference
Apr 20 06:52:36 sirius tsk->{mm,active_mm}->context = 0000000000000427
Apr 20 06:52:36 sirius tsk->{mm,active_mm}->pgd = fffff800bf06e000
Apr 20 06:52:36 sirius \|/ ____ \|/
Apr 20 06:52:36 sirius "@'/ .. \`@"
Apr 20 06:52:36 sirius /_| \__/ |_\
Apr 20 06:52:36 sirius \__U_/
Apr 20 06:52:36 sirius devfsd(43): Oops
Apr 20 06:52:36 sirius CPU[0]: local_irq_count[0] irqs_running[0]
Apr 20 06:52:36 sirius TSTATE: 0000009980009603 TPC: 0000000000422d70 TNPC: 0000000000422d74 Y: 00000000 Not tainted
Apr 20 06:52:36 sirius g0: 0000000000000002 g1: 00000427348b4dd0 g2: 0000000000000001 g3: fffff800bed4bc40
Apr 20 06:52:36 sirius g4: fffff80000000000 g5: 0000000000000000 g6: fffff800bed48000 g7: 0000000000000000
Apr 20 06:52:36 sirius o0: 0000000000000080 o1: 0000000000000000 o2: fffff800bed4bbe0 o3: 0000000000000008
Apr 20 06:52:36 sirius o4: 0000000000000000 o5: fffff800bf392ae8 sp: fffff800bed4b0d1 ret_pc: 0000000000422d18
Apr 20 06:52:36 sirius l0: 0000000000000083 l1: 0000000000000068 l2: 0000000000000002 l3: 0000000000000000
Apr 20 06:52:36 sirius l4: 0000000000000002 l5: 0000000000010610 l6: 000000007002c568 l7: 000000007002b90c
Apr 20 06:52:36 sirius i0: fffff800bed4bba0 i1: 00000000d0162082 i2: 0000000000000083 i3: 0000000000800009
Apr 20 06:52:36 sirius i4: 00000000004c64e8 i5: 0000000000000000 i6: fffff800bed4b1a1 i7: 000000000041be20
Apr 20 06:52:36 sirius Caller[000000000041be20]
Apr 20 06:52:36 sirius Caller[0000000000410b40]
Apr 20 06:52:36 sirius Caller[000000000040864c]
Apr 20 06:52:36 sirius Caller[00000000004c8324]
Apr 20 06:52:36 sirius Caller[0000000000475a30]
Apr 20 06:52:36 sirius Caller[0000000000410eb4]
Apr 20 06:52:36 sirius Caller[00000000000155cc]
Apr 20 06:52:36 sirius Instruction DUMP: 16400019 80a52004 0248000a <e28c2000> e48c2001 a32c6008 02ca4004 a2044012 a32c7030
Apr 20 06:52:36 sirius CPU[2]: local_irq_count[0] irqs_running[0]
Apr 20 06:52:36 sirius TSTATE: 00000099f0000a06 TPC: 0000000070137838 TNPC: 000000007013783c Y: 00000000 Not tainted
Apr 20 06:52:36 sirius g0: 0000000000000000 g1: 000000000000002c g2: 0000000070066760 g3: 00000000700378f8
Apr 20 06:52:36 sirius g4: 000000007007fd2c g5: 7578000000000000 g6: 0000000000036760 g7: 000000007002d800
Apr 20 06:52:36 sirius o0: 000000007003b788 o1: 0000000000000000 o2: 0000000070030000 o3: 000000007002ca88
Apr 20 06:52:36 sirius o4: 000000007003b788 o5: 0000000000000012 sp: 00000000effffaa8 ret_pc: 00000000000050ca
Apr 20 06:52:36 sirius l0: 00000000 l1: 00000000 l2: 00000000 l3: 00000000 l4: 00000000 l5: 00000000 l6: 00000000 l7: 7016d000
Apr 20 06:52:36 sirius i0: 700a47a0 i1: effffb78 i2: effffb74 i3: 00000000 i4: 00000000 i5: 00000000 i6: effffb10 i7: 700a49dc
|
This is a dual-processor machine, and I suspect that one of the processors may be dying. I would love to hear good news about how it isn't my hardware, but does anyone have any ideas about what could be causing this issue? |
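The raw addresses in an oops like the one above only become readable once they are mapped to kernel symbol names. On 2.4 kernels that was normally the job of ksymoops run against the matching System.map. The Python below is only a rough sketch of the lookup idea, not a replacement for that tool. The System.map excerpt and its symbol names are invented for illustration; only the TPC and Caller addresses come from the log in this post.

```python
import bisect
import re

# Hypothetical excerpt in System.map format: "<hex address> <type> <name>".
# These addresses and names are made up for illustration only.
SAMPLE_SYSTEM_MAP = """\
0000000000408000 T example_trap_entry
0000000000410000 T example_syscall_path
000000000041b000 T example_fault_handler
0000000000422000 T example_devfs_helper
"""

# A few lines copied from the oops posted above.
OOPS_EXCERPT = """\
Apr 20 06:52:36 sirius TSTATE: 0000009980009603 TPC: 0000000000422d70 TNPC: 0000000000422d74 Y: 00000000 Not tainted
Apr 20 06:52:36 sirius Caller[000000000041be20]
Apr 20 06:52:36 sirius Caller[0000000000410b40]
Apr 20 06:52:36 sirius Caller[000000000040864c]
"""


def load_symbols(system_map_text):
    """Parse System.map-style text into a sorted list of (address, name)."""
    symbols = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def extract_addresses(oops_text):
    """Return the first TPC value followed by every Caller[...] address."""
    addrs = []
    tpc = re.search(r"\bTPC: ([0-9a-f]{16})", oops_text)
    if tpc:
        addrs.append(int(tpc.group(1), 16))
    for match in re.findall(r"Caller\[([0-9a-f]{16})\]", oops_text):
        addrs.append(int(match, 16))
    return addrs


def resolve(addr, symbols):
    """Map an address to 'symbol+0xoffset' using the nearest lower symbol."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    start, name = symbols[i]
    return f"{name}+0x{addr - start:x}"


if __name__ == "__main__":
    table = load_symbols(SAMPLE_SYSTEM_MAP)
    for address in extract_addresses(OOPS_EXCERPT):
        print(f"{address:016x} {resolve(address, table)}")
```

With the real System.map from the running kernel in place of the sample table, the same lookup would show which kernel functions were on the stack when devfsd hit the NULL pointer dereference.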
|
chance2105 Tux's lil' helper


Joined: 10 Jun 2004 Posts: 112 Location: Norman, OK USA
|
Posted: Sat Apr 22, 2006 2:32 am Post subject: |
|
|
Ok well .. let's get some more information. I'm not exactly the greatest kernel person, but here goes nothing.
That looks like a 2.4 kernel OOPS. Is that the case?
Also, is that OOPS the last thing you see in your logs? Is there anything before or after it? IOW, does your machine die right at that point?
When it stops responding, has just X stopped responding, or are you able to SSH into it?
About your processor .. heh. heh. heh. If it's not a hardware issue of some kind, I'd be surprised. For me, 2.4 series kernels on a U60 are absolutely stable (with a few notable exceptions, such as particular USB chipsets). It'd be interesting to see the output of lspci.
chance |
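One way to answer the "anything before or after it?" question is to pull the lines surrounding the oops marker out of the log. The following is a minimal sketch, assuming the log is plain text with one entry per line. The helper name `context_around` and the sample entries around the oops line are hypothetical.

```python
def context_around(lines, marker, before=5, after=5):
    """Return the lines surrounding the first line that contains marker."""
    for i, line in enumerate(lines):
        if marker in line:
            return lines[max(0, i - before): i + after + 1]
    return []  # marker never appeared


if __name__ == "__main__":
    # Hypothetical log; only the oops line is taken from the thread.
    sample = [
        "Apr 20 06:50:01 sirius example entry one",
        "Apr 20 06:51:14 sirius example entry two",
        "Apr 20 06:52:36 sirius Unable to handle kernel NULL pointer dereference",
        "Apr 20 06:52:36 sirius devfsd(43): Oops",
    ]
    for entry in context_around(sample, "Unable to handle kernel", before=2, after=1):
        print(entry)
```

To run it against a real log, read the file into a list first, for example with `open("/var/log/messages").read().splitlines()`, which typically needs root privileges.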
|
endemic n00b


Joined: 06 Oct 2003 Posts: 23 Location: Dayton, OH
|
Posted: Sat Apr 22, 2006 5:27 pm Post subject: |
|
|
Yes, it was a 2.4 kernel OOPS. It was the only thing I really saw in the log before the dump, and the machine does usually die at that point.
Also, the box completely stops responding at that point. You are unable to even use the serial console (which would normally still drop the kernel dump information even if the machine is locked).
Unfortunately, it does appear to be one of the processors going. I have removed the suspected processor and have not experienced this issue again. So it's back to a single processor until I can locate another 450MHz UltraSPARC II for a reasonable price. |
|