Gentoo Forums
Apache trouble
Forum: Gentoo on Sparc
ba
l33t


Joined: 25 May 2003
Posts: 804

Posted: Wed Mar 30, 2005 11:48 pm    Post subject: Apache trouble

Some time ago Apache started dying with a bus error on about half of all HTTPS requests. I'm not sure what caused it (there was a system update a week earlier, so maybe that is the reason).

In error_log
Code:

[Thu Mar 31 02:26:45 2005] [notice] Apache/2.0.52 (Gentoo/Linux) mod_ssl/2.0.52 OpenSSL/0.9.7e configured -- resuming normal operations
[Thu Mar 31 02:27:09 2005] [notice] child pid 1441 exit signal Bus error (10)
[Thu Mar 31 02:27:09 2005] [error] cgid daemon process died, restarting
[Thu Mar 31 02:27:11 2005] [notice] child pid 1640 exit signal Bus error (10)
[Thu Mar 31 02:27:22 2005] [notice] child pid 1903 exit signal Bus error (10)
[Thu Mar 31 02:30:08 2005] [notice] child pid 1646 exit signal Bus error (10)
[Thu Mar 31 02:30:09 2005] [notice] child pid 1905 exit signal Bus error (10)


strace
Code:

Process 2339 attached - interrupt to quit
fcntl64(168, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN, revents=POLLIN}], 2, -1) = 1
accept(3, {sa_family=AF_INET, sin_port=htons(59132), sin_addr=inet_addr("xxx.xxx.xxx.xxx")}, [16]) = 170
fcntl64(168, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=0, len=0}) = 0
getsockname(170, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("xxx.xxx.xxx.xxx")}, [16]) = 0
time(NULL)                              = 1112224626
brk(0)                                  = 0x5dc000
brk(0x5fe000)                           = 0x5fe000
fcntl64(170, F_GETFL)                   = 0x2 (flags O_RDWR)
fcntl64(170, F_SETFL, O_RDWR|O_NONBLOCK) = 0
time(NULL)                              = 1112224626
read(170, "\200g\1\3\0\0N\0\0\0\20\1\0\200\3\0\200\7\0\300\6\0@\2"..., 8000) = 105
time(NULL)                              = 1112224626
time(NULL)                              = 1112224626
getpid()                                = 2339
time(NULL)                              = 1112224626
getpid()                                = 2339
time([1112224626])                      = 1112224626
getpid()                                = 2339
--- SIGBUS (Bus error) @ 0 (0) ---
chdir("/usr/lib/apache2")               = 0
rt_sigaction(SIGBUS, {SIG_DFL}, {SIG_DFL}, 0x74c5bb78, 0) = 0
getpid()                                = 2339
getpid()                                = 2339
kill(2339, SIGBUS)                      = 0
sigreturn()                             = ? (mask now [QUIT ABRT KILL SYS PIPE TSTP CONT TTIN IO XCPU PROF LOST USR1 USR2])
--- SIGBUS (Bus error) @ 0 (0) ---
Process 2339 detached


backtrace from gdb
Code:

(gdb) attach 1940
Attaching to process 1940
...
(gdb) cont
Continuing.

Program received signal SIGBUS, Bus error.
0x74c9c1e8 in mallopt () from /lib/libc.so.6
(gdb) bt
#0  0x74c9c1e8 in mallopt () from /lib/libc.so.6
#1  0x74d6192c in __after_morecore_hook () from /lib/libc.so.6
#2  0x74d6192c in __after_morecore_hook () from /lib/libc.so.6
Previous frame identical to this frame (corrupt stack?)


I tried emerge -e apache and downgrading gcc, glibc, apache, and openssl, with no success. Any ideas?

And sorry for my English.
labrador
Guru


Joined: 04 Oct 2003
Posts: 316

Posted: Thu Mar 31, 2005 5:22 pm    Post subject: compare/contrast

Compare the details of the various crashes. Does it always exit with
the same consistent error? If so, there is something at the application
level to debug. Is it associated with certain pages or traffic on the
web server, or can it happen when there are no hits on it?
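
One quick way to compare the crashes is to tally the child exit lines in error_log by signal. The following is a minimal Python sketch, assuming the log lines have the same shape as the ones posted above. The function name tally_exit_signals and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in the posted error_log, e.g.
# [Thu Mar 31 02:27:09 2005] [notice] child pid 1441 exit signal Bus error (10)
EXIT_RE = re.compile(r"child pid (\d+) exit signal (.+?) \((\d+)\)")


def tally_exit_signals(lines):
    """Count child exits per (signal name, signal number)."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts


# A few lines copied from the log in the first post.
sample = [
    "[Thu Mar 31 02:26:45 2005] [notice] Apache/2.0.52 (Gentoo/Linux) configured",
    "[Thu Mar 31 02:27:09 2005] [notice] child pid 1441 exit signal Bus error (10)",
    "[Thu Mar 31 02:27:09 2005] [error] cgid daemon process died, restarting",
    "[Thu Mar 31 02:27:11 2005] [notice] child pid 1640 exit signal Bus error (10)",
]

counts = tally_exit_signals(sample)
for (name, number), seen in counts.most_common():
    print(seen, name, number)
```

If every exit lands in one bucket, that points towards a repeatable application-level fault. A spread of different signals would fit the "different every time" pattern instead.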

If the problem shows up differently every time, it is likely a hardware issue.
Check the CPU heatsink for dust. I've seen overheating cause what looked like a consistent
error when emerging a certain ebuild, but on closer inspection it was a slightly
different error each time, and the hardware (overheating) was the failure point.
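
To judge whether each crash really is the same, it can help to compare what the traced process did just before the signal across several strace captures. This is a rough Python sketch, assuming single-process strace output in the format shown in the first post (no "[pid N]" prefixes as produced by strace -f). The function name calls_before_signal is hypothetical.

```python
def calls_before_signal(strace_lines, signal="SIGBUS", keep=3):
    """Return the names of the last `keep` syscalls logged before the
    first line reporting `signal`, or None if the signal never appears."""
    names = []
    for line in strace_lines:
        line = line.strip()
        if line.startswith("--- " + signal):
            return names[-keep:]
        # Syscall lines look like: name(args...) = result
        head, paren, _rest = line.partition("(")
        if paren and head.isidentifier():
            names.append(head)
    return None


# Tail of the strace capture from the first post.
sample = [
    "Process 2339 attached - interrupt to quit",
    r'read(170, "\200g\1\3\0\0N"..., 8000) = 105',
    "time(NULL)                              = 1112224626",
    "getpid()                                = 2339",
    "time([1112224626])                      = 1112224626",
    "getpid()                                = 2339",
    "--- SIGBUS (Bus error) @ 0 (0) ---",
]

tail = calls_before_signal(sample)
print(tail)
```

Running this over several captures and getting the same tail each time would suggest one repeatable code path. Widely varying tails would lean towards the hardware explanation.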

A can of compressed air can sometimes fix hardware problems.

Another thing that can cause random weirdness is a damaged file system.
I had a reiserfs file system with the 2.6 sparc kernel, and after a power outage
everything seemed OK, but certain things could not build. I've since
learned that reiserfs with the 2.6 kernel on sparc is not stable.

Check the file system by booting a Live CD and running fsck
against your / partition. Type fsck and press Tab twice to see all of the flavours
of fsck available, and use the one that matches your file system type.
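
If you are not sure which file system type / uses, it can be read from /proc/mounts on the installed system before rebooting into the Live CD. Here is a small Python sketch that parses text in that format. The function name root_fs_type and the device name in the sample are made up for illustration, and the "fsck.<type>" helper name is only the usual naming convention, so check that the helper actually exists on the Live CD.

```python
def root_fs_type(mounts_text):
    """Return the file system type mounted on / from /proc/mounts-style
    text, or None if no such entry is found."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == "/" and fields[2] != "rootfs":
            fstype = fields[2]  # a later entry for / overrides an earlier one
    return fstype


# Hypothetical sample; on a real system read the text from /proc/mounts.
sample = """rootfs / rootfs rw 0 0
/dev/sda1 / reiserfs rw 0 0
proc /proc proc rw 0 0
"""

fstype = root_fs_type(sample)
helper = "fsck." + fstype if fstype else None
print(fstype, helper)
```

Run this on the installed system, not from the Live CD, because there /proc/mounts describes the CD's own root.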
ba
l33t


Joined: 25 May 2003
Posts: 804

Posted: Thu Mar 31, 2005 8:32 pm    Post subject: Re: compare/contrast

labrador wrote:
Compare the details of the various crashes. Does it always exit with
the same consistent error?

Yes.

labrador wrote:
Is it associated with certain pages or traffic on the
web server, or can it happen when there are no hits on it?

It is associated with HTTPS requests.
All times are GMT