Gentoo Forums
[SOLVED] apache https problem - need help
Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Oct 12, 2014 9:28 am    Post subject: [SOLVED] apache https problem - need help

Hi guys,
I am faced with some very strange apache behaviour which I can't make any sense of. It might be something trivial, but despite having tried to get to the bottom of it for a few days, I am now at a loss and don't know where to look further:

Apache (2.2.27-r4) works whenever I use https://ipaddress (i.e. https://192.168.0.7/ for the start page and even https://192.168.0.7/icinga), but as soon as I use https://fqdn (i.e. https://www.mydomain.com or https://www.mydomain.com/icinga) there's no answer.

My first idea was that it is a DNS resolver issue, but both the forward and reverse resolution for www.mydomain.com work (from both the linux host and the Win7 box that I try to connect from using firefox) and return proper entries, except that on the linux box I get the following output:
Code:
# host www.mydomain.com
www.mydomain.com has address 192.168.0.7
Host www.mydomain.com not found: 3(NXDOMAIN)
www.mydomain.com mail is handled by 1 smtp.mydomain.com.
# host 192.168.0.7
7.0.168.192.in-addr.arpa domain name pointer www.mydomain.com.
The 3(NXDOMAIN) line comes from the missing IPv6 AAAA entry, as this is an IPv4-only network (according to its man page, host without any options queries the A, AAAA and MX records by default). The same output is also returned when I use host www (i.e. I leave out the domain name).
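To see which of the three default queries actually produces the NXDOMAIN line, each record type can be queried on its own. A minimal sketch, assuming the BIND host utility is installed; the function name query_types is made up for illustration:

```shell
# Query the A, AAAA and MX records for one name separately, so that the
# reply (or error) for each record type can be told apart.
# query_types is a hypothetical helper name, not part of any tool.
query_types() {
    for rtype in A AAAA MX; do
        echo "== $rtype =="
        host -t "$rtype" "$1"
    done
}

# Example (needs a reachable DNS server):
#   query_types www.mydomain.com
```

If only the AAAA block reports an error, that would support the explanation above.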

Further investigation with wireshark/tcpdump revealed that the client Win7 PC sends an SSL Client Hello packet to the server which is never answered with a Server Hello, so the communication breaks down (FF therefore also says something along the lines of "Error during data transmission"). The sequence of events (after DNS name resolution) according to wireshark is as follows:
Code:
C [55734] -> S [443]: TCP SYN (Seq=0)
S [443] -> C [55734]: TCP SYN, ACK (Seq=0 Ack=1)
C [55734] -> S [443]: TCP ACK (Seq=1 Ack=1)
C [55734] -> S [443]: SSL Client Hello
S [443] -> C [55734]: TCP ACK (Seq=1 Ack=190)
S [443] -> C [55734]: TCP FIN, ACK (Seq=1 Ack=190)
C [55734] -> S [443]: TCP ACK (Seq=190 Ack=2)
C [55734] -> S [443]: TCP FIN, ACK (Seq=190 Ack=2)
S [443] -> C [55734]: TCP ACK (Seq=2 Ack=191)
If, however, I use the IP address, a slightly different sequence of packets (and obviously no DNS resolution) can be seen:
Code:
C [55899] -> S [443]: TCP SYN (Seq=0)
S [443] -> C [55899]: TCP SYN, ACK (Seq=0 Ack=1)
C [55899] -> S [443]: TCP ACK (Seq=1 Ack=1)
C [55899] -> S [443]: TLSv1.2 Client Hello
S [443] -> C [55899]: TCP ACK (Seq=1 Ack=191)
S [443] -> C [55899]: TLSv1.2 Server Hello (reassembled packet)
C [55899] -> S [443]: TCP ACK (Seq=191  Ack=2921)
S [443] -> C [55899]: TLSv1.2 Certificate
C [55899] -> S [443]: TCP ACK (Seq=191 Ack=5641)
C [55899] -> S [443]: TLSv1.2 Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
C [55899] -> S [443]: TLSv1.2 Application Data
S [443] -> C [55899]: TLSv1.2 Application Data
The one big obvious difference (over and above the missing DNS stuff which I have left out in the first case) seems to be SSL in the fqdn case versus TLSv1.2 in the ipaddress case - something I am unable to explain (the client is Win7 with FF, latest version 32.0.3; BTW the same problem also happens when I use lynx from the linux command line to connect by fqdn; the error message is "Alert!: Unable to make secure connection to remote host."; using the ipaddress it works but throws an expected certificate error as the cert is only linked to the hostname and not the IP address).

On the gentoo box with the apache server, the apache error log shows that the reason for not replying in the first case is simply that apache segfaults:
Code:
[Sun Oct 12 10:34:59 2014] [notice] child pid 3386 exit signal Segmentation fault (11)
whereas in the second case there's no problem and everything works as expected. Even if the segfault were down to the difference between SSL and TLS apache in my view should not segfault in any case.
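To pull just the crashing PIDs out of the error log, something like the following could be used. It is a sketch that assumes the log lines look exactly like the one quoted above; segfault_pids is a made-up helper name and the log path in the example is an assumption:

```shell
# Read an Apache error log on stdin and print the PID of every child
# that exited on a segmentation fault (one PID per line).
# segfault_pids is a hypothetical helper name.
segfault_pids() {
    sed -n 's/.*child pid \([0-9][0-9]*\) exit signal Segmentation fault.*/\1/p'
}

# Example (the log path is an assumption; adjust to the local setup):
#   segfault_pids < /var/log/apache2/error_log
```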

If anybody could shed some light on this and on how to get to the bottom of it, I'd be very obliged.

Thanks in advance Atom2


Last edited by Atom2 on Mon Oct 20, 2014 10:01 pm; edited 3 times in total
Hu
Moderator
Joined: 06 Mar 2007
Posts: 21604

Posted: Sun Oct 12, 2014 3:50 pm

The different behaviour between selecting SSL vs TLSv1.2 is interesting, but could be a client artifact or even a presentation quirk caused by how the decoder interprets the truncated versus full byte streams. You buried the lead by putting the Apache crash at the bottom. That is the most interesting aspect and the top priority. What does the backtrace show when it crashes? Are there any other errors logged around this time?
Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Oct 12, 2014 5:23 pm

Hu,
many thanks for answering.
Hu wrote:
You buried the lead by putting the Apache crash at the bottom. That is the most interesting aspect and the top priority
Apologies if my report was not as clear as it should have been, but my intention was to also document how I was trying to figure out what was going on. The crash was the last thing I came across, as I did not really expect anything like that: apache still seemed to be working fine unless I used https. In hindsight, however, that was due to other workers being around and the main apache process immediately restarting its crashed child.
Hu wrote:
What does the backtrace show when it crashes?
I would require some help on how to arrive at a backtrace as I have never done that before. But I am happy to learn ...
I have, however, used strace on the crashing process (I can easily do that again and post the output if it helps), but that did not reveal a great deal to me. The crash happened immediately after a brk call which I couldn't make any sense of (brk seems to have to do with memory, so to be on the safe side I doubled the memory of the VM and rebooted, but that did not change anything). There were no obvious missing files that the process tried to open, nor anything else suspicious that caught my eye.
Hu wrote:
Are there any other errors logged around this time?
There is nothing related to the crash or the communication that led to the crash in any of my log-files (access-log, error.log, ssl_access_log, ssl_error_log, ssl_request_log) as apache doesn't even get as far as trying to deliver something: It already fails in the negotiation phase. Not even the IP address of the client I connected from shows up in *any* log-file at that time (error_log, which logs the crash, also doesn't contain the client's IP address). So in a nutshell nothing other than the fact that the process received a SIGSEGV and was killed is available in error.log.

Thanks again Atom2
ChrisJumper
Advocate
Joined: 12 Mar 2005
Posts: 2390
Location: Germany

Posted: Sun Oct 12, 2014 7:04 pm

Check the output of # hostname -f; it seems that your FQDN is wrong. If your server has one IP on the Internet, like 7.19.168.192, and another on a second card connected to your LAN, like 192.168.0.7, you have to configure things differently.

First of all, your domain.com name should resolve to the same address your Internet connection has. Clients in the LAN without an entry in their hosts file will ask for an IP address and reach your server through the Internet-facing network device. You can configure which IP Apache listens on and serve different web pages per address.

If you want to serve your website on your LAN IP too, that could be fine, but I am not sure how the SSL certificate is used. The error about

Code:
 but throws an expected certificate error as the cert is only linked to the hostname and not the IP address

could be resolved by using just that one Internet address to serve your service.
Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Oct 12, 2014 8:29 pm

Hi ChrisJumper,
thanks for your reply. I don't think that the things you suspect are actually wrong, but please see my answers below:
ChrisJumper wrote:
Check the output of # hostname -f; it seems that your FQDN is wrong.
Code:
# hostname -f
www.mydomain.com
That command was issued from the www box - that's the box with apache installed.

ChrisJumper wrote:
If your server has one IP on the Internet, like 7.19.168.192, and another on a second card connected to your LAN, like 192.168.0.7, you have to configure things differently.
Thanks for pointing that out and yes, you are right here. This, however, was simply a typo in my original post (corrected by now). The host only has one network interface (plus lo) and the former has an IP of 192.168.0.7. There's no second interface involved here. Internet access (outgoing only for the moment; incoming traffic is blocked by a firewall on the router) is through a separate router at 192.168.0.1 which is available through a bridged network setup (bridged because the whole setup currently runs under XEN and the www server is a domU in XEN terms, but that shouldn't make any difference here). The router (192.168.0.1; also a XEN domU vm) also serves as the main DNS server for the whole LAN.

ChrisJumper wrote:
First of all, your domain.com name should resolve to the same address your Internet connection has.
That is the case and domain.com only serves as a generic name in the example to hide my real domain name. My (real) domain name is valid and registered.

ChrisJumper wrote:
Clients in the LAN without an entry in their hosts file will ask for an IP address and reach your server through the Internet-facing network device.
The IP address 192.168.0.7 for the www.mydomain.com server is in the /etc/hosts file of every (virtual) gentoo box in my network and is also served by the DNS server/router at 192.168.0.1. As far as I know, however, the host command always queries the DNS server and does not consult the local /etc/hosts file.
The Win7 box did actually resolve the name www.mydomain.com by querying the DNS server. To be more precise: the DNS server for the Win7 box is 192.168.0.13, which is authoritative for samba.mydomain.com, a subdomain of mydomain.com; for other queries 192.168.0.13 has a forwarder configured, which is 192.168.0.1, the router and main DNS server. 192.168.0.1 is authoritative for mydomain.com (with a subdelegation for samba.mydomain.com to 192.168.0.13) and also fetches DNS info for domains on the internet like google.com, but I don't think that setup makes any difference here. And resolving from the Win7 box works as expected:
Code:
C:\>nslookup 192.168.0.7
Server:  storage.samba.mydomain.com
Address:  192.168.0.13

Name:    www.mydomain.com
Address:  192.168.0.7

C:\>nslookup www.mydomain.com
Server:  storage.samba.mydomain.com
Address:  192.168.0.13

Nicht autorisierende Antwort:
Name:    www.mydomain.com
Address:  192.168.0.7
Sorry for the German language output from Win7 ("Nicht autorisierende Antwort" means "Non-authoritative answer"), but, as you might guess, that's a German language version.

ChrisJumper wrote:
You can configure which IP Apache listens on and serve different web pages per address.
I assume you refer to an entry in both 00_default_ssl_vhost.conf and 00_default_vhost.conf under /etc/apache2/vhosts.d: Both contain an entry reading
Code:
Listen 192.168.0.7:PORT
where PORT is 80 for 00_default_vhost.conf and 443 for 00_default_ssl_vhost.conf.

ChrisJumper wrote:
But I am not sure how the SSL certificate is used. The error about
Code:
 but throws an expected certificate error as the cert is only linked to the hostname and not the IP address
could be resolved by using just that one Internet address to serve your service.
The SSL certificate stuff is set up with a (self-signed) ROOT CA which has issued a CA certificate with signing authority for the domain mydomain.com which in turn issued a certificate to www.mydomain.com. The FF browser on the Win7 box has the ROOT CA as a trusted certificate in its certificate store and the server delivers the certificate chain up to that (trusted) ROOT CA. That should work pretty well (and in fact does work very well, as can be seen when I check the [rejected] certificate that is delivered through https://ipaddress: the chain up to the trusted ROOT CA is correctly displayed in FF).

I know that I could add the IP address to the certificate to circumvent the cert error for the https://ipaddress query, but that's not what I want to do as I would then need to change the server cert whenever I change its (internal) IP address. Just using hostnames is much more flexible in my view.

I hope that clarifies a few bits and pieces and helps in ruling out issues you thought you had identified as possible sources of errors. If not, please ask for more specifics and I am happy to provide those.

Thanks again Atom2
Hu
Moderator
Joined: 06 Mar 2007
Posts: 21604

Posted: Sun Oct 12, 2014 9:31 pm

First, you need an Apache with symbols. You need at least one of FEATURES=nostrip or FEATURES=splitdebug. You need to set -g or -ggdb in your CFLAGS. Once those are done, rebuild Apache. If you had already done those before you installed this version of Apache, then you do not need to rebuild it again. However, most people do not set those options unless they expect to be collecting backtraces or doing other development work, so you probably have not set those options.

Once Apache has symbols, then you need either to attach a debugger to the process which will crash or you need to obtain a core file when the process crashes. If you go the core file route, open the core file in gdb, then run bt. If you attach before the process crashes, run bt when gdb displays a prompt indicating that the process received SIGSEGV. Post the output of the bt command. It should show function names, file names, and possibly arguments.
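The steps above might look roughly like this on a Gentoo box. This is a sketch under assumptions, not a verified recipe: the package atom www-servers/apache, the pgrep pattern apache2 and the helper name bt_on_crash do not come from this thread. The make.conf changes and the rebuild are shown as comments only, because they alter the system.

```shell
# 1. In /etc/portage/make.conf (shown as comments; edit by hand):
#      CFLAGS="${CFLAGS} -ggdb"
#      FEATURES="${FEATURES} splitdebug"
# 2. Rebuild Apache so the new settings take effect (package atom assumed):
#      emerge --oneshot www-servers/apache

# 3. Attach gdb to one running process, let it continue, and print a
#    backtrace once it stops (for example on SIGSEGV).
#    bt_on_crash is a hypothetical helper name.
bt_on_crash() {
    gdb -p "$1" -batch -ex continue -ex bt
}

# Example: watch every Apache process, one gdb per PID, in the background
# (the process name apache2 is an assumption):
#   for pid in $(pgrep apache2); do bt_on_crash "$pid" > "bt.$pid.txt" 2>&1 & done
```

Attaching to every worker matters here because it is not known in advance which child will pick up the failing connection.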
Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Mon Oct 13, 2014 1:20 am

Hu,
again thanks for your comprehensive explanation - that helped me understand how gdb works and what's required to make it work. This is very helpful in case something else crops up in the future.

Short story: everything works again.

Long story: The steps I have taken are listed below, and I honestly don't get why it is working again.
1.) Did my regular update today without any change to /etc/portage/make.conf. This pulled in the following updates:
  • sys-apps/file-5.19
  • sys-apps/hwids-20141010
  • app-shells/bash-completion-2.1-r2
  • dev-libs/newt-0.52.15
  • net-libs/neon-0.30.0-r1
  • sys-apps/help2man-1.45.1
  • dev-lang/python-3.4.1 (to new slot)
  • dev-php/PEAR-Archive_Tar-1.3.11
    NOTE: I have not tested the https connection after that update again.
2.) change /etc/portage/make.conf: added "-ggdb" to CFLAGS variable; added splitdebug to FEATURES variable
3.) emerge apache
4.) restart apache and try to attach gdb - only to find out that gdb is missing
5.) emerge gdb (USE flag change: -server)
6.) attach to all running apache processes and issue command "continue" within each gdb process
7.) connect from Win7 PC and ... surprise, surprise ... everything works ... what's going on, I don't get it
8.) reset changes in /etc/portage/make.conf (i.e. revert the changes from 2 above)
9.) emerge apache
10.) restart apache
11.) connect from Win7 PC and ... everything still works

What do I make of this? Probably there was some inconsistency somewhere in apache caused by another (previous) update which did not trigger a rebuild requirement for apache but actually should have?

Probably dev-libs/openssl which was updated to dev-libs/openssl-1.0.1i on Oct 6; apache was not rebuilt after that (the last emerge of apache prior to today dates back to Aug 21). But openssl at that time did not demand anything special (unless I missed that) and neither @preserved-rebuild nor revdep-rebuild (both of which together with depclean are part of my regular update script) complained about anything or found any (in)consistency issues.
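Whether a library was merged more recently than a program that links against it can be checked from the emerge log. The following is only a sketch: it assumes the usual "completed emerge" line layout of /var/log/emerge.log, matches package names by simple prefix, and uses made-up helper names (last_merge, needs_rebuild).

```shell
# Print the epoch timestamp of the last completed merge of package $1,
# reading emerge.log-style lines on stdin. Assumed line layout:
#   <epoch>:  ::: completed emerge (N of M) <category/package-version> to /
# Matching is by prefix, so similarly named packages may also match.
last_merge() {
    awk -v pkg="$1" '
        / ::: completed emerge / {
            if (index($0, ") " pkg "-") > 0) { sub(/:$/, "", $1); ts = $1 }
        }
        END { if (ts != "") print ts }
    '
}

# Succeed if library $1 was merged after consumer $2 according to log file $3.
# needs_rebuild is a hypothetical helper name.
needs_rebuild() {
    lib_ts=$(last_merge "$1" < "$3")
    app_ts=$(last_merge "$2" < "$3")
    [ -n "$lib_ts" ] && [ -n "$app_ts" ] && [ "$lib_ts" -gt "$app_ts" ]
}

# Example:
#   needs_rebuild dev-libs/openssl www-servers/apache /var/log/emerge.log \
#       && echo "apache is older than openssl; consider rebuilding it"
```

A newer library does not prove an ABI break; it only flags candidates worth rebuilding.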

So it is all a bit strange, but it works again.

Many thanks again Atom2


P.S. Unless you have any objections I suggest I'll mark the thread as solved.
Hu
Moderator
Joined: 06 Mar 2007
Posts: 21604

Posted: Mon Oct 13, 2014 10:34 pm

I agree with marking it solved, since there is no known way to get it back into a broken state and still have debug symbols. OpenSSL is infamous for exposing ABI details to callers, so if the OpenSSL update changed the layout of some exposed structure, programs not rebuilt to use the new definition might break when calling the new library. This might be announced by the Gentoo maintainers as an elog message if they knew it would happen, but definitely would not be reported by the automated tools that flag soname changes. If they were not aware it would happen, they would have no reason to put out such an announcement. Such a change, if that is even the cause, could easily be missed by the maintainers, since finding it would require a detailed examination of the differences between versions.