Gentoo Forums
reverting to gcc 4.7.3 from 4.8.3 possible - xl segfaults

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Fri Nov 07, 2014 10:18 pm    Post subject: reverting to gcc 4.7.3 from 4.8.3 possible - xl segfaults

Hi guys,
I am currently running into a problem with XEN. I have been using XEN for almost a year under gentoo-hardened with kernels from 3.11 onwards (currently I am on 3.15.10) and xen versions from 4.3.1 to 4.3.2 without any problems. The gcc version was always 4.7.3-r1.

Recently during a regular system update portage suggested gcc-4.8.3 and I installed that. I re-created the toolchain and re-emerged system and world - all without problems. The next day portage suggested an upgrade to xen-4.3.3 (including xen-tools) which I did. I also re-compiled the kernel with the new tool-chain as a reboot was required anyway for the new xen version to take effect.
At reboot time I received an error about a segfault in libgcc_s.so.1 and when the system restarted, PCI passthrough to HVM domains no longer worked: Any attempt to create an HVM domU resulted in a segfault with the following message in /var/log/messages:
Code:
[41818.516206] xl[25854]: segfault at 7f2a65aadeb0 ip 00007f2a6334a624 sp 00007f2a65aadeb0 error 6 in libgcc_s.so.1[7f2a6333c000+16000]
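The kernel line above carries enough information to locate the crash inside the library: subtracting the mapping base (the first number in the square brackets) from the instruction pointer (ip) gives the offset into libgcc_s.so.1. A minimal sketch of that arithmetic, assuming the message format shown above; the function name segv_offset is made up for illustration:

```shell
# Sketch: compute the offset of the faulting instruction inside the mapped
# library from a kernel "segfault at ... ip ... in lib[base+size]" line.
# segv_offset is a hypothetical helper name, not an existing tool.
segv_offset() {
    line=$1
    # Instruction pointer: the hex number following " ip ".
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # Mapping base: the hex number before "+" inside the last [...] pair.
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    printf '0x%x\n' $(( 0x$ip - 0x$base ))
}

# For the log line quoted above this prints 0xe624:
# segv_offset '[41818.516206] xl[25854]: segfault at 7f2a65aadeb0 ip 00007f2a6334a624 sp 00007f2a65aadeb0 error 6 in libgcc_s.so.1[7f2a6333c000+16000]'
```

If the library was built with debug symbols, an offset like this can be passed to a tool such as addr2line (with -e pointing at the library file) to get a function name even when gdb only shows ?? frames.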
A bt under gdb apparently did not reveal any useful information either, and the guys on the XEN mailing list could not make any sense of it (gdb is beyond my current capabilities; I only recently learned that a backtrace might be useful in case of segfaults and how to create one thanks to Hu from this great forum - but that's about my current level of experience with gdb). Nevertheless I thought it might make sense to post the bt output in the hope that somebody on here is able to get something out of it:
Code:
GNU gdb (Gentoo 7.6.2 p1) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/sbin/xl...Reading symbols from /usr/lib64/debug/usr/sbin/xl.debug...done.
done.
(gdb) run
Starting program: /usr/sbin/xl create pfsense -c
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Parsing config from pfsense
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->00000000001c12a4
  Modules:       0000000000000000->0000000000000000
  TOTAL:         0000000000000000->000000001f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000000fb
  1GB PAGES: 0x0000000000000000
[New Thread 0x7ffff7ff5700 (LWP 13464)]
[New Thread 0x7ffff7fe6700 (LWP 13574)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fe6700 (LWP 13574)]
0x00007ffff5882b64 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
(gdb) bt
#0  0x00007ffff5882b64 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#1  0x00007ffff58835cc in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#2  0x00007ffff5883945 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#3  0x00007ffff58845c6 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#4  0x00007ffff588494c in _Unwind_ForcedUnwind () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#5  0x00007ffff731a733 in __pthread_unwind () from /lib64/libpthread.so.0
#6  0x00007ffff7311b49 in sigcancel_handler () from /lib64/libpthread.so.0
#7  <signal handler called>
#8  0x00007ffff731ae4d in read () from /lib64/libpthread.so.0
#9  0x00007ffff6b17b53 in read (__nbytes=16, __buf=0x7fffe80008d0, __fd=18) at /usr/include/bits/unistd.h:44
#10 read_all (fd=18, data=data@entry=0x7fffe80008d0, len=len@entry=16, nonblocking=nonblocking@entry=0) at xs.c:374
#11 0x00007ffff6b17c94 in read_message (h=h@entry=0x555555785670, nonblocking=nonblocking@entry=0) at xs.c:1139
#12 0x00007ffff6b18626 in read_thread (arg=0x555555785670) at xs.c:1211
#13 0x00007ffff731332d in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff704a19d in clone () from /lib64/libc.so.6
(gdb)

I tested two older kernels just in case (both of which had been compiled with the old gcc) to no avail. I therefore thought xen-4.3.3 might be buggy and, with a lot of effort, pulled the old xen-4.3.2 from the attic (it was no longer in portage due to security issues) and downgraded through the use of an overlay - only to find that after a restart xen-4.3.2 also segfaulted.

Investigation of the logfiles revealed the following sequence of emerges and other relevant events since the last reboot:
Code:
11.10.14 04:13: Last system reboot with working version (xen 4.3.2-r5)
                (xen-4.3.2-r5 was in use since 21.08.14)
18.10.14 22:50: Last successful creation of HVM with PCI passthrough
                (that domU run up to 26.10.14 as did another HVM)

Updates and new package installs since last reboot:
22.10.14:       app-misc/pax-utils-0.8.1 (update)
24.10.14:       dev-libs/libaio-0.3.110 (update)
                dev-libs/popt-1.16-r2 (update)
                sys-libs/libcap-ng-0.7.3 (new)
                dev-libs/libgcrypt-1.5.4-r1 (update)
                net-analyzer/tcpdump-4.6.2 (update)
25.10.14:       sys-devel/gcc-4.8.3 (update from 4.7.3-r1)
26.10.14:       app-emulation/xen-tools-4.3.3-r1 (update from 4.3.2-r5)
                app-emulation/xen-4.3.3-r1 (update from 4.3.2-r5)

26.10.14:       reboot - 1st segfault msg in syslog at shutdown time
                system reboots, can't start HVM PCI passthrough domUs
                segfault messages in syslog referring to libgcc_s.so.1
                problems since despite world/kernel/system recompile
To me the only obvious dependency that might explain the current problems of both xen-4.3.3 and now (after a re-compile) also 4.3.2 (which used to work flawlessly for many months before) is gcc (NOTE: according to a depgraph, xen also depends on libgcrypt and pax-utils - although I don't know what they are used for and what they do; they might possibly also be part of the problems I am experiencing ...). Also libgcc_s.so.1 is part of the gcc package - so that's probably another pointer towards a gcc issue.
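A timeline like the one above can usually be reconstructed from Portage's own log rather than from memory. A rough sketch, assuming the log lives at /var/log/emerge.log and that finished merges are recorded as "<epoch>:  ::: completed emerge (N of M) <package> to /" (both are assumptions about this particular system, and the function name merge_history is invented here):

```shell
# Sketch: list completed merges from emerge.log-style input on stdin.
# Prints "<epoch> <category/package-version>" per completed merge.
# Assumes the "::: completed emerge (N of M) <pkg> to <root>" line layout.
merge_history() {
    awk '/::: completed emerge/ {
        sub(/:$/, "", $1)      # drop the trailing colon from the timestamp
        print $1, $(NF-2)      # the package is the third field from the end
    }'
}

# Possible use:
#   merge_history < /var/log/emerge.log | tail -n 20
```

The first column is a Unix timestamp; on GNU systems date -d @<epoch> turns it into a readable date.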

I am happy to follow any other ideas that might come up from the combined intelligence on here, but thought that downgrading gcc to 4.7.3 would either solve the problem or rule out one more possible issue. As far as I read it is not recommended to downgrade a gcc version so I would ask for help if that's possible at all and what needs to be taken care of before doing so. I would not want to end up with an unbootable system.

Oh, and BTW the rest of the system without XEN is still rock-solid and there are no issues whatsoever.

Thanks for any input Atom2

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21635

Posted: Sat Nov 08, 2014 3:05 am

If you still have gcc-4.7 installed, you can switch to it as the system compiler and rebuild the Xen tools. It is unlikely that such a change will render the system unbootable. If you need to rebuild gcc, I suggest rebuilding it with symbols, so that future backtraces from libgcc_s.so have function names.

cyberbat
n00b
Joined: 31 Jan 2011
Posts: 13

Posted: Sat Nov 08, 2014 5:56 pm

I confirm this. After updating to sys-devel/gcc-4.8.3 and rebuilding app-emulation/xen-tools-4.4.1-r3 I began getting segfaults on nearly every action with xl: xl create, xl block-attach and so on. Just to be clear, all these actions seem to finish OK regardless of the segfaults. I'm going to gather more information and then file a bug at bgo.

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sat Nov 08, 2014 9:55 pm

cyberbat wrote:
I confirm this. After updating to sys-devel/gcc-4.8.3 and rebuilding app-emulation/xen-tools-4.4.1-r3 I began getting segfaults on nearly every action with xl: xl create, xl block-attach and so on. Just to be clear, all these actions seem to finish OK regardless of the segfaults. I'm going to gather more information and then file a bug at bgo.

Have you tried starting an HVM domU with PCI passthrough devices? That does not seem to work at all: No PCI devices are being passed through, which in my case renders those domUs useless. Those HVM domUs with PCI passthrough devices also appear to be in a paused state after xl create and can only be stopped by xl destroy - which segfaults again.

In any case, as strange as it sounds, I am somehow glad that I am not the only one experiencing those issues, as I was already starting to think I was going crazy. Also see my next post with results on downgrading to gcc-4.7.3

cyberbat
n00b
Joined: 31 Jan 2011
Posts: 13

Posted: Sat Nov 08, 2014 10:03 pm

Atom2 wrote:
cyberbat wrote:
I confirm this. After updating to sys-devel/gcc-4.8.3 and rebuilding app-emulation/xen-tools-4.4.1-r3 I began getting segfaults on nearly every action with xl: xl create, xl block-attach and so on. Just to be clear, all these actions seem to finish OK regardless of the segfaults. I'm going to gather more information and then file a bug at bgo.

In any case, as strange as it sounds, I am somehow glad that I am not the only one experiencing those issues, as I was already starting to think I was going crazy. Also see my next post with results on downgrading to gcc-4.7.3


Do you have a hardened system?

I've filed a bug at bgo. Try to collect the same info as I did and post it there.

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sat Nov 08, 2014 10:35 pm

Hi Hu,
first of all many thanks for your reply - it's very much appreciated. I was very much hoping for the thread to again catch your eye (like last time with the segfault and apache/ssl if you remember; that taught me a lot about providing better information like the backtrace [though interpreting its output is still somehow a mystery to me] and I am sure I'll leave this thread with some more knowledge).
Hu wrote:
If you still have gcc-4.7 installed, you can switch to it as the system compiler and rebuild the Xen tools.
Unfortunately that did not work as I had already unmerged the old gcc compiler. No big deal though as it's still available in the tree. So I did the following steps:

1.) emerge -1 sys-devel/gcc-4.7.3 (with CFLAGS -ggdb and FEATURES splitdebug)
2.) gcc-config 1 (which refers to the old gcc-4.7.3)
3.) env-update && . /etc/profile
4.) emerge -1 xen-tools-4.3.2-r5 xen-4.3.2-r5 xen-pvgrub-4.3.2 (with CFLAGS -ggdb and FEATURES splitdebug)
5.) emerge -1 glibc (with CFLAGS -ggdb and FEATURES splitdebug)
6.) reboot the system - without problems
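
The steps above do not show how the CFLAGS and FEATURES settings were passed. One common Portage approach is a per-package environment file referenced from package.env. A sketch under that assumption; the helper name write_debug_env, the file name debugsyms.conf and the scratch-directory parameter are all made up for illustration, and the package list simply mirrors the packages named in the thread:

```shell
# Sketch: write a per-package Portage environment that adds debug symbols.
# The first argument is a hypothetical root directory so the files can be
# previewed in a scratch location; use "/" only after reviewing the result.
write_debug_env() {
    root_dir=${1%/}
    mkdir -p "$root_dir/etc/portage/env" || return 1

    # Settings layered on top of make.conf for the packages listed below.
    cat > "$root_dir/etc/portage/env/debugsyms.conf" <<'EOF'
CFLAGS="${CFLAGS} -ggdb"
CXXFLAGS="${CXXFLAGS} -ggdb"
FEATURES="${FEATURES} splitdebug"
EOF

    # Assumes package.env is a plain file, not a directory.
    cat >> "$root_dir/etc/portage/package.env" <<'EOF'
sys-devel/gcc debugsyms.conf
sys-libs/glibc debugsyms.conf
app-emulation/xen debugsyms.conf
app-emulation/xen-tools debugsyms.conf
EOF
}

# Possible use: write_debug_env /tmp/preview, inspect the files, then repeat
# with "/" and re-emerge the listed packages with emerge -1.
```

Keeping the settings in package.env means the symbols survive later rebuilds of the same packages, which a one-off environment override on the command line would not.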

xl dmesg confirmed that xen-4.3.2 was running. I then tested starting an HVM with PCI passthrough, only to learn that it still segfaults.

So on to debugging (please note that on 30 Oct my regular system updates brought in a new version of gdb, which is now 7.7.1 as opposed to my first post where gdb's version was 7.6.2; I hope that's not another issue to consider):
Code:
vm-host auto [508] # gdb --args xl create pfsense -c
GNU gdb (Gentoo 7.7.1 p1) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from xl...Reading symbols from /usr/lib64/debug//usr/sbin/xl.debug...done.
done.
(gdb) run
Starting program: /usr/sbin/xl create -c pfsense
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Parsing config from pfsense
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->00000000001a0a28
  Modules:       0000000000000000->0000000000000000
  TOTAL:         0000000000000000->000000001f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000000fb
  1GB PAGES: 0x0000000000000000
[New Thread 0x7ffff7ff5700 (LWP 2693)]
[New Thread 0x7ffff7fe6700 (LWP 2803)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fe6700 (LWP 2803)]
0x00007ffff5895624 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
(gdb) bt
#0  0x00007ffff5895624 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#1  0x00007ffff589608c in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#2  0x00007ffff5896405 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#3  0x00007ffff5897086 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#4  0x00007ffff589740c in _Unwind_ForcedUnwind () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1
#5  0x00007ffff7324773 in __GI___pthread_unwind (buf=<optimized out>) at unwind.c:129
#6  0x00007ffff731bb89 in __do_cancel () at ../nptl/pthreadP.h:280
#7  sigcancel_handler (sig=<optimized out>, si=<optimized out>, ctx=<optimized out>) at nptl-init.c:214
#8  <signal handler called>
#9  0x00007ffff7324e8d in read () at ../sysdeps/unix/syscall-template.S:81
#10 0x00007ffff6b274e7 in read (__nbytes=16, __buf=0x7fffe80008d0, __fd=14) at /usr/include/bits/unistd.h:44
#11 read_all (fd=14, data=0x7fffe80008d0, data@entry=0x20, len=len@entry=16, nonblocking=nonblocking@entry=0) at xs.c:374
#12 0x00007ffff6b27616 in read_message (h=h@entry=0x555555783640, nonblocking=nonblocking@entry=0) at xs.c:1139
#13 0x00007ffff6b27ffe in read_thread (arg=0x555555783640) at xs.c:1211
#14 0x00007ffff731d36d in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:309
#15 0x00007ffff7055e0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)


So in a nutshell back to square one despite switching back to xen-4.3.2 and gcc-4.7.3.

What, however, caught my eye in the bt is that in the backtrace (#0 to #4) there's still a reference to gcc-4.8.3, which I find bizarre given that I have switched the compiler with gcc-config and also re-compiled all relevant parts (including glibc and the xen stuff) with the old (newly emerged) gcc-4.7.3. In my view there's still some inconsistency somewhere in the system but I am confident that Hu (or somebody else) is able to explain and rectify that.

Many thanks in advance Atom2


Last edited by Atom2 on Sat Nov 08, 2014 10:46 pm; edited 1 time in total

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sat Nov 08, 2014 10:43 pm

cyberbat wrote:
Do you have a hardened system?

I've filed a bug at bgo. Try to collect the same info as I did and post it there.
Yes I do have a hardened system as well (but I do not and so far have never used any of PAX, grSecurity, or SELinux). My current kernel is 3.15.10-r1, but I have also tested with my previous two kernels 3.15.8 and 3.15.5-r2 which I have kept and both of which now show the same symptoms. Everything has worked with all three kernels before switching to xen-4.3.3/gcc-4.8.3.

I am pretty sure that Hu is around in due course and able to move things forward in this thread. He's very capable and knows a lot. In the meantime I'll have a look at your bug report.

Thanks Atom2

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sat Nov 08, 2014 11:09 pm

Atom2 wrote:
In the meantime I'll have a look at your bug report.
I have now added a comment to the bug report specifically adding that the issue is also present with xen-4.3.3-r1 (the latest stable version) and still happens after reverting back to gcc-4.7.3. I also mentioned the HVM / PCI-passthrough issues.

Atom2

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21635

Posted: Sun Nov 09, 2014 2:46 am

The continued use of libgcc_s.so from gcc-4.8 is interesting. It shows that the problem is related to using that shared object, rather than related to building the Xen code with any particular gcc version. It also suggests, as you speculated, that there is something wrong with your environment. What is the output of env as run from the shell that ran the gdb you showed?

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Nov 09, 2014 8:38 am

Hu wrote:
What is the output of env as run from the shell that ran the gdb you showed?
Code:
vm-host auto [513] # env
MANPATH=/usr/local/share/man:/usr/share/man:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.7.3/man:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.24/man
SSH_AGENT_PID=2403
SHELL=/bin/bash
TERM=screen.linux
SSH_CLIENT=192.168.1.69 52530 22
SSH_TTY=/dev/pts/0
USER=root
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.pdf=00;32:*.ps=00;32:*.txt=00;32:*.patch=00;32:*.diff=00;32:*.log=00;32:*.tex=00;32:*.doc=00;32:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
MULTIOSDIRS=../lib64:../lib32
SSH_AUTH_SOCK=/tmp/ssh-hhsxZcephGcb/agent.2402
TERMCAP=SC|screen.linux|VT 100/ANSI X3.64 virtual terminal:\
        :DO=\E[%dB:LE=\E[%dD:RI=\E[%dC:UP=\E[%dA:bs:bt=\E[Z:\
        :cd=\E[J:ce=\E[K:cl=\E[H\E[J:cm=\E[%i%d;%dH:ct=\E[3g:\
        :do=^J:nd=\E[C:pt:rc=\E8:rs=\Ec:sc=\E7:st=\EH:up=\EM:\
        :le=^H:bl=^G:cr=^M:it#8:ho=\E[H:nw=\EE:ta=^I:is=\E)0:\
        :li#43:co#132:am:xn:xv:LP:sr=\EM:al=\E[L:AL=\E[%dL:\
        :cs=\E[%i%d;%dr:dl=\E[M:DL=\E[%dM:dc=\E[P:DC=\E[%dP:\
        :im=\E[4h:ei=\E[4l:mi:IC=\E[%d@:ks=\E[?1h\E=:\
        :ke=\E[?1l\E>:vi=\E[?25l:ve=\E[34h\E[?25h:vs=\E[34l:\
        :ti=\E[?1049h:te=\E[?1049l:us=\E[4m:ue=\E[24m:so=\E[3m:\
        :se=\E[23m:mb=\E[5m:md=\E[1m:mh=\E[2m:mr=\E[7m:\
        :me=\E[m:ms:\
        :Co#8:pa#64:AF=\E[3%dm:AB=\E[4%dm:op=\E[39;49m:AX:\
        :vb=\Eg:as=\E(0:ae=\E(B:\
        :ac=\140\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00:\
        :k0=\E[10~:k1=\EOP:k2=\EOQ:k3=\EOR:k4=\EOS:k5=\E[15~:\
        :k6=\E[17~:k7=\E[18~:k8=\E[19~:k9=\E[20~:k;=\E[21~:\
        :F1=\E[23~:F2=\E[24~:F3=\E[25~:F4=\E[26~:F5=\E[28~:\
        :F6=\E[29~:F7=\E[31~:F8=\E[32~:F9=\E[33~:FA=\E[34~:kb=:\
        :K2=\E[G:kB=\E[Z:kh=\E[1~:@1=\E[1~:kH=\E[4~:@7=\E[4~:\
        :kN=\E[6~:kP=\E[5~:kI=\E[2~:kD=\E[3~:ku=\EOA:kd=\EOB:\
        :kr=\EOC:kl=\EOD:
PAGER=/usr/bin/less
CONFIG_PROTECT_MASK=/etc/gentoo-release /etc/sandbox.d /etc/terminfo /etc/ca-certificates.conf /etc/revdep-rebuild
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.7.3:/root/bin
MAIL=/var/mail/root
STY=2405.main
LC_COLLATE=C
PWD=/etc/xen/auto
EDITOR=/usr/bin/vi
LESSCOLOR=yes
LANG=en_US.UTF-8
HISTIGNORE=pwd:ls:ls -l:ll:man:whereis:env:top:xl top:&
HOME=/root
SHLVL=2
LESS=-R -M --shift 5
LOGNAME=root
GCC_SPECS=
WINDOW=1
SSH_CONNECTION=192.168.1.69 52530 192.168.19.2 22
LESSOPEN=|lesspipe %s
ES_BASHCOMP_DIRS=/usr/share/bash-completion/completions
INFOPATH=/usr/share/info:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.7.3/info:/usr/share/binutils-data/x86_64-pc-linux-gnu/2.24/info
CONFIG_PROTECT=/usr/share/gnupg/qualified.txt
_=/usr/bin/env
OLDPWD=/root
And that was from exactly the same bash session (which is part of a screen session that I just connected to again this morning to execute env) I was running the gdb from. I can't see any reference to 4.8.3 but one to 4.7.3. Still strange to me ...
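
Scanning a long env dump by eye is error-prone. What the question is presumably after is a variable that redirects the dynamic loader, so a small filter can help. A sketch, assuming LD_LIBRARY_PATH, LD_PRELOAD and LD_AUDIT are the variables of interest; the function name loader_overrides is invented for illustration:

```shell
# Sketch: print dynamic-loader override variables found in `env` output
# read from stdin. Exit status is 0 if any were found, 1 if none.
loader_overrides() {
    grep -E '^(LD_LIBRARY_PATH|LD_PRELOAD|LD_AUDIT)='
}

# Possible use:
#   env | loader_overrides || echo "no loader overrides set"
```

None of these variables appear in the dump above, so the filter would print nothing for it.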

And there were not many other commands in between, as the following output shows (NOTE: a few commands, including env, are filtered and not put into the history list - see the value of HISTIGNORE in the environment):
Code:
  508  gdb --args xl create -c pfsense
  509  ll /boot
  510  emerge --info
  511  ll /usr/portage/app-emulation/xen-tools/
  512  df
  513  history


Thanks again Atom2

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Nov 09, 2014 8:46 am

And just for the sake of completeness:
Code:
vm-host auto [517] # find / -name '*libgcc_s.so*' -exec ls -l {} +
-rw-r--r-- 1 root root 375169 Nov  8 12:00 /usr/lib64/debug/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32/libgcc_s.so.1.debug
-rw-r--r-- 1 root root 441076 Nov  8 12:00 /usr/lib64/debug/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libgcc_s.so.1.debug
lrwxrwxrwx 1 root root     13 Nov  8 12:01 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.7.3/32/libgcc_s.so -> libgcc_s.so.1
-rw-r--r-- 1 root root 107756 Nov  8 12:00 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.7.3/32/libgcc_s.so.1
lrwxrwxrwx 1 root root     13 Nov  8 12:01 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.7.3/libgcc_s.so -> libgcc_s.so.1
-rw-r--r-- 1 root root  87880 Nov  8 12:00 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.7.3/libgcc_s.so.1
lrwxrwxrwx 1 root root     13 Nov  6 11:52 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.8.3/32/libgcc_s.so -> libgcc_s.so.1
-rw-r--r-- 1 root root 103580 Nov  6 11:52 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.8.3/32/libgcc_s.so.1
lrwxrwxrwx 1 root root     13 Nov  6 11:52 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so -> libgcc_s.so.1
-rw-r--r-- 1 root root  91872 Nov  6 11:52 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21635

Posted: Sun Nov 09, 2014 4:33 pm

There is no environment override, so this is probably coming from /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf influencing /etc/ld.so.cache. You could try removing the 4.8 entries from the conf file and then rebuilding the ld.so cache, but I cannot guarantee that doing so will not break programs which were built with gcc 4.8. At this point, although it will be more trouble in the short term, I think we should focus on getting Xen to run correctly with the new gcc. To that end, you should rebuild gcc 4.8 with debugging symbols, following the same steps you used when you reinstalled gcc 4.7 with debug symbols.
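
To see which copy of libgcc_s.so.1 the loader cache currently resolves to, the cache can be listed with ldconfig -p and filtered. A minimal sketch, assuming the usual "name (flags) => path" layout of that listing; the function name cached_libgcc is made up here:

```shell
# Sketch: read `ldconfig -p` output on stdin and print the path of every
# cached libgcc_s.so.1 entry, in the order the cache lists them.
cached_libgcc() {
    awk '$1 == "libgcc_s.so.1" && $(NF-1) == "=>" { print $NF }'
}

# Possible use, before and after editing the conf file and rerunning ldconfig:
#   ldconfig -p | cached_libgcc
```

Comparing the output before and after a change to the conf file would show whether the edit had the intended effect, without having to start xl under gdb each time.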

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Nov 09, 2014 7:31 pm

Hi Hu,
many thanks for your continued support.
Hu wrote:
There is no environment override, so this is probably coming from /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf influencing /etc/ld.so.cache.

Just for reference here's the content of my /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf as of my last post:
Code:
vm-host ~ [505] # cat /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/32
/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3
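
The listing above repeats each directory five times, which makes the effective order hard to read. A small sketch that prints each directory once in first-seen order (the function name unique_dirs is invented for illustration):

```shell
# Sketch: print each non-empty line of an ld.so.conf fragment once,
# preserving the order of first appearance. Reads the named files,
# or stdin when no file is given.
unique_dirs() {
    awk 'NF && !seen[$0]++' "$@"
}

# Possible use:
#   unique_dirs /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf
```

For the file shown above this reduces to four lines, with both 4.8.3 directories ahead of both 4.7.3 directories. Whether that ordering is what makes the loader pick the 4.8.3 copy is an assumption in line with Hu's guess, not something the thread establishes.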

I nevertheless decided to follow your second option and rebuild gcc-4.8.3 with symbols. These are the steps I took:
    1.) switching back to gcc-4.8.3 through gcc-config
    2.) env-update && . /etc/profile
    3.) emerge --depclean sys-devel/gcc (resulting in an unmerge of gcc-4.7.3 to be sure no traces are left)
    4.) review /etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf - all references to gcc-4.7.3 were gone
    5.) emerge -1 sys-devel/gcc (with CFLAGS -ggdb and FEATURES splitdebug)
    6.) emerge -1 glibc (with CFLAGS -ggdb and FEATURES splitdebug)
    7.) emerge -1 xen-tools xen xen-pvgrub (I emerged xen-4.3.3 again as the segfault doesn't seem to be confined to 4.3.3-r1; also with CFLAGS -ggdb and FEATURES splitdebug)
    8.) reboot the system - no problems
xl dmesg after the reboot again confirmed that xen-4.3.3 is running.

So on to starting xl under gdb:
Code:
vm-host auto [512] # gdb --args xl create pfsense -c
GNU gdb (Gentoo 7.7.1 p1) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from xl...Reading symbols from /usr/lib64/debug//usr/sbin/xl.debug...done.
done.
(gdb) run
Starting program: /usr/sbin/xl create pfsense -c
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Parsing config from pfsense
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->00000000001c10c4
  Modules:       0000000000000000->0000000000000000
  TOTAL:         0000000000000000->000000001f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000000fb
  1GB PAGES: 0x0000000000000000
[New Thread 0x7ffff7ff5700 (LWP 2489)]
[New Thread 0x7ffff7fe6700 (LWP 2601)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fe6700 (LWP 2601)]
0x00007ffff5892624 in execute_stack_op (op_ptr=0x7ffff7329b83 "w\240\001\006\020\b\002w(\020\t\002w0\020\n\002w8\020\v\003w\300",
    op_end=0x7ffff7329b87 "\020\b\002w(\020\t\002w0\020\n\002w8\020\v\003w\300", context=context@entry=0x7ffff7fe5190,
    initial=initial@entry=0) at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c:516
516     /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c: No such file or directory.
(gdb) bt
#0  0x00007ffff5892624 in execute_stack_op (
    op_ptr=0x7ffff7329b83 "w\240\001\006\020\b\002w(\020\t\002w0\020\n\002w8\020\v\003w\300",
    op_end=0x7ffff7329b87 "\020\b\002w(\020\t\002w0\020\n\002w8\020\v\003w\300", context=context@entry=0x7ffff7fe5190,
    initial=initial@entry=0) at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c:516
#1  0x00007ffff589308c in uw_update_context_1 (context=context@entry=0x7ffff7fe55a0, fs=fs@entry=0x7ffff7fe52f0)
    at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c:1424
#2  0x00007ffff5893405 in uw_update_context (context=context@entry=0x7ffff7fe55a0, fs=fs@entry=0x7ffff7fe52f0)
    at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c:1506
#3  0x00007ffff5894086 in uw_advance_context (fs=0x7ffff7fe52f0, context=0x7ffff7fe55a0)
    at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind-dw2.c:1529
#4  _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0x7ffff7fe6d70, context=context@entry=0x7ffff7fe55a0)
    at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind.inc:185
#5  0x00007ffff589440c in _Unwind_ForcedUnwind (exc=0x7ffff7fe6d70, stop=stop@entry=0x7ffff73215e0 <unwind_stop>,
    stop_argument=0x7ffff7fe5d30) at /var/tmp/portage/sys-devel/gcc-4.8.3/work/gcc-4.8.3/libgcc/unwind.inc:207
#6  0x00007ffff7321773 in __GI___pthread_unwind (buf=<optimized out>) at unwind.c:129
#7  0x00007ffff7318b89 in __do_cancel () at ../nptl/pthreadP.h:280
#8  sigcancel_handler (sig=<optimized out>, si=<optimized out>, ctx=<optimized out>) at nptl-init.c:214
#9  <signal handler called>
#10 0x00007ffff7321e8d in read () at ../sysdeps/unix/syscall-template.S:81
#11 0x00007ffff6b247c3 in read (__nbytes=16, __buf=0x7fffe80008d0, __fd=14) at /usr/include/bits/unistd.h:44
#12 read_all (fd=14, data=data@entry=0x7fffe80008d0, len=len@entry=16, nonblocking=nonblocking@entry=0) at xs.c:374
#13 0x00007ffff6b24904 in read_message (h=h@entry=0x555555784280, nonblocking=nonblocking@entry=0) at xs.c:1139
#14 0x00007ffff6b25296 in read_thread (arg=0x555555784280) at xs.c:1211
#15 0x00007ffff731a36d in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:309
#16 0x00007ffff7052e0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)


This time the result looks slightly different and I hope you are able to make more sense out of it. gdb is still running so if you need more information it's easy for me to provide that.

The one thing that caught my eye is the message about a missing file named unwind-dw2.c soon after the SIGSEGV message. Is there another package that I need to emerge with symbols in order to get closer to the problem?

Thanks Atom2

Hu
Moderator
Joined: 06 Mar 2007
Posts: 21635

Posted: Sun Nov 09, 2014 9:21 pm

In this context, that message is irrelevant. The sources exist in /var/tmp/portage during the build and are deleted during the cleaning step, so it is expected that gdb cannot find them. For your actual problem, you have a good backtrace, but you need help from someone who understands the unwinding code. I could try to learn it, but you are probably better off finding an existing expert. You may have success taking this report to the Xen developers or you may need to involve gcc developers, since the actual fault is in the gcc unwinding code.

Atom2
Apprentice
Joined: 01 Aug 2011
Posts: 185

Posted: Sun Nov 09, 2014 11:18 pm

Thanks again Hu,
Hu wrote:
You may have success taking this report to the Xen developers or you may need to involve gcc developers, since the actual fault is in the gcc unwinding code.
I have added the latest backtrace to the bug at the gentoo bugzilla (which has been opened by cyberbat) and also sent it to the xen devel list.

I hope somebody in either of these places is able to move this forward - at the moment my system is not really useable.

In any case again many thanks for your support Atom2

All times are GMT