Segfaults during compilation on AMD Ryzen.


Post by Kresp » Sun Apr 02, 2017 7:23 am

I often encounter segfaults during emerge builds of heavy packages like curl, ghc, llvm:

Code: Select all

Apr  2 16:55:42 wagner kernel: [ 2188.416231] sh[5627]: segfault at 34 ip 0000000000406215 sp 00007ffdadd984c8 error 6 in bash[400000+a8000]
Apr  2 16:57:08 wagner kernel: [ 2273.706264] sh[15390]: segfault at e ip 0000000000406215 sp 00007ffc1c9a0d78 error 6 in bash[400000+a8000]
Apr  2 17:00:16 wagner kernel: [ 2461.767997] sh[19903]: segfault at 8 ip 0000000000406215 sp 00007ffd970f79c8 error 6 in bash[400000+a8000]
Usually just trying again is enough to finish it, though sometimes it takes a few retries.

I ran memtest on all 4 RAM sticks just a few days ago, before installing Gentoo, so memory should be fine. The CPU is not overclocked and the RAM runs at 2133, nothing too crazy.
I don't really stress the system yet, and the CPU heatsink is always cold. I cannot monitor temperatures yet, since Linux does not yet support the new AMD R7 sensors, but according to the BIOS the CPU fans never go much above 300 RPM, with an idle temperature of about 44 C.

Gcc is 4.9.4, kernel - 4.10.6. CPU - AMD R7 1800X, motherboard - MSI X370 Titanium, BIOS 1.30 stable.

Any tips on where I should start looking?
Think I'm going to try running sysbench for some time and see if it crashes for starters, but I don't think this is hardware related.

Cpu flags set, from cpuinfo2cpuflags-x86:

Code: Select all

CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3"
Last edited by Kresp on Mon Apr 03, 2017 4:00 am, edited 1 time in total.

Post by Logicien » Sun Apr 02, 2017 7:36 am

How do you optimise Gcc compile time in /etc/portage/make.conf?
Paul

Post by Kresp » Sun Apr 02, 2017 7:40 am

Logicien wrote:How do you optimise Gcc compile time in /etc/portage/make.conf?
I'm not sure what you mean. Are you talking about MAKEOPTS?
I'll just post the full make.conf for completeness:

Code: Select all

# These settings were set by the catalyst build script that automatically
# built this stage.
# Please consult /usr/share/portage/config/make.conf.example for a more
# detailed example.
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j16"
GRUB_PLATFORMS="efi-64"
VIDEO_CARDS="nouveau"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult http://www.gentoo.org/doc/en/change-chost.xml before changing.
CHOST="x86_64-pc-linux-gnu"
# These are the USE and USE_EXPAND flags that were used for
# buidling in addition to what is provided by the profile.
USE="X aac alsa asm bash-completion cli crypt cups emacs encode exif fbcon ffmpeg flac fontconfig gif git gnome-keyring gtk idn -ieee1394 imap ipv6 -java javascript jit jpeg lame lm_sensors lzma  mad matroska mime mng modules mozilla mp3 mp4 mpeg multilib ncurses offensive ogg opengl openmp pdf png policykit posix -pulseaudio python quicktime raw rdp readline rss samba scanner smp sockets socks5 sound sqlite ssl svg -systemd theora threads truetype unicode usb vdpau vorbis wav wavpack x264 xattr xml xvid zlib"
CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3"
PORTDIR="/usr/portage"
DISTDIR="${PORTDIR}/distfiles"
PKGDIR="${PORTDIR}/packages"

Post by Logicien » Sun Apr 02, 2017 11:09 am

Kresp wrote:#CFLAGS="-march=native -O2 -pipe"
#CXXFLAGS="${CFLAGS}"
#MAKEOPTS="-j16"
I would revert those variables to their defaults by commenting them out. I am not an expert on gcc and make, but -j16 is very excessive. It is known that overly aggressive optimisations can lead to errors during compilation.
Paul

Post by NeddySeagoon » Sun Apr 02, 2017 11:31 am

Kresp,

Segfaults during build normally indicate a hardware issue.

However, with such an old gcc on such new hardware, I would not rule out other things.
gcc-4.9.4 does not understand -march=native for Ryzen. That needs gcc-6.3 which is still hard masked in Gentoo.
I'm not suggesting that you upgrade to that. The update is not trivial and gcc-6.3 does not work for everything yet.

That the problem is intermittent points to hardware.

Is your BIOS the latest available version?
There has been a rash of BIOS updates since Ryzen was released.
Update your BIOS as a first step, if there is a newer one for your motherboard.
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Post by Kresp » Sun Apr 02, 2017 12:51 pm

NeddySeagoon wrote: Is your BIOS the latest available version?
It is the latest stable, 1.30. There was a 1.41 beta, but it was pulled due to some issues: apparently applying it bricked some people's motherboards.

I ran sysbench for cpu with 16 threads for three hours, everything was fine. Heatsink got warm, but barely - about 35-37 C I'd say, so probably not an overheating issue.



I upgraded to gcc 5.4.0-r3. revdep-rebuild ended up rebuilding 60 packages, including a few heavy ones like thunderbird. I also installed emacs, gdb and openmw with dependencies, with not a single segfault so far.
I'll continue watching emerges closely for the next few days, but this upgrade to gcc5 seems to have fixed the problem.

I've read in the Ryzen thread that gcc6 is desirable with this CPU, but decided against using it, since there is still a slew of open tickets for it on the bug tracker.

Post by NeddySeagoon » Sun Apr 02, 2017 2:20 pm

Kresp,

The rebuilds between gcc-4.x and gcc-5.x are due to the C++ ABI change.
It's not required for every major gcc version update.
gcc-5.x to gcc-6.x is painless except for packages that have problems with gcc-6.

You are quite right to take things slowly.
Regards,

NeddySeagoon


Post by limn » Sun Apr 02, 2017 3:59 pm

Those are not compiler faults, at least not directly. It's your shell the kernel is complaining about.
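
To see this at a glance, the kernel lines can be tallied by the binary named after "in". A minimal sketch, assuming the log format shown in the first post; the function name count_segfaults is made up for illustration:

```shell
# Summarise kernel segfault messages by the binary they occurred in.
# Reads log lines on stdin and prints "<count> <binary>" for each binary.
count_segfaults() {
    grep 'segfault at' |
        sed -n 's/.* in \([^[]*\)\[.*/\1/p' |
        sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage (assumes the kernel log is readable via dmesg):
#   dmesg | count_segfaults
```

For the three lines quoted in the first post this names bash each time, not gcc or cc1, and all three share the same instruction pointer.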

Post by c1pherx » Sun Apr 02, 2017 4:50 pm

I'm running a Ryzen 1700X with the ASRock Taichi and 32GB of RAM. I've tried GCC 4.9.4, 5.4.0, and 6.3.0 and I've seen some sporadic segmentation faults on all three. Also an up-to-date BIOS, Memtest returns no issues, and stress doesn't seem to cause issues. So far I haven't been able to isolate exactly what is causing it, but I don't think it's hardware. Some things I do know:

gcc-5.4.0 with -march=native is -march=bdver4.
gcc-6.3.0 with -march=native is -march=znver1

I know there are some significant differences between Bulldozer and Zen, but 6.3.0 also produces the occasional segfault despite the newer -march. I'm currently re-emerging all of world with 6.3.0 to see if that helps matters.
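
For anyone wanting to check this on their own machine, gcc can report what -march=native resolves to. A minimal sketch; the helper name resolved_march is hypothetical, and it assumes the usual "-Q --help=target" output, where the value follows "-march=" on the same line:

```shell
# Print the value gcc reports for -march= in its "-Q --help=target" output.
# Reads that output on stdin, so it works with whichever gcc produced it.
resolved_march() {
    awk '$1 == "-march=" { print $2; exit }'
}

# Usage (requires gcc to be installed):
#   gcc -march=native -Q --help=target | resolved_march
```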
Last edited by c1pherx on Sun Apr 02, 2017 9:37 pm, edited 1 time in total.

Post by toralf » Sun Apr 02, 2017 4:53 pm

Kresp wrote:I often encounter segfaults during emerge builds of heavy packages like curl, ghc, llvm:

Code: Select all

Apr  2 16:55:42 wagner kernel: [ 2188.416231] sh[5627]: segfault at 34 ip 0000000000406215 sp 00007ffdadd984c8 error 6 in bash[400000+a8000]
Apr  2 16:57:08 wagner kernel: [ 2273.706264] sh[15390]: segfault at e ip 0000000000406215 sp 00007ffc1c9a0d78 error 6 in bash[400000+a8000]
Apr  2 17:00:16 wagner kernel: [ 2461.767997] sh[19903]: segfault at 8 ip 0000000000406215 sp 00007ffd970f79c8 error 6 in bash[400000+a8000]
Usually just trying again is enough to finish it, even though sometimes it takes few retries.
I'd change the parallel make jobs from -j16 to -j8 and check whether the compile issues go away or not.

Post by Jaglover » Sun Apr 02, 2017 5:16 pm

Memtest results are not conclusive: it can run for days without error and the memory can still be faulty. Only if it tells you the RAM is bad can you believe it.

Post by NeddySeagoon » Sun Apr 02, 2017 5:35 pm

Jaglover,

It's worse than that.

Only if memtest tells you the RAM is bad at the same address on several cycles is there a good chance it's the RAM.
It can also be the memory controller (in the processor on Ryzen) or the local voltage regulator (on the motherboard) for the RAM.

Errors at random addresses are probably not RAM.
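
If the failing addresses are written down by hand, the check for repeats is easy to script. A minimal sketch, assuming a hypothetical text file with one failing address per line (memtest does not produce such a file itself); the name repeated_addresses is made up:

```shell
# Print each address that occurs more than once in a list of failing addresses.
# Input: one address per line on stdin (hypothetical hand-made list).
repeated_addresses() {
    sort | uniq -d
}

# Usage:
#   repeated_addresses < failing-addresses.txt
```

Empty output would mean no address failed twice, which by the reasoning above points away from the RAM itself.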
Regards,

NeddySeagoon


Post by limn » Sun Apr 02, 2017 5:38 pm

FMA3 instruction problem?
It is described as intermittent, but also as not affecting Linux.

Post by Kresp » Mon Apr 03, 2017 10:41 am

Well, gcc5 was not a panacea - segfaults still happen.

I removed the fma3 flag to check whether this is related to that CPU bug, but sudo emerge --ask --newuse --update @world did not rebuild anything.


I will now try disabling SMT in UEFI and changing MAKEOPTS to -j8. SMT/HT is a marketing gimmick anyway.

Post by limn » Mon Apr 03, 2017 11:15 am

emerge --newuse --update does not consider changes to CPU_FLAGS_X86.

Pick a package that has failed before and recompile it repeatedly with the CPU flag set until you get at least one failure. Then remove the flag and compile it at least as many times again.

Even then you may not know until you apply the FMA3 fix.
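
The comparison described above can be scripted. A minimal sketch, assuming a POSIX shell; count_failures is a made-up helper, and net-misc/curl is only an example of a package that has failed before:

```shell
# Run a command a fixed number of times and print how many runs failed.
# Usage: count_failures <runs> <command> [args...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard build output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage:
#   count_failures 10 emerge --oneshot net-misc/curl   # with the flag set
#   (remove the flag, then run the same command again and compare)
```

With an intermittent fault, a low count in the second run is weak evidence at best, so the number of runs should be generous.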

Post by NeddySeagoon » Mon Apr 03, 2017 2:23 pm

Kresp, limn,

CPU_FLAGS_X86 applies only to hand optimised code segments where a package advertises that such speedups are available for user selection.
They do not affect the code emitted by gcc.

To stop gcc using FMA3, you need to find the name of the option and add it to CFLAGS.

Code: Select all

 -mno-fma
looks promising.
You cannot tell where gcc has used FMA3, if anywhere, so to be sure it's not used, you need to do

Code: Select all

emerge -e @world --with-bdeps=y
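
Put together, the relevant part of make.conf might look like the sketch below. It reuses the flags from the make.conf posted earlier in the thread and places -mno-fma after -march=native, on the assumption that the later option wins:

```shell
# /etc/portage/make.conf (fragment, not a complete file)
# -mno-fma comes after -march=native so that it is not overridden
# by the FMA support that -march=native would otherwise enable.
CFLAGS="-march=native -mno-fma -O2 -pipe"
CXXFLAGS="${CFLAGS}"
```

The change only affects packages built after it is made, hence the full rebuild.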
Regards,

NeddySeagoon


Post by limn » Mon Apr 03, 2017 4:47 pm

Neddy,

Are you saying that CPU_FLAGS_X86 activates optimizations in the binary at run time?

Post by NeddySeagoon » Mon Apr 03, 2017 5:53 pm

limn,

No. In the source at build time. They are exactly like USE flags, which is what they were at one time.

CPU_FLAGS_X86="mmx" will include sections of optional code in the source that have been hand optimised to make use of the mmx instruction set.
CFLAGS="-mmmx" (is that right?) allows gcc to emit mmx instructions in the course of any build.
Regards,

NeddySeagoon


Post by liewyec » Mon Apr 03, 2017 6:08 pm

I have the same problem. I have a Ryzen 1800X on an ASUS Prime X370 Pro, with only two sticks of RAM; memtest also shows no errors.

Sometimes I can compile all of chromium with no errors, and sometimes it crashes multiple times in a row compiling just a few packages, even when it is not running 16 threads. It is really strange. I tried disabling optimization, but it didn't help.

In the end I wrote a script that restarts emerge a few times if it fails.
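
The script itself was not posted. A minimal sketch of the idea, assuming a POSIX shell; the retry helper and the limit of 5 are made up for illustration, and retrying only works around the crashes rather than fixing them:

```shell
# Run a command until it succeeds, giving up after a maximum number of attempts.
# Usage: retry <max_attempts> <command> [args...]
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage:
#   retry 5 emerge --update --newuse @world
```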

Post by c1pherx » Tue Apr 04, 2017 5:52 pm

Just a quick update.

I switched to GCC-6.3.0 and the segfaults continued.

Then I switched from 4x8GB of CMK16GX4M2B3000C15 to 4x8GB of CMK16GX4M2B3200C16. The segfaults became slightly less frequent (but this may just have been luck). Then I dropped down to 2x8GB of CMK16GX4M2B3200C16, OC'd it to 3200MHz 16-15-15-15-36 @ 1.35V, and I haven't seen a segfault since. I've compiled GHC, Chromium, and Firefox multiple times each.

I have some Ripjaws V arriving tomorrow. Going to see if I can get 32GB stable.

Post by Keepco » Tue Apr 04, 2017 6:43 pm

Seems like I'm plagued by this as well. My Specs:

Ryzen 7 1700
MSI X370 XPower Titanium
16GB of Corsair Dominator RAM (CMD16GX4M2B3000C15), currently running @ 2133MHz

Just a few "big" packages fail for me, namely:

chromium webkit-gtk electron libreoffice

Is there any other way to get rid of this other than trying different RAM sticks?

Post by c1pherx » Tue Apr 04, 2017 7:50 pm

Yea. I spoke too soon. I've reduced the frequency of it happening, but it is still happening. On to the next ideas.

One pattern I'm noticing is that now it seems to be happening with builds that use libtool. This may just be a correlation, but my most recent failures were gnutls (first time that's happened) and libseccomp (first time here too). Both use Libtool.

Post by Keepco » Tue Apr 04, 2017 8:40 pm

What irritates me is that my chromium build always segfaults with at least the last 2 lines being exactly the same (can't recall the other ones, should've written those down somewhere..)

Code: Select all

In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/string:52:0,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/stdexcept:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/array:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/tuple:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/stl_map.h:63,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/map:61,
                 from ../../ppapi/shared_impl/tracked_callback.h:10,
                 from ../../ppapi/thunk/ppb_output_protection_private_thunk.cc:13:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/basic_string.h:1316:59: internal compiler error: Segmentation fault
       insert(const_iterator __p, size_type __n, _CharT __c)
                                                                                              ^

Even though it seems to be at different places during the compilation (but always during the main part of chromium, not during the sandbox etc.)
I also tried just using one of my RAM sticks (recompiled gcc after swapping them) and increased their voltage to the recommended value (1.35V, was 1.2V by default).

Post by c1pherx » Tue Apr 04, 2017 9:00 pm

Keepco wrote:What irritates me is that my chromium builds always segfaults with at least the last 2 lines being exactly the same (can't recall the other ones, should've written those down somewhere..)

Code: Select all

In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/string:52:0,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/stdexcept:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/array:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/tuple:39,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/stl_map.h:63,
                 from /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/map:61,
                 from ../../ppapi/shared_impl/tracked_callback.h:10,
                 from ../../ppapi/thunk/ppb_output_protection_private_thunk.cc:13:
/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/basic_string.h:1316:59: internal compiler error: Segmentation fault
       insert(const_iterator __p, size_type __n, _CharT __c)
                                                                                              ^

Even though it seems to be at different places during the compilation (but always during the main part of chromium, not during the sandbox etc.)
I also tried just using one of my RAM sticks (recompiled gcc after swapping them) and increased their voltage to the recommended value (1.35V, was 1.2V by default).
Did you go from GCC-4.8 right up to GCC-6.3? If yes, did you remember to re-emerge libtool and run revdep-rebuild --library 'libstdc++\.so\.5'?

Post by Keepco » Tue Apr 04, 2017 9:01 pm

c1pherx wrote:Yea. I spoke too soon. I've reduced the frequency of it happening, but it is still happening. On to the next ideas.

One pattern I'm noticing is that now it seems to be happening with builds that use libtool. This may just be a correlation, but my most recent failures were gnutls (first time that's happened) and libseccomp (first time here too). Both use Libtool.
Can't seem to reproduce the gnutls failure: just tried recompiling it 15 times, and it worked every time. Guess my problem is elsewhere.

EDIT: Just re-emerged GCC without -march=native, and it seems like that did the job.
Last edited by Keepco on Wed Apr 05, 2017 5:35 am, edited 1 time in total.