Playing With Gentoo-lto (and others)


Post by duane » Thu Feb 20, 2020 5:04 am

I'm almost done with recompiling everything on my gentoo-lto partition. I did most of it in chroot from the standard partition, and have just now booted up the new kernel and new partition. I thought I'd write up a few notes from the point of view of someone who didn't know what graphite or lto were last week (and I'm still a bit hazy).

It was surprisingly easy. Once you have layman installed, it just takes a few steps to get everything set up. It did take me a while to figure out that you have to set the graphite and lto USE flags on the gcc package before emerging everything. (Yes, really.)
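
For anyone following along, the USE-flag step might look roughly like this. It is only a sketch: it assumes the compiler package is sys-devel/gcc, that /etc/portage/package.use is a directory, and the file name "gcc" and the helper name write_gcc_use are arbitrary choices, not anything from the overlay's readme.

```shell
# Sketch only: write the per-package USE entry for gcc.
# Assumptions: package.use is a directory; the file name "gcc" is arbitrary.
write_gcc_use() {
    use_dir="${1:-/etc/portage/package.use}"
    mkdir -p "$use_dir" || return 1
    printf '%s\n' 'sys-devel/gcc graphite lto' > "$use_dir/gcc"
}

# As root, on the real system:
#   write_gcc_use
#   emerge --ask --oneshot sys-devel/gcc   # rebuild the compiler first
#   emerge --ask --emptytree @world        # then everything else
```

The order matters because the rest of the system can only use graphite and LTO once the compiler itself has been built with them.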

Out of the nearly 800 packages on my system, only two failed to compile -- stone soup and scummvm. However, I did notice that a number of packages didn't use some or any of the optimized settings. Wine doesn't appear to get much out of the overlay. I still play a few windows games, so that's a bit disappointing.
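
One possible way to see which packages ended up without a given flag is to look at what Portage recorded at build time. This is a sketch, assuming the installed-package database lives under /var/db/pkg and keeps each package's flags in a file called CFLAGS; the function name pkgs_missing_flag is made up for this example.

```shell
# List installed packages whose recorded CFLAGS do not contain a given flag.
# Assumption: <root>/var/db/pkg/<category>/<package-version>/CFLAGS exists
# for packages that were compiled.
pkgs_missing_flag() {
    flag="$1"
    root="${2:-/}"
    prefix="$root/var/db/pkg/"
    for f in "$root"/var/db/pkg/*/*/CFLAGS; do
        [ -f "$f" ] || continue
        # -w matches the flag as a whole word; -e protects the leading dash
        if ! grep -qw -e "$flag" "$f"; then
            pkg="${f%/CFLAGS}"
            echo "${pkg#$prefix}"
        fi
    done
}

# Example: pkgs_missing_flag -O3
```

Packages that install no compiled code have no CFLAGS file and are simply skipped.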

Just as a simple test, I timed building a gzipped tar out of a large directory of text files. I saw a 19% speed increase after booting to the lto partition. I plan to try some actual benchmarks once everything is in place.
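
A test like that can be scripted along these lines. This is a rough sketch, not the exact commands used: the function names are invented, and "speed increase" is assumed to mean baseline time divided by new time, minus one.

```shell
# Time one gzipped-tar run over a directory; prints elapsed whole seconds.
time_tar() {
    out=$(mktemp) || return 1
    start=$(date +%s)
    tar -czf "$out" -C "$1" .
    end=$(date +%s)
    rm -f "$out"
    echo $((end - start))
}

# Speed increase of a new run over a baseline run, in percent.
# Assumption: "speed increase" means baseline time / new time - 1.
speedup_percent() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base / new - 1) * 100 }'
}

# Hypothetical numbers, for illustration only:
#   speedup_percent 119 100
```

Running the timing a few times on each partition and discarding the first run helps keep disk caching out of the comparison.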

Edit: I should also specify that I copied the lto partition from my regular gentoo installation, so both are running exactly the same specifications on the same hardware, with only the optimizations different. They both use the same /home. My regular partition was built with only "-march=native -O2 -pipe" in CFLAGS. The hardware is a standard Motile m142 with 8GB of RAM and a ryzen 5 3500U.
Last edited by duane on Sun Mar 01, 2020 1:33 am, edited 2 times in total.

Post by C1REX » Thu Feb 20, 2020 6:18 am

Nice one :)

I'm looking forward to hearing some test results from you.
A 19% speed increase is massive! To get such an increase just by improving hardware would cost like what? 50% more?
CLICK HERE to help move gentoo up on distrowatch.

If you like Gentoo you can thank devs here - https://www.gentoo.org/donate/

Post by Ionen » Thu Feb 20, 2020 8:01 am

duane wrote:However, I did notice that a number of packages didn't use some or any of the optimized settings. Wine doesn't appear to get much out of the overlay. I still play a few windows games, so that's a bit disappointing.
Wine breaks easily with optimizations, in obscure and hard-to-diagnose ways at run time, so Gentoo's strip-flags() still makes sense (in this case, Gentoo does normally provide a custom-cflags USE flag to override it). You may be better off looking at optional patches/features if you want a performance boost on Wine. I notably added fsync to my Gentoo's wine-staging :)

Generally speaking I find it's entirely fine to override flag-o-matic though, or at least if not going too over the top with standards-breaking flags and such. append-flags() is better left alone though (still not sure why GentooLTO tries to stop append-flags, those are often critical for things to work right and now they're setting exceptions instead).

For glibc I'd at least run the test suite if doing anything strange with it though. Not that gcc optimizations mean much on glibc given all the performance-critical functions are in hand-crafted ASM.

Post by erm67 » Thu Feb 20, 2020 8:02 am

Be prepared to recover, just in case.

What did you enable? Full -O3 or a subset? And what about the other flags?

Did you already emerge -e world with gentoo-lto?
Ok boomer
True ignorance is not the absence of knowledge, but the refusal to acquire it.
Ab esse ad posse valet, a posse ad esse non valet consequentia

My fediverse account: @erm67@erm67.dynu.net

Post by duane » Thu Feb 20, 2020 3:29 pm

C1REX wrote:I'm looking forward to hearing some test results from you.
A 19% speed increase is massive! To get such an increase just by improving hardware would cost like what? 50% more?
Of course, you're the one who gave me the idea. It's definitely more pickup than I expected, but not enough to justify any serious problems, if there are any. : )

I'll have to start looking at benchmarks today. I doubt I'll use the phoronix set. I seem to remember that it requires adding a lot of software that I don't feel like dealing with.
erm67 wrote:Be prepared to recover, just in case.

What did you enable? Full -O3 or a subset? And what about the other flags?

Did you already emerge -e world with gentoo-lto?
No problems. All my important data is backed up. The complete "emerge -e" just finished this morning. So far, everything is running very well. The system doesn't feel any faster, but I didn't really expect it to. Software emerges noticeably slower, but I haven't timed it yet -- that isn't a major issue for me anyway.

I did nothing but the instructions in the gentoo-lto readme (and the layman page in the wiki). I just want to see what a newbie can do with the simplest setup. Here's my "emerge --info", in case anyone is curious.

Code:

Portage 2.3.84 (python 3.6.9-final-0, default/linux/amd64/17.1, gcc-9.2.0, glibc-2.29-r7, 5.5.4-gentoo x86_64)
=================================================================
System uname: Linux-5.5.4-gentoo-x86_64-AMD_Ryzen_5_3500U_with_Radeon_Vega_Mobile_Gfx-with-gentoo-2.6
KiB Mem:     6081204 total,   3431804 free
KiB Swap:    4194300 total,   4121852 free
Timestamp of repository gentoo: Sun, 16 Feb 2020 05:00:01 +0000
Head commit of repository gentoo: 8e60f798cbdd167ba63306b1a00bc20a5ffc7fe0
sh bash 4.4_p23-r1
ld GNU ld (Gentoo 2.32 p2) 2.32.0
ccache version 3.7.7 [disabled]
app-shells/bash:          4.4_p23-r1::gentoo
dev-java/java-config:     2.2.0-r4::gentoo
dev-lang/perl:            5.30.1::gentoo
dev-lang/python:          2.7.17::gentoo, 3.6.9::mv, 3.7.5-r1::mv
dev-util/ccache:          3.7.7::gentoo
dev-util/cmake:           3.14.6::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.42.1::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r4::gentoo
sys-devel/automake:       1.16.1-r1::gentoo
sys-devel/binutils:       2.32-r1::gentoo
sys-devel/gcc:            9.2.0-r2::gentoo
sys-devel/gcc-config:     2.2::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.19::gentoo (virtual/os-headers)
sys-libs/glibc:           2.29-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: 
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24

duane
    location: /var/db/repos/duane
    masters: gentoo

lto-overlay
    location: /var/lib/layman/lto-overlay
    sync-type: laymansync
    sync-uri: https://github.com/InBetweenNames/gentooLTO.git
    masters: gentoo mv
    priority: 50

mv
    location: /var/lib/layman/mv
    sync-type: laymansync
    sync-uri: https://anongit.gentoo.org/git/user/mv.git
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://www.gtlib.gatech.edu/pub/gentoo http://gentoo.cs.utah.edu/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X acl acpi alsa amd64 berkdb bzip2 cli crypt cxx d3d9 dri fortran gdbm gif graphite iconv ipv6 jpeg libtirpc lto lua mp3 multilib ncurses nls nptl ogg openmp pam pcre png readline seccomp split-usr ssl svg tcpd tiff truetype unicode vaapi vdpau vorbis xattr zlib" ABI_X86="64 32" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="amdgpu radeonsi virgl" XTABLES_ADDONS="quota2 psd pknock lscan 
length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

Post by erm67 » Thu Feb 20, 2020 4:04 pm

CFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"


Unfortunately, a lot of stuff crashes with all that enabled. But once you (or the gentoo-lto team) add a workaround to the database that disables the optimization causing the failure, and you recompile the program without it, it is just as stable as the "slower" programs. The worst side effect is that it takes more time to emerge. If everybody reports that program X doesn't work with CFLAG Y, it will be added to the list of workarounds, and it will then compile cleanly with the maximum possible optimization for everybody:
https://github.com/InBetweenNames/gento ... ounds.conf

In practice, if you are lazy and don't sync too often, it is very likely that someone has already reported the failures. On the other hand, if you want to lend a hand, report failures :-)
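
For comparison, plain Portage can also give a single package its own, tamer flags through package.env. The sketch below is not the overlay's workaround format (which is not reproduced here). It assumes /etc/portage/package.env is a file rather than a directory, and the function name, the file name conservative.conf and the example atom are all made up for illustration.

```shell
# Sketch: give one package its own, more conservative flags in plain Portage.
# Assumptions: package.env is a file; "conservative.conf" is an arbitrary name;
# the flag set is the baseline quoted earlier in the thread.
add_conservative_override() {
    atom="$1"
    conf_root="${2:-/etc/portage}"
    mkdir -p "$conf_root/env" || return 1
    cat > "$conf_root/env/conservative.conf" <<'EOF'
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
EOF
    # package.env maps an atom to one or more files under env/
    echo "$atom conservative.conf" >> "$conf_root/package.env"
}

# Example (hypothetical atom): add_conservative_override games-roguelike/stone-soup
```

After adding an entry, the package has to be re-emerged for the new flags to take effect.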

I have no side effects and everything works fine; I just pay more attention after emerging programs, in particular if there is a new major release.

In theory, if this became official, it could really be a competitor for ClearLinux. Just saying "I use Gentoo", adding -march=native and pretending your system is the fastest in the world is naive at best.

Post by C1REX » Thu Feb 20, 2020 5:20 pm

So what's the difference between -march and -mtune?
I know -mtune does about the same, but the compiled package still works on older CPUs.
So at what cost? There must be a price to pay, right?

Post by erm67 » Thu Feb 20, 2020 5:47 pm

There is no price and no gain.
According to Intel, code that uses all of the CPU's instructions and is optimized for the cache size of the current CPU can be 1%-2% faster. That is not noticeable in practice; whether you use -march or -mtune makes little difference with current x86_64 CPUs.

Another "fake" suggestion is that using clang will produce faster code. The truth is that clang enables at -O2 some optimizations that gcc only enables at -O3, and enabling the same flags in gcc produces code that is just as fast. The sacred rules of the gcc forums only permit the use of -O2, so if you use -O2, using clang instead of gcc will enable more optimizations. I read that gcc is probably going to fix that in gcc 10 by moving some optimizations that are currently at -O3 to the -O2 level. At -O3, clang and gcc produce code that is more or less equally fast; there is very little difference.

Post by erm67 » Thu Feb 20, 2020 6:02 pm

@duane Official Gentoo has a feature similar to the workaround file of gentoo-lto: it forcefully disables some CFLAGS in the ebuild without informing the user, and it cannot be officially disabled. I don't know if the overlay still disables it by default, but I prefer to turn the portage CFLAGS filter off and rely on gentoo-lto and my own workarounds instead. I noticed that portage now also filters lto sometimes ......

Post by NeddySeagoon » Thu Feb 20, 2020 6:19 pm

C1REX,

-march expresses a permitted instruction set. -mtune influences the instruction ordering so that the code is tuned for that CPU.

-march=x86-64 uses instructions that will run on any AMD/Intel 64 bit system.

-march=x86-64 -mtune=znver2 reorders the above instruction stream to best suit AMD's new Ryzen CPUs.
-march=znver2 -mtune=x86-64 does not make sense.
Why would you generate a zen2 instruction stream and then attempt to reorder those instructions to run anywhere, when you will get illegal instruction exceptions?
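
To see what a given combination resolves to on a particular machine, gcc can be asked directly. A small sketch, assuming gcc is installed; the helper name show_target is made up, and znver2 (gcc's name for Zen 2) needs gcc 9 or newer.

```shell
# Print the -march/-mtune values that a given set of options resolves to.
# Assumption: the grep pattern matches the layout of "gcc -Q --help=target".
show_target() {
    gcc "$@" -Q --help=target 2>/dev/null | grep -E '^[[:space:]]*-m(arch|tune)='
}

if command -v gcc >/dev/null 2>&1; then
    # What "native" means on this machine
    show_target -march=native || echo "not supported by this gcc"
    # Baseline instruction set, Zen 2 scheduling
    show_target -march=x86-64 -mtune=znver2 || echo "not supported by this gcc"
fi
```

The first call is also a quick way to check what -march=native actually picked before rebuilding a whole system with it.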
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Post by C1REX » Thu Feb 20, 2020 6:26 pm

NeddySeagoon

I ask this question because apparently Clear Linux uses -mtune instead of -march.


This is the set of flags for Clear Linux, if the info is correct:

Code:

CFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,--copy-dt-needed-entries -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags -mtune=skylake"

Post by CaptainBlood » Thu Feb 20, 2020 8:16 pm

C1REX, could you please provide your source(s)?
Thks 4 ur attention, interest & support.

Post by duane » Thu Feb 20, 2020 9:32 pm

I fired up one of my graphical windows games in wine on both partitions, and not surprisingly, the difference in frame rates was negligible.

I also ran my old standby benchmark, glmark2, on both. It's easy to run on anything, since it's written in python, but it only really tests opengl 2. There was no significant difference in the scores. I'm guessing everything video-driver-related has already been tweaked.

Then I tried transcoding some video with handbrake. Again, no surprise, the time difference was not significant. x264 has probably been optimized, and it's doing most of the work.

I ran a few linux games, but none of them have frame rate displays, so I can't draw any conclusions.

I have yet to see anything crash, but it seems that (for me) the gentoo-lto overlay doesn't do much good. Everything I run that needs speed has already got about as much as it can get. Given that, I wonder if I'd get any use out of clearlinux. The review makes it sound painful to work with, and I kind of doubt it would run any faster in any useful way.

I did discover something annoying about firefox. Apparently it tracks its version by the compile date, because trying to use the same version, compiled at an earlier date, causes it to refuse to load your "new" profile. So, as Neddy said, it's easier to use two different user ids for testing.

Post by C1REX » Thu Feb 20, 2020 9:53 pm

CaptainBlood

I should be able to install clear linux soon to confirm.
I'm on my mobile now so not sure where exactly I googled these flags.

Post by duane » Thu Feb 20, 2020 10:41 pm

I don't know how useful libc-bench is, but it's interesting. Each figure is the regular partition's time divided by the lto partition's time, expressed as the difference from one, so a positive value indicates a faster result for the gentoo-lto partition.

Code:

b_malloc_tiny2 (0)	2%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac")	-1%
b_utf8_onebyone (0)	-7%
b_pthread_uselesslock (0)	-3%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaac")	-2%
b_pthread_createjoin_serial1 (0)	-5%
b_malloc_big1 (0)	1%
b_string_strstr ("aaaaaaaaaaaaaacccccccccccc")	-2%
b_utf8_bigbuf (0)	3%
b_malloc_big2 (0)	1%
b_malloc_bubble (0)	1%
b_pthread_create_serial1 (0)	10%
b_stdio_putcgetc_unlocked (0)	-1%
b_regex_search ("(a|b|c)*d*b")	30%
b_regex_compile ("(a|b|c)*d*b")	-3%
b_string_strlen (0)	-2%
b_stdio_putcgetc (0)	4%
b_pthread_createjoin_serial2 (0)	2%
b_string_strstr ("azbycxdwevfugthsirjqkplomn")	-2%
b_malloc_thread_local (0)	0%
b_malloc_sparse (0)	1%
b_regex_search ("a{25}b")	20%
b_malloc_thread_stress (0)	-1%
b_string_strchr (0)	-1%
b_malloc_tiny1 (0)	1%
b_string_strstr ("abcdefghijklmnopqrstuvwxyz")	-2%
b_string_memset (0)	-1%
Edit: put code tags in
Last edited by duane on Sat Feb 22, 2020 8:41 pm, edited 1 time in total.
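
A table like the one above could be produced with something along these lines. This is a sketch under assumptions: the two-column "name, tab, seconds" input layout is hypothetical (real libc-bench output would need to be reduced to it first), and compare_times is an invented name.

```shell
# Compare two result files with lines of the form "name<TAB>seconds".
# Prints regular time / lto time, minus one, as a percentage per benchmark.
compare_times() {
    awk -F'\t' '
        NR == FNR { regular[$1] = $2; next }   # first file: regular partition
        ($1 in regular) && $2 > 0 {
            printf "%s\t%.0f%%\n", $1, (regular[$1] / $2 - 1) * 100
        }
    ' "$1" "$2"
}

# Usage: compare_times regular.tsv lto.tsv
```

With this formula a positive number means the second file (the lto run) was faster.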

Post by duane » Thu Feb 20, 2020 10:46 pm

CaptainBlood wrote:C1REX, could you please provide your source(s)?
Thks 4 ur attention, interest & support.
Here's an article with a lot of the settings on the first page:

https://www.phoronix.com/scan.php?page= ... ptop&num=1

Post by C1REX » Thu Feb 20, 2020 11:42 pm

Hello from Clear Linux :)
Here is the full vanilla profile file:


Code:

# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

PATH="/usr/bin"
if grep -q "flags.*:.* avx512bw" /proc/cpuinfo; then
	PATH="/usr/bin/haswell/avx512_1:/usr/bin/haswell:/usr/bin"
elif grep -q "flags.*:.* fma .* avx2" /proc/cpuinfo; then
	PATH="/usr/bin/haswell:/usr/bin"
fi
PATH="/usr/local/bin:$PATH:/opt/3rd-party/bin"

if [[ -x "/bin/vi" ]]; then 
	EDITOR="/usr/bin/vi"			# needed for packages like cron
else
	EDITOR="/usr/bin/nano"			# needed for packages like cron
fi
test -z "$TERM" && TERM="xterm"	# Basic terminal capab. For screen etc.

# Ensure the interactive terminal has rows and columns
if [ -n "$PS1" ] && tty > /dev/null; then # Only for interactive shells
  tty_rows=$(stty size | cut -d' ' -f1)
  tty_columns=$(stty size | cut -d' ' -f2)
  if [ $((${tty_rows} + 0)) -le 0 -o $((${tty_columns} + 0)) -le 0 ]; then
    [ -x /usr/bin/setterm ] &&  setterm --resize
    # fail safe if size still not set
    tty_rows=$(stty size | cut -d' ' -f1)
    tty_columns=$(stty size | cut -d' ' -f2)
    if [ $((${tty_rows} + 0)) -le 0 -o $((${tty_columns} + 0)) -le 0 ]; then
      stty rows 24 columns 80
    fi
  fi
  unset tty_rows tty_columns
fi

if [ ! -e /etc/localtime ]; then
	TZ="UTC"		# Time Zone. Look at http://theory.uwinnipeg.ca/gnu/glibc/libc_303.html 
				# for an explanation of how to set this to your local timezone.
	export TZ
fi

if [ -z "$PS1" ]; then
# works for bash and ash (no other shells known to be in use here)
   PS1='\u@\h:\w\$ '
fi

CFLAGS="-g -O3 -feliminate-unused-debug-types  -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64  -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects  -fno-trapping-math -Wl,-sort-common -Wl,--enable-new-dtags -mtune=skylake  -Wa,-mbranches-within-32B-boundaries"
FFLAGS="-g -O3 -feliminate-unused-debug-types  -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -malign-data=abi -fno-semantic-interposition -ftree-vectorize  -ftree-loop-vectorize -Wl,--enable-new-dtags  -Wa,-mbranches-within-32B-boundaries "
CFFLAGS="-g -O3 -feliminate-unused-debug-types  -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64  -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -malign-data=abi -fno-semantic-interposition -ftree-vectorize  -ftree-loop-vectorize  -Wl,-sort-common -Wl,--enable-new-dtags "
CXXFLAGS="$CFLAGS -fvisibility-inlines-hidden -Wl,--enable-new-dtags "
export AR=gcc-ar
export RANLIB=gcc-ranlib
export NM=gcc-nm
export LA_VERSION="OpenBLAS"
export LA_LIBS=/usr/lib64/libopenblas.so.0
export LA_INCLUDE=/usr/include
export LA_PATH=/usr/lib64/
export MPI_CC=/usr/bin/mpicc
export MPI_LIBS=/usr/lib64/libmpi.so
export MPI_INCLUDE=/usr/include/
export MPI_PATH=/usr/lib64/
export MPI_VERSION=3.2
export THEANO_FLAGS='floatX=float32,openmp=true,gcc.cxxflags="-ftree-vectorize -mavx"'
export CC=gcc
export CXX=g++
export PYTHONIOENCODING=utf-8:surrogateescape
export MESA_GLSL_CACHE_DISABLE=0
export GTK_IM_MODULE="ibus"

if [ -d /usr/share/defaults/etc/profile.d ]; then
  for i in /usr/share/defaults/etc/profile.d/* ; do
    . $i
  done
  unset i
fi
if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/* ; do
    . $i
  done
  unset i
fi
if [ -e /etc/profile ]; then
    . /etc/profile
fi
for langfile in /usr/share/defaults/etc/locale.conf /etc/locale.conf "$HOME/.i18n" ; do
	[ -f $langfile ] && . $langfile 
done
export LANG LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION

XDG_CONFIG_DIRS=/usr/share/xdg:/etc/xdg
export PATH PS1 EDITOR TERM CFLAGS CXXFLAGS CFFLAGS FFLAGS XDG_CONFIG_DIRS

umask 022
CFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math -Wl,-sort-common -Wl,--enable-new-dtags -mtune=skylake -Wa,-mbranches-within-32B-boundaries"

Post by C1REX » Fri Feb 21, 2020 6:47 am

I've got an additional comment from one of the ClearLinux dev team:
s0f4r wrote:ClearLinux dev here.

This isn't the whole picture.

We compile many packages several times and in subsequent passes enable AVX2 and AVX512 acceleration, which are used on platforms that support it. If you have a platform that is haswell+ or skylake+, you will get additional benefits that are not in the flags you listed above.

Other things that affect compiled code are glibc optimizations and some linker optimizations, which don't appear in the flags above either.
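
The variant selection described here can be illustrated with the two checks from the profile file above. A minimal sketch: the grep patterns are copied from that profile, while the function name pick_variant and the labels it prints are made up for illustration.

```shell
# Decide which build variant a CPU would get, given its "flags" line.
# The two patterns are copied from the profile above; the printed names
# are labels for illustration only.
pick_variant() {
    cpu_flags="$1"
    if printf '%s\n' "$cpu_flags" | grep -q "flags.*:.* avx512bw"; then
        echo "avx512"
    elif printf '%s\n' "$cpu_flags" | grep -q "flags.*:.* fma .* avx2"; then
        echo "haswell"
    else
        echo "generic"
    fi
}

# On a real machine: pick_variant "$(grep -m1 '^flags' /proc/cpuinfo)"
```

In the profile the same checks only change PATH, so the accelerated binaries are found first when they exist.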

Post by C1REX » Fri Feb 21, 2020 6:52 am

Also more info:


KDE plasma specific flags:
https://github.com/clearlinux-pkgs/plas ... .spec#L184



Global flags
https://github.com/clearlinux/clr-rpm-c ... /rpmrc#L13

Post by erm67 » Fri Feb 21, 2020 7:17 am

duane wrote: I have yet to see anything crash, but it seems that (for me) the gentoo-lto overlay doesn't do much good. Everything I run that needs speed has already got about as much as it can get. Given that, I wonder if I'd get any use out of clearlinux. The review makes it sound painful to work with, and I kind of doubt it would run any faster in any useful way.
clearlinux IS painful for some people. Basically, you cannot modify the packages that are installed: you install groups of packages, update, and that is all; you use what comes with it.
For people who do their work on the computer, rather than play at installing or optimizing packages, it is ok.

x264 uses ASM routines, they cannot be optimized by a C compiler.

Did you notice the 30% increase in regexp? :-)

I use the optimizations on a server not on a desktop and that regexp speed increase is very nice :-)

@C1REX Gentoo uses -march=native, which is superior to ClearLinux's CPU cflags; with native you already use more CPU-specific optimizations than ClearLinux. If ClearLinux is faster, the reason is somewhere else: CPU-specific optimizations are not enough, especially if regexps are 30% faster with the other optimizations :-)

In fact that is the problem: despite using superior CPU-specific optimizations, Gentoo IS slower (even if a lot of Gentoo users cannot accept this fact and can only react with arrogance to their failure).

Post by duane » Sat Feb 22, 2020 8:37 pm

Ok, this is just weird, and I'm not sure I believe it myself. I've nearly finished re-emerging everything with no changes from my standard setup except using "-O3", and stopped to run a quick libc-bench, comparing it to the gentoo-lto overlay. Again, a positive value means the lto was faster - a result of -100% would mean that the lto took twice as long.

BAD DATA:

Code:

b_malloc_tiny2 (0)	-85%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac")	-13%
b_utf8_onebyone (0)	-45%
b_pthread_uselesslock (0)	4%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaac")	-11%
b_pthread_createjoin_serial1 (0)	-70%
b_malloc_big1 (0)	-56%
b_string_strstr ("aaaaaaaaaaaaaacccccccccccc")	-12%
b_utf8_bigbuf (0)	0%
b_malloc_big2 (0)	-48%
b_malloc_bubble (0)	-38%
b_pthread_create_serial1 (0)	-33%
b_stdio_putcgetc_unlocked (0)	-2%
b_regex_search ("(a|b|c)*d*b")	1%
b_regex_compile ("(a|b|c)*d*b")	-12%
b_string_strlen (0)	-95%
b_stdio_putcgetc (0)	-42%
b_pthread_createjoin_serial2 (0)	-56%
b_string_strstr ("azbycxdwevfugthsirjqkplomn")	-11%
b_malloc_thread_local (0)	-81%
b_malloc_sparse (0)	-36%
b_regex_search ("a{25}b")	-1%
b_malloc_thread_stress (0)	-43%
b_string_strchr (0)	-85%
b_malloc_tiny1 (0)	-61%
b_string_strstr ("abcdefghijklmnopqrstuvwxyz")	-14%
b_string_memset (0)	-63%
glibc did not use "-O3". It stripped my choice at the start of the emerge. I suppose the lto overlay may have been protecting me from some other really unsafe use of "-O3", though I haven't seen any problems yet...

Edit: Disregard this. I forgot to apply the glibc update to both partitions.
Last edited by duane on Sat Feb 22, 2020 9:34 pm, edited 2 times in total.

Post by C1REX » Sat Feb 22, 2020 8:45 pm

Duane - what are we looking at? Do these numbers suggest you managed to even double the performance in some cases?

Post by duane » Sat Feb 22, 2020 8:56 pm

C1REX wrote:Duane - what are we looking at? Do these numbers suggest you managed to even double the performance in some cases?
I should have said that a result of -100% would mean that gentoo-lto took twice as long to do the same test, and vice versa.
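
For reference, there are at least two ways to turn a pair of run times into such a percentage, and they disagree about what "twice as long" looks like. The sketch below shows both; the function names are made up, and it is not clear from the thread which formula produced the tables.

```shell
# Two ways to turn a pair of run times into a percentage.
# ratio_pct follows the earlier description (regular time / lto time, minus one);
# relative_pct is one formula under which "twice as long" comes out as -100%.
ratio_pct()    { awk -v r="$1" -v l="$2" 'BEGIN { printf "%.0f\n", (r / l - 1) * 100 }'; }
relative_pct() { awk -v r="$1" -v l="$2" 'BEGIN { printf "%.0f\n", (r - l) / r * 100 }'; }

ratio_pct 1 2      # lto takes twice as long -> -50
relative_pct 1 2   # lto takes twice as long -> -100
```

Both agree on the sign, so "positive means lto was faster" holds either way; only the size of the numbers differs.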

Post by C1REX » Sat Feb 22, 2020 9:27 pm

I'm still missing some info.

What exactly do these results show? That your newly re-emerged build is much slower compared to what?

Post by duane » Sat Feb 22, 2020 9:35 pm

Doh! I forgot to apply the recent glibc update to both partitions. That probably explains most of the difference. I'll have to fix that after I finish emerging on this side.
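
A quick check before benchmarking can catch this kind of mismatch. A sketch, assuming the other partition is mounted somewhere such as /mnt/lto (a hypothetical mount point) and that installed packages are recorded under var/db/pkg; both function names are invented.

```shell
# Print the glibc package version(s) recorded in a Gentoo root.
glibc_versions() {
    root="${1:-/}"
    for d in "$root"/var/db/pkg/sys-libs/glibc-[0-9]*; do
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
}

# Succeeds only if both roots record the same glibc version(s).
same_glibc() {
    [ "$(glibc_versions "$1")" = "$(glibc_versions "$2")" ]
}

# Example: same_glibc / /mnt/lto || echo "glibc differs, update before comparing"
```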