duane wrote: However, I did notice that a number of packages didn't use some or any of the optimized settings. Wine doesn't appear to get much out of the overlay. I still play a few Windows games, so that's a bit disappointing.
Wine breaks easily with optimizations, in obscure and hard-to-diagnose ways at run time, so Gentoo's strip-flags still makes sense (in this case, Gentoo does provide a custom-cflags USE flag to override it). You may be better off looking at optional patches/features if you want a performance boost in Wine. I notably added fsync to my Gentoo wine-staging.
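For anyone unfamiliar with what that kind of flag stripping amounts to, here is a minimal shell sketch of allowlist-style filtering. It is only an illustration: `filter_flags` is a made-up name and the allowlist is a hypothetical example, not the list the flag-o-matic eclass actually uses.

```shell
#!/bin/sh
# Sketch of allowlist-style flag filtering, in the spirit of strip-flags.
# ASSUMPTION: the allowlist below is a hypothetical example, not the one
# used by Gentoo's flag-o-matic eclass.
filter_flags() {
    kept=""
    for flag in $1; do
        case "$flag" in
            -O1|-O2|-Os|-pipe|-g|-march=*|-mtune=*)
                # explicitly allowed: keep it
                kept="${kept:+$kept }$flag" ;;
            *)
                # anything not on the allowlist is silently dropped
                ;;
        esac
    done
    printf '%s\n' "$kept"
}

# A shortened version of the CFLAGS from this thread.
filter_flags "-march=native -O3 -fgraphite-identity -flto=4 -pipe"
```

With this particular allowlist only `-march=native` and `-pipe` survive, which is the point: an ebuild that filters flags this way never sees the aggressive options, whatever make.conf says.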
C1REX wrote: I'm looking forward to hearing some test results from you. A 19% speed increase is massive! To get such an increase just by improving hardware would cost, what, 50% more?
Of course, you're the one who gave me the idea. It's definitely more pickup than I expected, but not enough to justify any serious problems, if there are any. :)
erm67 wrote: Be prepared to recover, just in case. What did you enable? Full -O3 or a subset? And what about the other flags? Did you already emerge -e world with gentoo-lto?
No problems. All my important data is backed up. The complete "emerge -e" just finished this morning. So far, everything is running very well. The system doesn't feel any faster, but I didn't really expect it to. Software emerges noticeably slower, but I haven't timed it yet -- that isn't a major issue for me anyway.
Code:
Portage 2.3.84 (python 3.6.9-final-0, default/linux/amd64/17.1, gcc-9.2.0, glibc-2.29-r7, 5.5.4-gentoo x86_64)
=================================================================
System uname: Linux-5.5.4-gentoo-x86_64-AMD_Ryzen_5_3500U_with_Radeon_Vega_Mobile_Gfx-with-gentoo-2.6
KiB Mem: 6081204 total, 3431804 free
KiB Swap: 4194300 total, 4121852 free
Timestamp of repository gentoo: Sun, 16 Feb 2020 05:00:01 +0000
Head commit of repository gentoo: 8e60f798cbdd167ba63306b1a00bc20a5ffc7fe0
sh bash 4.4_p23-r1
ld GNU ld (Gentoo 2.32 p2) 2.32.0
ccache version 3.7.7 [disabled]
app-shells/bash: 4.4_p23-r1::gentoo
dev-java/java-config: 2.2.0-r4::gentoo
dev-lang/perl: 5.30.1::gentoo
dev-lang/python: 2.7.17::gentoo, 3.6.9::mv, 3.7.5-r1::mv
dev-util/ccache: 3.7.7::gentoo
dev-util/cmake: 3.14.6::gentoo
sys-apps/baselayout: 2.6-r1::gentoo
sys-apps/openrc: 0.42.1::gentoo
sys-apps/sandbox: 2.13::gentoo
sys-devel/autoconf: 2.13-r1::gentoo, 2.69-r4::gentoo
sys-devel/automake: 1.16.1-r1::gentoo
sys-devel/binutils: 2.32-r1::gentoo
sys-devel/gcc: 9.2.0-r2::gentoo
sys-devel/gcc-config: 2.2::gentoo
sys-devel/libtool: 2.4.6-r6::gentoo
sys-devel/make: 4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.19::gentoo (virtual/os-headers)
sys-libs/glibc: 2.29-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts:
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24

duane
    location: /var/db/repos/duane
    masters: gentoo

lto-overlay
    location: /var/lib/layman/lto-overlay
    sync-type: laymansync
    sync-uri: https://github.com/InBetweenNames/gentooLTO.git
    masters: gentoo mv
    priority: 50

mv
    location: /var/lib/layman/mv
    sync-type: laymansync
    sync-uri: https://anongit.gentoo.org/git/user/mv.git
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=4 -fuse-linker-plugin -pipe"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://www.gtlib.gatech.edu/pub/gentoo http://gentoo.cs.utah.edu/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X acl acpi alsa amd64 berkdb bzip2 cli crypt cxx d3d9 dri fortran gdbm gif graphite iconv ipv6 jpeg libtirpc lto lua mp3 multilib ncurses nls nptl ogg openmp pam pcre png readline seccomp split-usr ssl svg tcpd tiff truetype unicode vaapi vdpau vorbis xattr zlib" ABI_X86="64 32" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="amdgpu radeonsi virgl" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Code:
CFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -Wl,--copy-dt-needed-entries -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags -mtune=skylake"
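The two flag sets posted above are long enough that eyeballing the differences is error-prone. A minimal sketch for listing what one set has that the other lacks; `flags_only_in` is a made-up helper, and the two strings below are shortened subsets of the posted flags, used only for illustration.

```shell
#!/bin/sh
# Print every flag that appears in the first list but not in the second.
# ASSUMPTION: flags are compared as plain whitespace-separated words, so
# paired options such as "-Wl,-z -Wl,now" are treated as separate words.
flags_only_in() {
    for flag in $1; do
        case " $2 " in
            *" $flag "*) ;;                    # present in both lists
            *) printf '%s\n' "$flag" ;;        # only in the first list
        esac
    done
}

# Shortened subsets of the two CFLAGS strings from this thread.
lto_flags="-march=native -O3 -fipa-pta -fno-semantic-interposition -pipe"
other_flags="-O3 -pipe -fno-semantic-interposition -fno-trapping-math -mtune=skylake"

echo "only in the first set:"
flags_only_in "$lto_flags" "$other_flags"
echo "only in the second set:"
flags_only_in "$other_flags" "$lto_flags"
```

Pasting the full strings in place of the subsets works the same way.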
Code:
b_malloc_tiny2 (0) 2%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac") -1%
b_utf8_onebyone (0) -7%
b_pthread_uselesslock (0) -3%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaac") -2%
b_pthread_createjoin_serial1 (0) -5%
b_malloc_big1 (0) 1%
b_string_strstr ("aaaaaaaaaaaaaacccccccccccc") -2%
b_utf8_bigbuf (0) 3%
b_malloc_big2 (0) 1%
b_malloc_bubble (0) 1%
b_pthread_create_serial1 (0) 10%
b_stdio_putcgetc_unlocked (0) -1%
b_regex_search ("(a|b|c)*d*b") 30%
b_regex_compile ("(a|b|c)*d*b") -3%
b_string_strlen (0) -2%
b_stdio_putcgetc (0) 4%
b_pthread_createjoin_serial2 (0) 2%
b_string_strstr ("azbycxdwevfugthsirjqkplomn") -2%
b_malloc_thread_local (0) 0%
b_malloc_sparse (0) 1%
b_regex_search ("a{25}b") 20%
b_malloc_thread_stress (0) -1%
b_string_strchr (0) -1%
b_malloc_tiny1 (0) 1%
b_string_strstr ("abcdefghijklmnopqrstuvwxyz") -2%
b_string_memset (0) -1%
CaptainBlood wrote: C1REX, could you please provide your source(s)? Thks 4 ur attention, interest & support.
Here's an article with a lot of the settings on the first page:
Code:
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).
PATH="/usr/bin"
if grep -q "flags.*:.* avx512bw" /proc/cpuinfo; then
    PATH="/usr/bin/haswell/avx512_1:/usr/bin/haswell:/usr/bin"
elif grep -q "flags.*:.* fma .* avx2" /proc/cpuinfo; then
    PATH="/usr/bin/haswell:/usr/bin"
fi
PATH="/usr/local/bin:$PATH:/opt/3rd-party/bin"
if [[ -x "/bin/vi" ]]; then
    EDITOR="/usr/bin/vi" # needed for packages like cron
else
    EDITOR="/usr/bin/nano" # needed for packages like cron
fi
test -z "$TERM" && TERM="xterm" # Basic terminal capab. For screen etc.
# Ensure the interactive terminal has rows and columns
if [ -n "$PS1" ] && tty > /dev/null; then # Only for interactive shells
    tty_rows=$(stty size | cut -d' ' -f1)
    tty_columns=$(stty size | cut -d' ' -f2)
    if [ $((${tty_rows} + 0)) -le 0 -o $((${tty_columns} + 0)) -le 0 ]; then
        [ -x /usr/bin/setterm ] && setterm --resize
        # fail safe if size still not set
        tty_rows=$(stty size | cut -d' ' -f1)
        tty_columns=$(stty size | cut -d' ' -f2)
        if [ $((${tty_rows} + 0)) -le 0 -o $((${tty_columns} + 0)) -le 0 ]; then
            stty rows 24 columns 80
        fi
    fi
    unset tty_rows tty_columns
fi
if [ ! -e /etc/localtime ]; then
    TZ="UTC" # Time Zone. Look at http://theory.uwinnipeg.ca/gnu/glibc/libc_303.html
    # for an explanation of how to set this to your local timezone.
    export TZ
fi
if [ -z "$PS1" ]; then
    # works for bash and ash (no other shells known to be in use here)
    PS1='\u@\h:\w\$ '
fi
CFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math -Wl,-sort-common -Wl,--enable-new-dtags -mtune=skylake -Wa,-mbranches-within-32B-boundaries"
FFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -malign-data=abi -fno-semantic-interposition -ftree-vectorize -ftree-loop-vectorize -Wl,--enable-new-dtags -Wa,-mbranches-within-32B-boundaries "
CFFLAGS="-g -O3 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -malign-data=abi -fno-semantic-interposition -ftree-vectorize -ftree-loop-vectorize -Wl,-sort-common -Wl,--enable-new-dtags "
CXXFLAGS="$CFLAGS -fvisibility-inlines-hidden -Wl,--enable-new-dtags "
export AR=gcc-ar
export RANLIB=gcc-ranlib
export NM=gcc-nm
export LA_VERSION="OpenBLAS"
export LA_LIBS=/usr/lib64/libopenblas.so.0
export LA_INCLUDE=/usr/include
export LA_PATH=/usr/lib64/
export MPI_CC=/usr/bin/mpicc
export MPI_LIBS=/usr/lib64/libmpi.so
export MPI_INCLUDE=/usr/include/
export MPI_PATH=/usr/lib64/
export MPI_VERSION=3.2
export THEANO_FLAGS='floatX=float32,openmp=true,gcc.cxxflags="-ftree-vectorize -mavx"'
export CC=gcc
export CXX=g++
export PYTHONIOENCODING=utf-8:surrogateescape
export MESA_GLSL_CACHE_DISABLE=0
export GTK_IM_MODULE="ibus"
if [ -d /usr/share/defaults/etc/profile.d ]; then
    for i in /usr/share/defaults/etc/profile.d/* ; do
        . $i
    done
    unset i
fi
if [ -d /etc/profile.d ]; then
    for i in /etc/profile.d/* ; do
        . $i
    done
    unset i
fi
if [ -e /etc/profile ]; then
    . /etc/profile
fi
for langfile in /usr/share/defaults/etc/locale.conf /etc/locale.conf "$HOME/.i18n" ; do
    [ -f $langfile ] && . $langfile
done
export LANG LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT LC_IDENTIFICATION
XDG_CONFIG_DIRS=/usr/share/xdg:/etc/xdg
export PATH PS1 EDITOR TERM CFLAGS CXXFLAGS CFFLAGS FFLAGS XDG_CONFIG_DIRS
umask 022
s0f4r wrote:ClearLinux dev here.
This isn't the whole picture.
We compile many packages several times and in subsequent passes enable AVX2 and AVX512 acceleration, which are used on platforms that support it. If you have a platform that is haswell+ or skylake+, you will get additional benefits that are not in the flags you listed above.
Other things that affect compiled code are glibc optimizations and some linker optimizations, which don't appear in the flags above either.
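To see which of those tiers a given machine would actually qualify for, a quick look at /proc/cpuinfo is enough. A minimal sketch: `has_cpu_flag` is a made-up helper name, and avx2 and avx512bw are simply the flags the profile above greps for.

```shell
#!/bin/sh
# Return success if the CPU flags line lists the given flag as a whole word.
# $1: flag name, $2: cpuinfo-style file (ASSUMPTION: defaults to
# /proc/cpuinfo, which only carries a "flags" line on x86 Linux).
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qwF -- "$1"
}

for f in avx2 avx512bw; do
    if has_cpu_flag "$f"; then
        echo "$f: yes"
    else
        echo "$f: no"
    fi
done
```

The whole-word match matters here, since a plain substring search for avx would also hit avx2.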
duane wrote: I have yet to see anything crash, but it seems that (for me) the gentoo-lto overlay doesn't do much good. Everything I run that needs speed has already got about as much as it can get. Given that, I wonder if I'd get any use out of clearlinux. The review makes it sound painful to work with, and I kind of doubt it would run any faster in any useful way.
Clear Linux IS painful for some people. Basically, you cannot modify the packages that are installed: you install groups of packages, update, and that is all; you use what comes with it.
Code:
b_malloc_tiny2 (0) -85%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac") -13%
b_utf8_onebyone (0) -45%
b_pthread_uselesslock (0) 4%
b_string_strstr ("aaaaaaaaaaaaaaaaaaaaaaaaac") -11%
b_pthread_createjoin_serial1 (0) -70%
b_malloc_big1 (0) -56%
b_string_strstr ("aaaaaaaaaaaaaacccccccccccc") -12%
b_utf8_bigbuf (0) 0%
b_malloc_big2 (0) -48%
b_malloc_bubble (0) -38%
b_pthread_create_serial1 (0) -33%
b_stdio_putcgetc_unlocked (0) -2%
b_regex_search ("(a|b|c)*d*b") 1%
b_regex_compile ("(a|b|c)*d*b") -12%
b_string_strlen (0) -95%
b_stdio_putcgetc (0) -42%
b_pthread_createjoin_serial2 (0) -56%
b_string_strstr ("azbycxdwevfugthsirjqkplomn") -11%
b_malloc_thread_local (0) -81%
b_malloc_sparse (0) -36%
b_regex_search ("a{25}b") -1%
b_malloc_thread_stress (0) -43%
b_string_strchr (0) -85%
b_malloc_tiny1 (0) -61%
b_string_strstr ("abcdefghijklmnopqrstuvwxyz") -14%
b_string_memset (0) -63%
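Lists like this are easier to compare with a one-line summary. A minimal sketch, assuming each result line ends in a percentage as in the listings above; `summarize` is a made-up helper, and it does not interpret the sign convention, which the posts do not spell out.

```shell
#!/bin/sh
# Summarize benchmark lines of the form: name (argument) N%
# Counts negative and positive entries and prints the arithmetic mean.
summarize() {
    awk '
        /%$/ {
            v = $NF + 0        # last field, e.g. "-85%"; "+ 0" keeps the numeric prefix
            sum += v; n++
            if (v < 0) neg++; else if (v > 0) pos++
        }
        END {
            if (n) printf "n=%d negative=%d positive=%d mean=%.1f%%\n", n, neg, pos, sum / n
        }
    '
}

# Four lines taken from the listing above.
summarize <<'EOF'
b_malloc_tiny2 (0) -85%
b_pthread_uselesslock (0) 4%
b_utf8_bigbuf (0) 0%
b_string_strlen (0) -95%
EOF
```

Piping a whole listing through `summarize` works the same way; lines that do not end in a percent sign are skipped.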