Poll: CFLAGS -O2 or -O3? |
O2: 88% [ 31 ] |
O3: 11% [ 4 ] |
other: 0% [ 0 ] |
Total Votes : 35 |
|
Author |
Message |
Garbanzo n00b
Joined: 06 Aug 2018 Posts: 37
|
Posted: Wed Apr 08, 2020 3:26 am Post subject: CFLAGS Optimization Level |
|
|
What CFLAGS optimization level do you use in make.conf?
Me? Against better wisdom, I've been using O3 for months now and so far haven't seen any trouble. |
|
|
|
Ionen Developer
Joined: 06 Dec 2018 Posts: 2719
|
Posted: Wed Apr 08, 2020 3:52 am Post subject: |
|
|
Shame, -O3 usually makes no noticeable difference, and in some cases could even make code both slower and bigger. About all it's good for is exposing potential bugs in code, which has led to zealous use of strip-flags/replace-flags in ebuilds so that Gentoo users throwing it in there don't run into issues too often. |
|
|
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
|
Ionen Developer
Joined: 06 Dec 2018 Posts: 2719
|
Posted: Wed Apr 08, 2020 4:00 am Post subject: |
|
|
Jaglover wrote: | -O9 of course, and don't forget to turn on the secret undocumented -omg-optimized optimization. And -fuck-upstream can work wonders, too. |
Forget everything I've said, I'm converted. Going to rebuild @world |
|
|
|
AJM Apprentice
Joined: 25 Sep 2002 Posts: 189 Location: Aberdeen, Scotland
|
Posted: Wed Apr 08, 2020 3:33 pm Post subject: Re: CFLAGS Optimization Level |
|
|
Garbanzo wrote: | What CFLAGS optimization level do you use in make.conf?
Me? Against better wisdom, I've been using O3 for months now and so far haven't seen any trouble. |
I've stuck with -O2 ever since doing extensive benchmarking nearly 20 years ago on CFD-related code: -O3 consistently made things run much more slowly and made the executables significantly larger.
Obviously it'd be extrapolating too much to say that it always does that with all code, 20 years later, but -O2 works fine, so I'll stick with it. |
|
|
|
mike155 Advocate
Joined: 17 Sep 2010 Posts: 4438 Location: Frankfurt, Germany
|
Posted: Wed Apr 08, 2020 4:04 pm Post subject: |
|
|
https://wiki.gentoo.org/wiki/GCC_optimization says:
Quote: | -O3: the highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance, and in fact, in many cases, can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages. Using -O3 is not recommended. However, it also enables -ftree-vectorize so that loops in the code get vectorized and will use AVX YMM registers. |
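To see what -O3 actually switches on beyond -O2 on a given machine, the two levels can be compared directly. The following is only a minimal sketch: the helper names `enabled_flags` and `o3_only_flags` are made up for illustration, it assumes `awk`, `sort`, `comm` and `mktemp` are available, and the resulting list depends entirely on the installed GCC version.

```shell
#!/bin/sh
# Keep only the option names that GCC reports as enabled.
# Input: the text printed by `gcc -Q --help=optimizers ...` on stdin.
# (enabled_flags is a hypothetical helper name, not part of GCC.)
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }' | sort -u
}

# Print the optimizer flags that are enabled at -O3 but not at -O2.
# "$1" is the compiler to query (defaults to plain `gcc`).
o3_only_flags() {
    cc="${1:-gcc}"
    o2_list=$(mktemp) || return 1
    o3_list=$(mktemp) || return 1
    "$cc" -Q --help=optimizers -O2 | enabled_flags > "$o2_list"
    "$cc" -Q --help=optimizers -O3 | enabled_flags > "$o3_list"
    comm -13 "$o2_list" "$o3_list"   # lines found only in the -O3 list
    rm -f "$o2_list" "$o3_list"
}

# Only query the compiler when one is actually installed.
if command -v gcc >/dev/null 2>&1; then
    o3_only_flags gcc
fi
```

Options that take a value rather than an on/off state (for example cost-model settings) are not matched by the `[enabled]` filter, so the list is a rough guide rather than a complete diff.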
|
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3627
|
Posted: Wed Apr 08, 2020 4:09 pm Post subject: |
|
|
Jaglover wrote: | -O9 of course, and don't forget to turn on the secret undocumented -omg-optimized optimization. And -fuck-upstream can work wonders, too. |
+1 |
|
|
|
ff11 l33t
Joined: 10 Mar 2014 Posts: 664
|
|
|
|
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1249
|
Posted: Thu Apr 09, 2020 12:39 am Post subject: |
|
|
Quote: | -ftree-vectorize
Perform vectorization on trees. This flag enables -ftree-loop-vectorize and -ftree-slp-vectorize if not explicitly specified. |
I mean: is it redundant to write -ftree-slp-vectorize when -ftree-vectorize is already set, or is Phoronix right to specify -ftree-slp-vectorize along with -ftree-vectorize? |
|
|
|
Garbanzo n00b
Joined: 06 Aug 2018 Posts: 37
|
Posted: Thu Apr 09, 2020 4:29 am Post subject: |
|
|
Interesting. There is definitely contradictory information out there regarding O2 vs O3. Most benchmarks seem to show O3 is marginally faster. I wanted to see for myself, so on a GCC upgrade I flipped it to O3 to see what would happen, expecting to flip it back after it went down in flames. However, it was anticlimactic: nothing at all happened, with no noticeable difference in performance, stability, or build times. So I just never bothered to change it back.
What do you think the long term trend is? To me it seems that GCC is getting better about having fewer O3 cases that are slower than O2. |
|
|
|
Perfect Gentleman Veteran
Joined: 18 May 2014 Posts: 1249
|
Posted: Thu Apr 09, 2020 5:15 am Post subject: |
|
|
I use -O2, -ftree-vectorize & -flto. Afaik, lots of builds tend to be broken with -O3 & -flto.
Last edited by Perfect Gentleman on Thu Apr 09, 2020 5:52 am; edited 1 time in total |
|
|
|
erm67 l33t
Joined: 01 Nov 2005 Posts: 653 Location: EU
|
Posted: Thu Apr 09, 2020 5:43 am Post subject: |
|
|
-ftree-vectorize and -ftree-slp-vectorize should be enabled at -O2 soon in gcc.
-O3 sometimes produces slower programs; it is not worth it.
I still use, as always, -O2 -ftree-vectorize -ftree-slp-vectorize -march=native + some graphite, no problems at all, and with php the performance boost IS noticeable _________________ Ok boomer
True ignorance is not the absence of knowledge, but the refusal to acquire it.
Ab esse ad posse valet, a posse ad esse non valet consequentia
My fediverse account: @erm67@erm67.dynu.net |
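For readers wondering where such flags go, here is a rough sketch of a make.conf fragment. It assumes the usual Gentoo convention of a shared `COMMON_FLAGS` variable, and the flag values simply mirror what posters in this thread report using; they are not a recommendation. -ftree-slp-vectorize is left out because, per the GCC documentation quoted earlier in the thread, -ftree-vectorize already enables it, and the unspecified graphite flags are omitted.

```shell
# /etc/portage/make.conf (fragment)
# Illustrative values only: taken from flag sets mentioned in this thread.
COMMON_FLAGS="-O2 -ftree-vectorize -march=native"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
```

Keeping the optimization level in one variable means switching between -O2 and -O3 for a test rebuild is a one-word change.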
|
|
|
Ionen Developer
Joined: 06 Dec 2018 Posts: 2719
|
Posted: Thu Apr 09, 2020 6:07 am Post subject: |
|
|
erm67 wrote: | -ftree-vectorize -ftree-slp-vectorize should be enabled at -O2 soon in gcc |
Soon is likely still a while off: the idea was thrown around a bit over a year ago, but the upcoming gcc 10.0.1 doesn't enable it, and I doubt it'll be flipped in a minor version, so it'll likely wait until gcc 11+ (if ever). Don't quote me on that.
As others pointed out, specifying slp isn't necessary since tree-vec is a shortcut to enable loop+slp, at least for gcc 9+ (just check if in doubt):
Code: |
$ gcc-9.3.0 -ftree-vectorize -Q --help=optimize | grep vectorize
-ftree-loop-vectorize [enabled]
-ftree-slp-vectorize [enabled]
$ gcc-10.0.1pre -O2 -Q --help=optimize | grep vectorize
-ftree-loop-vectorize [disabled]
-ftree-slp-vectorize [disabled] |
Edit: gcc 10 does bring us -fno-common by default though, just look at this wall of fun |
|
|
|
erm67 l33t
Joined: 01 Nov 2005 Posts: 653 Location: EU
|
Posted: Thu Apr 09, 2020 9:02 am Post subject: |
|
|
Ionen wrote: | As others pointed out, specifying slp isn't necessary since tree-vec is a shortcut to enable loop+slp, at least for gcc 9+ (just check if in doubt)
Code: | $ gcc-9.3.0 -ftree-vectorize -Q --help=optimize | grep vectorize | |
Quote: | Basic block vectorization, aka SLP, is enabled by the flag -ftree-slp-vectorize, and requires the same platform dependent flags as loop vectorization. Basic block SLP is enabled by default at -O3 and when -ftree-vectorize is enabled. |
That's also what the docs say. I enabled it a long time ago; it doesn't hurt anyway.
I have no time to switch to gcc10 now; better to wait until fedora32 (with gcc10) is mature. By then most problems will be fixed (not only by redhat, of course).
podman is a lot more interesting for the moment. |
|
|
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Thu Apr 09, 2020 11:03 am Post subject: |
|
|
erm67 wrote: | I have no time to switch to gcc10 now; better to wait until fedora32 (with gcc10) is mature. By then most problems will be fixed (not only by redhat, of course). |
I would never wait for any gcc fix coming from fedora or redhat. Have you forgotten their insane 2.96 version?
Edit: added a link to the story (some may not be old enough to know it): https://gcc.gnu.org/gcc-2.96.html |
|
|
|
erm67 l33t
Joined: 01 Nov 2005 Posts: 653 Location: EU
|
Posted: Thu Apr 09, 2020 11:17 am Post subject: |
|
|
krinn wrote: | erm67 wrote: | I have no time to switch to gcc10 now; better to wait until fedora32 (with gcc10) is mature. By then most problems will be fixed (not only by redhat, of course). |
I would never wait for any gcc fix coming from fedora or redhat. Have you forgotten their insane 2.96 version?
Edit: added a link to the story (some may not be old enough to know it): https://gcc.gnu.org/gcc-2.96.html |
Well, actually I wait for the mass rebuild with gcc10 once the beta is stable (at least I hope one is also scheduled for fedora32); it is their emerge -e world. It usually happens a lot earlier than the point where an effortless emerge -e world works, so I can bump what is needed or import some patches.
Recently, after lots of complaints, rpmfusion also ports to the new compiler already at the beta stage. |
|
|
|
|