Gentoo Forums
CFLAGS Optimization Level

Poll: CFLAGS -O2 or -O3?
O2: 88% [ 31 ]
O3: 11% [ 4 ]
other: 0% [ 0 ]
Total Votes : 35

Garbanzo
n00b

Joined: 06 Aug 2018
Posts: 37

Posted: Wed Apr 08, 2020 3:26 am    Post subject: CFLAGS Optimization Level

What CFLAGS optimization level do you use in make.conf?

Me? Against better wisdom, I've been using O3 for months now and so far haven't seen any trouble.

Ionen
Developer

Joined: 06 Dec 2018
Posts: 2719

Posted: Wed Apr 08, 2020 3:52 am

Shame that -O3 makes no noticeable difference, and in some cases can even make code both slower and bigger. About all it's good for is exposing potential bugs in code, which has led to zealous use of strip-flags/replace-flags in ebuilds so that Gentoo users throwing it in there don't run into issues too often.

Jaglover
Watchman

Joined: 29 May 2005
Posts: 8291
Location: Saint Amant, Acadiana

Posted: Wed Apr 08, 2020 3:56 am

-O9 of course, and don't forget to turn on the secret undocumented -omg-optimized optimization. And -fuck-upstream can work wonders, too.
_________________
My Gentoo installation notes.
Please learn how to denote units correctly!

Ionen
Developer

Joined: 06 Dec 2018
Posts: 2719

Posted: Wed Apr 08, 2020 4:00 am

Jaglover wrote:
-O9 of course, and don't forget to turn on the secret undocumented -omg-optimized optimization. And -fuck-upstream can work wonders, too.
Forget everything I've said, I'm converted. Going to rebuild @world 8)

AJM
Apprentice

Joined: 25 Sep 2002
Posts: 189
Location: Aberdeen, Scotland

Posted: Wed Apr 08, 2020 3:33 pm    Post subject: Re: CFLAGS Optimization Level

Garbanzo wrote:
What CFLAGS optimization level do you use in make.conf?
Me? Against better wisdom, I've been using O3 for months now and so far haven't seen any trouble.


I've stuck with -O2 ever since doing extensive benchmarking nearly 20 years ago on CFD-related code: -O3 consistently made things run much more slowly and made the executables significantly larger.

Obviously it'd be extrapolating too much to say that it always does that with all code, 20 years later, but -O2 works fine, so I'll stick with it.

mike155
Advocate

Joined: 17 Sep 2010
Posts: 4438
Location: Frankfurt, Germany

Posted: Wed Apr 08, 2020 4:04 pm

https://wiki.gentoo.org/wiki/GCC_optimization says:
Quote:
-O3: the highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance, and in fact, in many cases, can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages. Using -O3 is not recommended. However, it also enables -ftree-vectorize so that loops in the code get vectorized and will use AVX YMM registers.
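
To see what that summary means for one particular compiler, the flags GCC reports as enabled at each level can be compared. The sketch below is only an illustration: enabled_only_in_second is a made-up helper name, and it assumes the two input files hold saved output of gcc -Q --help=optimizers, once with -O2 and once with -O3.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCC or Portage): print the flags marked
# [enabled] in the second report but not in the first.
# Each argument is a file holding saved `gcc -Q --help=optimizers` output.
enabled_only_in_second() {
    first_sorted=$(mktemp) || return 1
    second_sorted=$(mktemp) || return 1
    # Keep only the flag name (first field) of lines whose last field is [enabled].
    awk '$NF == "[enabled]" { print $1 }' "$1" | sort -u > "$first_sorted"
    awk '$NF == "[enabled]" { print $1 }' "$2" | sort -u > "$second_sorted"
    # comm -13 prints only the lines unique to the second sorted file.
    comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}

# Usage sketch, assuming gcc is on PATH:
#   gcc -O2 -Q --help=optimizers > o2.txt
#   gcc -O3 -Q --help=optimizers > o3.txt
#   enabled_only_in_second o2.txt o3.txt
```

The result depends on the GCC version, so it is worth rerunning after a compiler upgrade rather than relying on an old list.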

CaptainBlood
Advocate

Joined: 24 Jan 2010
Posts: 3627

Posted: Wed Apr 08, 2020 4:09 pm

Jaglover wrote:
-O9 of course, and don't forget to turn on the secret undocumented -omg-optimized optimization. And -fuck-upstream can work wonders, too.
+1

ff11
l33t

Joined: 10 Mar 2014
Posts: 664

Posted: Wed Apr 08, 2020 4:37 pm

Let's put a benchmark here, with GCC 9: https://www.phoronix.com/scan.php?page=article&item=gcc9-core9-tuning

EDIT:
By the way, I feel like this too.
_________________
| Proverbs 26:12 |
| There is more hope for a fool than for a wise man that are wise in his own eyes. |
* AlphaGo - The Movie - Full Documentary "I want to apologize for being so powerless" - Lee

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1249

Posted: Thu Apr 09, 2020 12:39 am

Quote:
-ftree-vectorize
Perform vectorization on trees. This flag enables -ftree-loop-vectorize and -ftree-slp-vectorize if not explicitly specified.

I mean, is it redundant to write -ftree-slp-vectorize if -ftree-vectorize is already set, or is Phoronix right to specify -ftree-slp-vectorize along with -ftree-vectorize?

Garbanzo
n00b

Joined: 06 Aug 2018
Posts: 37

Posted: Thu Apr 09, 2020 4:29 am

Interesting. There is definitely contradictory information out there regarding O2 vs O3. Most benchmarks seem to show O3 is marginally faster. I wanted to see for myself, so on a GCC upgrade I flipped it to O3 to see what would happen, expecting to flip it back after it went down in flames. However, it was anticlimactic: nothing at all happened, no noticeable difference in performance, stability, or build times. So I just never bothered to change it back.

What do you think the long term trend is? To me it seems that GCC is getting better about having fewer O3 cases that are slower than O2.

Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1249

Posted: Thu Apr 09, 2020 5:15 am

I use -O2, -ftree-vectorize & -flto. Afaik, lots of builds tend to be broken with -O3 & -flto.
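
For reference, the flags named in this post would look roughly like the following in make.conf. This is only a sketch: the COMMON_FLAGS variable is the convention from Gentoo's stock make.conf, not something the post mentions, and GCC's documentation says -flto also has to be given at the final link, so a real LTO setup may need more than this fragment.

```shell
# /etc/portage/make.conf fragment (illustrative only).
# The three flags are the ones named in the post; COMMON_FLAGS is an
# assumed convention taken from Gentoo's default make.conf layout.
COMMON_FLAGS="-O2 -ftree-vectorize -flto"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
```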

Last edited by Perfect Gentleman on Thu Apr 09, 2020 5:52 am; edited 1 time in total

erm67
l33t

Joined: 01 Nov 2005
Posts: 653
Location: EU

Posted: Thu Apr 09, 2020 5:43 am

-ftree-vectorize -ftree-slp-vectorize should be enabled at -O2 soon in gcc

-O3 sometimes produces slower programs; it is not worth it.


I still use, as always, -O2 -ftree-vectorize -ftree-slp-vectorize -march=native + some graphite, no problems at all, and with php the performance boost IS noticeable
_________________
Ok boomer
True ignorance is not the absence of knowledge, but the refusal to acquire it.
Ab esse ad posse valet, a posse ad esse non valet consequentia

My fediverse account: @erm67@erm67.dynu.net

Ionen
Developer

Joined: 06 Dec 2018
Posts: 2719

Posted: Thu Apr 09, 2020 6:07 am

erm67 wrote:
-ftree-vectorize -ftree-slp-vectorize should be enabled at -O2 soon in gcc
Soon is likely still a while off. The idea was thrown around a bit over a year ago, but the upcoming gcc 10.0.1 doesn't enable it and I doubt it'll be flipped in a minor version, so it'll likely wait until gcc 11+ (if ever), but don't quote me on that.

As others pointed out, specifying slp isn't necessary since tree-vec is a shortcut to enable loop+slp, at least for gcc 9+ (just check if in doubt)
Code:
$ gcc-9.3.0 -ftree-vectorize -Q --help=optimize | grep vectorize
  -ftree-loop-vectorize             [enabled]
  -ftree-slp-vectorize              [enabled]

$ gcc-10.0.1pre -O2 -Q --help=optimize | grep vectorize
  -ftree-loop-vectorize             [disabled]
  -ftree-slp-vectorize              [disabled]

Edit: gcc 10 does bring us -fno-common by default though, just look at this wall of fun
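
The check in the listing above can be wrapped in a small helper when the same question comes up for several flags or compiler versions. This is a sketch only: flag_state is a hypothetical name, and it assumes the input has the same "flag, padding, [state]" layout as the output shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: read `gcc -Q --help=optimizers` style output on stdin
# and print the state reported for one flag, e.g. "enabled" or "disabled".
flag_state() {
    awk -v flag="$1" '
        $1 == flag {
            state = $NF
            sub(/^\[/, "", state)   # drop the leading bracket
            sub(/\]$/, "", state)   # drop the trailing bracket
            print state
        }'
}

# Usage sketch, assuming gcc is on PATH:
#   gcc -ftree-vectorize -Q --help=optimizers | flag_state -ftree-slp-vectorize
```

If the flag is not listed at all, the helper prints nothing, which is different from the flag being reported as disabled.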

erm67
l33t

Joined: 01 Nov 2005
Posts: 653
Location: EU

Posted: Thu Apr 09, 2020 9:02 am

Ionen wrote:
As others pointed out, specifying slp isn't necessary since tree-vec is a shortcut to enable loop+slp, or at least for gcc 9+ (just check if in doubt)
Code:
$ gcc-9.3.0 -ftree-vectorize -Q --help=optimize | grep vectorize


Quote:
Basic block vectorization, aka SLP, is enabled by the flag -ftree-slp-vectorize, and requires the same platform dependent flags as loop vectorization. Basic block SLP is enabled by default at -O3 and when -ftree-vectorize is enabled.


That's also what the docs say. I enabled it a long time ago; it doesn't hurt anyway.

I have no time to switch to gcc10 now :-) better wait until fedora32 (with gcc10) is mature, by then most problems will be fixed (not only by redhat of course) ;-)

podman is a lot more interesting for the moment.

krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Thu Apr 09, 2020 11:03 am

erm67 wrote:
I have no time to switch to gcc10 now :-) better wait until fedora32 (with gcc10) is mature, by then most problems will be fixed (not only by redhat of course) ;-)

I would never wait for any gcc fix coming from fedora or redhat. Have you forgotten their insane 2.96 version? :D
edit: adding a link for the story (some may not be old enough to know it) https://gcc.gnu.org/gcc-2.96.html

erm67
l33t

Joined: 01 Nov 2005
Posts: 653
Location: EU

Posted: Thu Apr 09, 2020 11:17 am

krinn wrote:
erm67 wrote:
I have no time to switch to gcc10 now :-) better wait until fedora32 (with gcc10) is mature, by then most problems will be fixed (not only by redhat of course) ;-)

I would never wait for any gcc fix coming from fedora or redhat. Have you forgotten their insane 2.96 version? :D
edit: adding a link for the story (some may not be old enough to know it) https://gcc.gnu.org/gcc-2.96.html


Well, actually I'm waiting for the mass rebuild with gcc10 once the beta is stable; at least I hope it is also scheduled for fedora32. It is their emerge -e world ;-). Usually it happens a lot earlier than the point where an effortless emerge -e world works, so I can bump what is needed or import some patches.
Recently, after lots of complaints, rpmfusion also ports to the new compiler already at the beta stage.

All times are GMT