NTU Apprentice
Joined: 17 Jul 2015 Posts: 187
Posted: Mon Sep 26, 2016 12:04 am Post subject: -fgcse-las GCC option?
Hello all! I've been experimenting with Graphite/ISL on GCC 5.4, and it has been rock solid as far as I can tell. Compiling the kernel and Firefox with the new options has dramatically reduced CPU usage at runtime, and the browser's overhead has dropped substantially. I've been going through the GCC docs and some of the source and came across -fgcse-las. Any idea why this option isn't enabled at -O3? Judging from the description in the documentation, it doesn't seem that dangerous. Does anyone here use it? Any thoughts or noticeable changes? I'm assuming it should only impact compile time, just like -fgcse-after-reload, correct? Is the worst case that it has no effect at startup or runtime, or has this option been found to be too aggressive and actually break something?
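For anyone who hasn't read the description: the GCC manual says that with -fgcse-las the global common subexpression elimination pass eliminates redundant loads that come after stores to the same memory location. A minimal sketch of that kind of pattern is below. The function name is made up for illustration, and I am not claiming this exact function compiles any differently with the flag, since other passes may already clean up a case this simple.

```c
/* Hypothetical example of a load that follows a store to the same address.
 * The value read back from *p is the value that was just written (v), so
 * the load is redundant in principle. */
int store_then_load(int *p, int v)
{
    *p = v;          /* store to *p */
    return *p + 1;   /* load from the same location right after the store */
}
```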
Here's a snippet of my make.conf:
Code: |
# Cherry-picked from -O3: options that do not seem to cause needless CPU cycles globally
# Can -fgcse-las safely go here?
OPT="-fpredictive-commoning -fgcse-after-reload -fvect-cost-model -ftree-partial-pre"
# -funsafe-math-optimizations enables our vectorization friend, -fassociative-math, and its requirements.
# -fassociative-math allows reordering of operands, which helps auto-vectorization along, without
# all the other unsafe optimizations that come with -ffast-math.
VECOPT="-ftree-vectorize -funsafe-math-optimizations"
# ISL / Graphite optimization flags without blindly parallelizing everything
# -floop-nest-optimize is the new -ftree-loop-linear, -floop-interchange, etc.
ISLOPT="-floop-nest-optimize -fgraphite-identity"
CFLAGS="-march=core2 -O2 -pipe -fomit-frame-pointer ${OPT} ${VECOPT} ${ISLOPT}"
CXXFLAGS="${CFLAGS}"
|
To be clear, I am not a Gentoo ricer; I have selected these options very carefully based on their documented behavior. While you may call me an idiot for enabling -funsafe-math-optimizations, there are many scenarios (namely ones involving floating point) where GCC's auto-vectorizer will do nothing without it. I have not tested this, but as I understand it, GCC will otherwise vectorize such functions only if the FP code is already written in the order its vectorization pass expects.
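To illustrate the floating-point case, here is a minimal sketch with two hypothetical functions (the names are mine, not from any real code base). The first is a sum reduction where every addition depends on the previous one, so vectorizing it means changing the order of the additions, which is what -fassociative-math permits. The second is an element-wise loop with no dependency between iterations, so no reordering is needed.

```c
#include <stddef.h>

/* Floating-point sum reduction: each addition depends on the previous
 * value of acc, so the additions must happen in source order unless the
 * compiler is allowed to reassociate them. */
float sum_floats(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Element-wise loop: iterations are independent of each other, so no
 * reassociation of floating-point operations is required. */
void scale_floats(float *restrict out, const float *restrict in,
                  float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = k * in[i];
}
```

One way to check this on your own toolchain is to compile the file with -O2 -ftree-vectorize -fopt-info-vec, once with and once without -funsafe-math-optimizations, and compare which loops are reported as vectorized. I would expect the difference to show up in sum_floats, but treat that as something to verify rather than a given.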