Gentoo Forums
Status of Graphite
drwook
Veteran
Joined: 30 Mar 2005
Posts: 1322
Location: London

Posted: Sun Mar 25, 2012 8:38 am    Post subject: Status of Graphite

Hi all. Silly question perhaps, but does anyone know the "official line" on Graphite support? Is it still "you break it, you fix it"? :) I'm assuming it is.

Another question: is anyone using Graphite generally, and does anyone have any objective or subjective info on stability, compatibility and performance? (i.e. what compiles but is unstable, what fails to compile, and whether it's even in anyone's interest to use it if it does work)

Been running gcc-4.6.2 for a while without significant issue, but eyeing up 4.7.0 now too... I can only find opinion/info relating to graphite on 4.4/4.5 from searching.
BoneKracker
Veteran
Joined: 14 Mar 2006
Posts: 1488
Location: U.S.A.

Posted: Mon Mar 26, 2012 5:43 am

I used it for months, on a ~x86 desktop, a hardened ~x86 server, and a hardened x86 firewall/router. I didn't notice any perceptible improvement in performance, although I did no benchmarking at all. Nor did I check what impact, if any, it had on compilation times. I don't recall having any problems (there might have been a bug or two where applications needed to be patched). I have since removed it, but might enable it again in the future. In general, I think it's a good concept.
_________________
Oldthinkers unbellyfeel INGSOC.
-- Headline of a document on Winston Smith's terminal in his cubicle at the Ministry of Truth, seen briefly in the background in one scene of the movie rendition of Nineteen Eighty-Four.
Ant P.
Veteran
Joined: 18 Apr 2009
Posts: 1920
Location: UK

Posted: Mon Mar 26, 2012 6:13 pm

I've compiled everything on my 3 systems with graphite cflags. Stability seems fine, at least.
jtshs256
n00b
Joined: 25 Mar 2011
Posts: 17

Posted: Tue Mar 27, 2012 12:33 am

I have used graphite for at least a year with 4.5* & 4.6*. It doesn't cause any compile problems, nor any significant performance improvement. You can enable the graphite flags globally.
BoneKracker
Veteran
Joined: 14 Mar 2006
Posts: 1488
Location: U.S.A.

Posted: Tue Mar 27, 2012 1:22 am

I was selective in the flags I chose to enable globally. As I recall, one or two available at the time seemed like they would too often have negative performance consequences. I would be sure to read the gcc documentation and understand what each flag actually does.

As I understood it at the time, some of these flags cause the compiler to evaluate code and selectively apply loop optimizations. However, as I recall (and I may not be accurate), one or two of the graphite-related flags available at the time seemed more ruthless, and may not be appropriate globally (i.e., they are of the same general nature as -funroll-all-loops, causing global changes that ought to be only selectively applied).

But I don't really know what I'm talking about, so take it with a grain of salt. If one is interested in performance, I would suggest one should not use graphite without first thoroughly reading the documentation of the graphite flags in the gcc manual for the version of gcc in question.
http://gcc.gnu.org/onlinedocs/
Apheus
Apprentice
Joined: 12 Jul 2008
Posts: 182

Posted: Tue Mar 27, 2012 10:05 am

I have been using graphite globally on two machines for some weeks now (-fgraphite-identity, -floop-interchange, -floop-strip-mine, -floop-block), but did not do an "emerge -e world", so not all packages have been recompiled yet. I run some firefox benchmarks from time to time (SunSpider, Kraken, V8, PeaceKeeper), but the numbers show nothing clear wrt CFLAGS: upstream optimizations across versions 9 > 10 > 11 seem to be more important for performance, and maybe USE=pgo.

I have excluded the most important system and toolchain packages from customized CFLAGS: libtool, glibc, gcc, coreutils, udev, openrc, sysvinit, binutils, bash, e2fsprogs. cloog-ppl is configured to build without graphite to work around the chicken-and-egg problem when updating it. For grub and nvidia-drivers, I did not enable USE=custom-cflags. I noticed a screen update error with the grub default-entry countdown when built with custom CFLAGS.

There have been a few other problems so far:

PyQt4 does not build
quake3 crashes as soon as a map is entered when built with graphite (~amd64 version; the stable version does not work at all)

gcc is the current stable amd64 version, 4.5.
codestation
Tux's lil' helper
Joined: 09 Nov 2008
Posts: 126
Location: /dev/negi

Posted: Wed Mar 28, 2012 2:15 am

Since I got a new laptop, I did a clean install, so all my packages have been compiled with the graphite flags for the past 6 months. I am using the current hardmasked gcc version (4.6.2), since the open bugs don't affect me.

This new laptop is more powerful than my old one, so I don't have any performance/benchmark data, but in general I don't have problems. The only packages that failed to compile with the graphite flags are PyQt4, blender and postgresql-[base|server].
_________________
Just feel the code...
BoneKracker
Veteran
Joined: 14 Mar 2006
Posts: 1488
Location: U.S.A.

Posted: Wed Mar 28, 2012 3:34 am

I would very much like to see a scientifically-performed benchmarking analysis of the performance impact.
darklegion
Guru
Joined: 14 Nov 2004
Posts: 440

Posted: Sat Mar 31, 2012 8:26 am

Would be nice to see some benchmarks of LTO as well. Although as far as my experience goes, profile-guided optimisation is the most useful and seems to yield a performance boost of around 5-10% with Wine and Dolphin. However, you can't enable this globally, of course, so it's not really useful for a full system.

-O3 or -Ofast can be useful for some programs too, but are not a good idea at all to enable globally.
Etal
Veteran
Joined: 15 Jul 2005
Posts: 1633

Posted: Sat Mar 31, 2012 4:58 pm

If anything, with LTO you'll save a ton of space, especially with C++ applications - binaries can shrink by more than 50%.

I don't know about performance, though - never tested it. But I can't think of a way LTO could cause it to decrease.
_________________
“And even in authoritarian countries, information networks are helping people discover new facts and making governments more accountable.”– Hillary Clinton, Jan. 21, 2010
Yamakuzure
l33t
Joined: 21 Jun 2006
Posts: 951
Location: Bardowick, Germany

Posted: Wed Apr 11, 2012 4:36 pm

Apheus wrote:
I have been using graphite globally on two machines for some weeks now (-fgraphite-identity, -floop-interchange, -floop-strip-mine, -floop-block)
Well, those flags do not do much, you know. They just enable slight reorganization of nested loops to reduce cache misses. (*)
The power of graphite is revealed with "-ftree-loop-distribution", "-floop-parallelize-all" and "-ftree-parallelize-loops=<number_of_threads>". Those will split loops apart, nested or not, if their iterations do not depend on each other, and carry out the parts using threads.
...I once tried that globally.
...It produced a nice automatic "fork-bomb" halting my system after 5 to 10 minutes. :D

AFAIR there are (still?) a few packages that are not very happy about "-fgraphite-identity", but basically the four flags mentioned should be safe enough. Only the loop parallelization should not be used globally. And IMHO it is a bad idea to use them on libraries that a) are used by many libs/apps and b) do multi-threading on their own.
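
To make the parallelization idea concrete, here is a rough C sketch of how a loop with independent iterations can be divided into one contiguous chunk per thread. It is only an illustration under stated assumptions: chunk_bounds and scale_in_chunks are made-up names, the "threads" are simulated one after another so the sketch stays self-contained, and the code GCC really generates for -ftree-parallelize-loops=n will look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute the half-open range [*begin, *end) of
   iterations that worker `t` (0-based) out of `nthreads` would handle
   when `n` independent iterations are split into contiguous chunks.
   The first (n % nthreads) workers get one extra iteration. */
void chunk_bounds(size_t n, size_t nthreads, size_t t,
                  size_t *begin, size_t *end)
{
    assert(nthreads > 0 && t < nthreads);
    size_t base = n / nthreads;
    size_t extra = n % nthreads;
    *begin = t * base + (t < extra ? t : extra);
    *end = *begin + base + (t < extra ? 1 : 0);
}

/* Computes a[i] = a[i] * c chunk by chunk. Each pass of the outer loop
   stands in for one thread; a real parallel version would run the
   chunks concurrently, which is only valid because no iteration
   depends on another. */
void scale_in_chunks(double *a, size_t n, double c, size_t nthreads)
{
    for (size_t t = 0; t < nthreads; t++) {
        size_t begin, end;
        chunk_bounds(n, nthreads, t, &begin, &end);
        for (size_t i = begin; i < end; i++)
            a[i] *= c;
    }
}
```

If every small loop in every library is split up like this and handed to its own set of threads, the thread count adds up quickly, which is consistent with the "fork-bomb" effect described above.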

However, this site should give you a fair impression of the current state of Graphite: http://gcc.gnu.org/wiki/Graphite

(*): For the curious:
  • -fgraphite-identity
    Enable the identity transformation for graphite. For every SCoP we generate the polyhedral representation and transform it back to gimple. Using -fgraphite-identity we can check the costs or benefits of the GIMPLE -> GRAPHITE -> GIMPLE transformation. Some minimal optimizations are also performed by the code generator CLooG, like index splitting and dead code elimination in loops.
  • -floop-interchange
    Perform loop interchange transformations on loops. Interchanging two nested loops switches the inner and outer loops. For example, given a loop like:
    Code:
              DO J = 1, M
                DO I = 1, N
                  A(J, I) = A(J, I) * C
                ENDDO
              ENDDO
    loop interchange will transform the loop as if the user had written:
    Code:
              DO I = 1, N
                DO J = 1, M
                  A(J, I) = A(J, I) * C
                ENDDO
              ENDDO
    which can be beneficial when N is larger than the caches, because in Fortran, the elements of an array are stored in memory contiguously by column, and the original loop iterates over rows, potentially creating at each access a cache miss. This optimization applies to all the languages supported by GCC and is not limited to Fortran. To use this code transformation, GCC has to be configured with --with-ppl and --with-cloog to enable the Graphite loop transformation infrastructure.
  • -floop-strip-mine
    Perform loop strip mining transformations on loops. Strip mining splits a loop into two nested loops. The outer loop has strides equal to the strip size and the inner loop has strides of the original loop within a strip. The strip length can be changed using the loop-block-tile-size parameter. For example, given a loop like:
    Code:
              DO I = 1, N
                A(I) = A(I) + C
              ENDDO
    loop strip mining will transform the loop as if the user had written:
    Code:
              DO II = 1, N, 51
                DO I = II, min (II + 50, N)
                  A(I) = A(I) + C
                ENDDO
              ENDDO
    This optimization applies to all the languages supported by GCC and is not limited to Fortran. To use this code transformation, GCC has to be configured with --with-ppl and --with-cloog to enable the Graphite loop transformation infrastructure.
  • -floop-block
    Perform loop blocking transformations on loops. Blocking strip mines each loop in the loop nest such that the memory accesses of the element loops fit inside caches. The strip length can be changed using the loop-block-tile-size parameter. For example, given a loop like:
    Code:
              DO I = 1, N
                DO J = 1, M
                  A(J, I) = B(I) + C(J)
                ENDDO
              ENDDO
    loop blocking will transform the loop as if the user had written:
    Code:
              DO II = 1, N, 51
                DO JJ = 1, M, 51
                  DO I = II, min (II + 50, N)
                    DO J = JJ, min (JJ + 50, M)
                      A(J, I) = B(I) + C(J)
                    ENDDO
                  ENDDO
                ENDDO
              ENDDO
    which can be beneficial when M is larger than the caches, because the innermost loop will iterate over a smaller amount of data that can be kept in the caches. This optimization applies to all the languages supported by GCC and is not limited to Fortran. To use this code transformation, GCC has to be configured with --with-ppl and --with-cloog to enable the Graphite loop transformation infrastructure.
  • -ftree-loop-distribution
    Perform loop distribution. This flag can improve cache performance on big loop bodies and allow further loop optimizations, like parallelization or vectorization, to take place. For example, the loop
    Code:
              DO I = 1, N
                A(I) = B(I) + C
                D(I) = E(I) * F
              ENDDO
    is transformed to
    Code:
              DO I = 1, N
                 A(I) = B(I) + C
              ENDDO
              DO I = 1, N
                 D(I) = E(I) * F
              ENDDO
  • -floop-parallelize-all
    Use the Graphite data dependence analysis to identify loops that can be parallelized. Parallelize all the loops that can be analyzed to not contain loop carried dependences without checking that it is profitable to parallelize the loops.
  • -ftree-parallelize-loops=n
    Parallelize loops, i.e., split their iteration space to run in n threads. This is only possible for loops whose iterations are independent and can be arbitrarily reordered. The optimization is only profitable on multiprocessor machines, for loops that are CPU-intensive, rather than constrained e.g. by memory bandwidth. This option implies -pthread, and thus is only supported on targets that have support for -pthread.
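
For readers more at home in C than Fortran, the strip mining and blocking examples above can be sketched as follows. This is a hand-written illustration, not compiler output: the function names are made up, the strip length of 51 is simply copied from the manual's example, and the matrix is assumed to be stored row-major in a flat array (the opposite of Fortran's column-major layout).

```c
#include <assert.h>
#include <stddef.h>

#define TILE ((size_t)51)  /* assumed strip length, taken from the manual's example */

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Strip-mined form of: for (i = 0; i < n; i++) a[i] += c;
   The outer loop steps one strip at a time, the inner loop walks
   through the elements of that strip. */
void add_strip_mined(double *a, size_t n, double c)
{
    assert(a != NULL || n == 0);
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t i = ii; i < min_size(ii + TILE, n); i++)
            a[i] += c;
}

/* Blocked form of: a[i][j] = b[i] + c[j] for an n x m matrix.
   Both loops are strip-mined and the strip loops are moved outward,
   so the two inner loops only touch a TILE x TILE block at a time. */
void outer_sum_blocked(double *a, const double *b, const double *c,
                       size_t n, size_t m)
{
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t jj = 0; jj < m; jj += TILE)
            for (size_t i = ii; i < min_size(ii + TILE, n); i++)
                for (size_t j = jj; j < min_size(jj + TILE, m); j++)
                    a[i * m + j] = b[i] + c[j];
}
```

Both functions compute exactly what the plain loops would; only the order in which the elements are visited changes.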

_________________
I *do* know that I easily aggravate people due to my condensed writing. Rule of thumb: If I wrote anything that can be understood in two different ways, and one way offends you, then I meant the other! ;)
Yamakuzure
l33t
Joined: 21 Jun 2006
Posts: 951
Location: Bardowick, Germany

Posted: Fri Apr 13, 2012 10:20 am

Etal wrote:
If anything, with LTO you'll save a ton of space, especially with C++ applications - binaries can shrink by more than 50%.

I don't know about performance, though - never tested it. But I can't think of a way LTO could cause it to decrease.
Didn't see this earlier: If you are using LTO, you should be aware of the warning portage gives you after the merge of gcc:
gcc-ebuilds wrote:
* LTO support is still experimental and unstable
* Any bugs resulting from the use of LTO will not be fixed.

depontius
Advocate
Joined: 05 May 2004
Posts: 2156

Posted: Fri Apr 13, 2012 12:28 pm

Pardon me please, but can someone give a simple definition of Graphite?

From what I can tell on this thread, it seems to be a separate set or class of gcc optimizations. It also sounds almost as if it's being separately developed, and then grafted on. I've done only a little searching, but haven't found a concise definition, just some low-level, gritty "you have to know the answer to get the answer" kind of stuff.
_________________
.sigs waste space and bandwidth
Etal
Veteran
Joined: 15 Jul 2005
Posts: 1633

Posted: Fri Apr 13, 2012 1:22 pm

Yamakuzure wrote:
Etal wrote:
If anything, with LTO you'll save a ton of space, especially with C++ applications - binaries can shrink by more than 50%.

I don't know about performance, though - never tested it. But I can't think of a way LTO could cause it to decrease.
Didn't see this earlier: If you are using LTO, you should be aware of the warning portage gives you after the merge of gcc:
gcc-ebuilds wrote:
* LTO support is still experimental and unstable
* Any bugs resulting from the use of LTO will not be fixed.


I know, but I was responding to the poster above me ;)
Yamakuzure
l33t
Joined: 21 Jun 2006
Posts: 951
Location: Bardowick, Germany

Posted: Fri Apr 13, 2012 1:50 pm

depontius wrote:
Pardon me please, but can someone give a simple definition of Graphite?

From what I can tell on this thread, it seems to be a separate set or class of gcc optimizations. It also sounds almost as if it's being separately developed, and then grafted on. I've done only a little searching, but haven't found a concise definition, just some low-level, gritty "you have to know the answer to get the answer" kind of stuff.
The link I posted explains it pretty well.
gcc.gnu.org/wiki/Graphite wrote:
Graphite is a framework for high-level memory optimizations using the polyhedral model.
That says it all, unless you have no clue what "polyhedral model" means. The shortest explanation can be found in wikipedia:
http://en.wikipedia.org/wiki/Polyhedral_model wrote:
The polyhedral model (also called the polytope method) is a mathematical framework for loop nest optimization in program optimization. The polytope method treats each loop iteration within nested loops as lattice points inside mathematical objects called polytopes, performs affine transformations or more general non-affine transformations such as tiling on the polytopes, and then converts the transformed polytopes into equivalent, but optimized (depending on targeted optimization goal), loop nests through polyhedra scanning.
So in short: Graphite enables gcc to optimize loops in a memory friendly way and can (if wanted) split them to be iterated using parallel threads. This (hopefully) optimizes performance by a) parallelization and b) fewer cache misses.
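
A small C sketch may help with the "memory friendly" part. Each execution of the loop body is a point (i, j) in a rectangle; swapping the two loops visits the same points in a different order, which is one of the simplest transformations the polyhedral model can express. The function names below are made up for illustration, and the matrix is assumed to be stored row-major in a flat array, as is usual in C.

```c
#include <assert.h>
#include <stddef.h>

/* Original nest: the inner loop walks down a column of a row-major
   matrix, so consecutive accesses are `cols` elements apart in memory. */
void scale_column_order(double *a, size_t rows, size_t cols, double c)
{
    assert(a != NULL || rows == 0 || cols == 0);
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            a[i * cols + j] *= c;
}

/* Interchanged nest: the same iteration points (i, j), but with the
   loops swapped so the inner loop touches adjacent memory. This is
   only legal because no iteration depends on another. */
void scale_row_order(double *a, size_t rows, size_t cols, double c)
{
    assert(a != NULL || rows == 0 || cols == 0);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            a[i * cols + j] *= c;
}
```

Both versions produce identical results; the second one just walks through memory in order, which tends to mean fewer cache misses on large arrays.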
depontius
Advocate
Joined: 05 May 2004
Posts: 2156

Posted: Fri Apr 13, 2012 5:52 pm

Yamakuzure wrote:
depontius wrote:
Pardon me please, but can someone give a simple definition of Graphite?

From what I can tell on this thread, it seems to be a separate set or class of gcc optimizations. It also sounds almost as if it's being separately developed, and then grafted on. I've done only a little searching, but haven't found a concise definition, just some low-level, gritty "you have to know the answer to get the answer" kind of stuff.
The link I posted explains it pretty well.
gcc.gnu.org/wiki/Graphite wrote:
Graphite is a framework for high-level memory optimizations using the polyhedral model.
That says it all, unless you have no clue what "polyhedral model" means.

That says it all. This is the first time I've ever heard of the "polyhedral model." Of course I know what a polyhedron is. I've never formally taken graph theory, but I'm somewhat familiar with the concept of mapping things into edges, vertices, etc.
Yamakuzure wrote:
The shortest explanation can be found in wikipedia:
http://en.wikipedia.org/wiki/Polyhedral_model wrote:
The polyhedral model (also called the polytope method) is a mathematical framework for loop nest optimization in program optimization. The polytope method treats each loop iteration within nested loops as lattice points inside mathematical objects called polytopes, performs affine transformations or more general non-affine transformations such as tiling on the polytopes, and then converts the transformed polytopes into equivalent, but optimized (depending on targeted optimization goal), loop nests through polyhedra scanning.
So in short: Graphite enables gcc to optimize loops in a memory friendly way and can (if wanted) split them to be iterated using parallel threads. This (hopefully) optimizes performance by a) parallelization and b) fewer cache misses.

This is also the first time I've ever heard the word "polytope". I haven't hit your second link yet, but once I see the word "tiling" it sounds almost as if you're using the faces of the polyhedron as well as the edges and vertices.

At the very least, I have some interesting links to follow.
Apheus
Apprentice
Joined: 12 Jul 2008
Posts: 182

Posted: Mon Jun 11, 2012 8:36 pm

I have found that rekonq (and konqueror with kwebkitpart) crash on most javascript-using websites if qt-webkit is compiled with the graphite flags.
All times are GMT