PENTIUM 4 CFLAGS (UNITE ? )

Having problems with the Gentoo Handbook? If you're still working your way through it, or just need some info before you start your install, this is the place. All other questions go elsewhere.

pantoffel
n00b
Posts: 19
Joined: Thu Jan 30, 2003 6:54 pm

Post by pantoffel » Wed Sep 24, 2003 4:19 pm

Hi,

I'm a Pentium 4 user who still isn't sure which CFLAGS to use for *good* performance. You too?

A while ago I installed Mandrake on a friend's computer, which has exactly the same specs as mine. His Mandrake i586 install was, and still is, way faster than my Gentoo system (that's where the doubt started). You can notice it in the way GNOME 2.4 starts and the way Mozilla starts; everything just feels very smooth and fast.

Anyway, I searched through the forums but didn't really find what I was looking for: a page where Pentium 4 users could post their CFLAGS and system specs, along with their experience with those CFLAGS. I don't have much experience myself, but here are mine.

PS: How come Mandrake i586 runs faster than my Gentoo? :x


Code: Select all

sh-2.05b$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 4
cpu MHz         : 2417.773
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4823.44
 
sh-2.05b$

Code: Select all

gentoo:/home/pantoffel# hdparm -tT /dev/hda
 
/dev/hda:
 Timing buffer-cache reads:   1452 MB in  2.00 seconds = 726.00 MB/sec
 Timing buffered disk reads:  136 MB in  3.02 seconds =  45.03 MB/sec

Code: Select all

gentoo:/home/pantoffel# lspci -v
00:00.0 Host bridge: Silicon Integrated Systems [SiS] SiS645 Host & Memory & AGP Controller (rev 01)
        Flags: bus master, medium devsel, latency 32
        Memory at e0000000 (32-bit, non-prefetchable) [size=64M]
        Capabilities: [c0] AGP version 2.0
 
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] SiS 530 Virtual PCI-to-PCI bridge (AGP) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 64
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: dde00000-dfefffff
        Prefetchable memory behind bridge: cdb00000-ddcfffff
 
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] 85C503/5513
        Flags: bus master, medium devsel, latency 0
 
00:02.2 USB Controller: Silicon Integrated Systems [SiS] SiS7001 USB Controller (rev 07) (prog-if 10 [OHCI])
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 5470
        Flags: bus master, medium devsel, latency 64, IRQ 10
        Memory at dfffe000 (32-bit, non-prefetchable) [size=4K]
 
00:02.3 USB Controller: Silicon Integrated Systems [SiS] SiS7001 USB Controller (rev 07) (prog-if 10 [OHCI])
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 5470
        Flags: bus master, medium devsel, latency 64, IRQ 11
        Memory at dffff000 (32-bit, non-prefetchable) [size=4K]
 
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 5470
        Flags: bus master, fast devsel, latency 128
        I/O ports at ff00 [size=16]
 
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] SiS7012 PCI Audio Accelerator (rev a0)
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 5470
        Flags: bus master, medium devsel, latency 64, IRQ 10
        I/O ports at dc00 [size=256]
        I/O ports at d800 [size=128]
        Capabilities: [48] Power Management version 2
 
00:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RT8139
        Flags: bus master, medium devsel, latency 64, IRQ 10
        I/O ports at d400 [size=256]
        Memory at dfffdf00 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2
 
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3) (prog-if 00 [VGA])
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 8771
        Flags: bus master, 66Mhz, medium devsel, latency 248, IRQ 11
        Memory at de000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (32-bit, prefetchable) [size=128M]
        Memory at ddc80000 (32-bit, prefetchable) [size=512K]
        Expansion ROM at dfee0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 2
        Capabilities: [44] AGP version 2.0
 
gentoo:/home/pantoffel#





My make.conf

Code: Select all


# Copyright 2000-2003 Daniel Robbins, Gentoo Technologies, Inc.
# Contains local system settings for Portage system
# $Header: /home/cvsroot/gentoo-src/portage/cnf/make.conf,v 1.67 2003/08/21 01:01:26 carpaski Exp $

# Please review 'man make.conf' for more information.

# Build-time functionality
# ========================
#
# The USE variable is used to enable optional build-time functionality. For
# example, quite a few packages have optional X, gtk or GNOME functionality
# that can only be enabled or disabled at compile-time. Gentoo Linux has a
# very extensive set of USE variables described in our USE variable HOWTO at
# http://www.gentoo.org/doc/use-howto.html
#
# The available list of use flags with descriptions is in your portage tree.
# Use 'less' to view them:  --> less /usr/portage/profiles/use.desc <--
#
# 'ufed' is an ncurses/dialog interface available in portage to make handling
# useflags for you. 'emerge app-admin/ufed'
#
# Example:
#USE="X gtk -gnome alsa"
USE="acpi cdr cscope doc dv dvd emacs emacs-w3 evo faad fbcon flash gb gd gtk2 icc imagemagick ipv6 jack jikes lcms leim moznoxft pic sse tiff usb v4l videos oss -arts -kde -qt X alsa avi aalib bonobo crypt cups esd encode fbcon flash gif imlib jack java jikes jpeg lcms leim mad motif mozilla mpeg ncurses nls opengl oggvorbis pam pdflib png perl python readline sdl spell svga tcltk tcpd tetex truetype usb xml2 xmms xv zlib mozsvg sse gnome"

# Host Setting
# ============
#
# If you are using a Pentium Pro or greater processor, leave this line as-is;
# otherwise, change to i586, i486 or i386 as appropriate. All modern systems
# (even Athlons) should use "i686-pc-linux-gnu". All K6's are i586.
#
CHOST="i686-pc-linux-gnu"

# Host and optimization settings 
# ==============================
#
# For optimal performance, enable a CFLAGS setting appropriate for your CPU.
#
# Please note that if you experience strange issues with a package, it may be
# due to gcc's optimizations interacting in a strange way. Please test the
# package (and in some cases the libraries it uses) at default optimizations
# before reporting errors to developers.
#
# -mcpu=<cpu-type> means optimize code for the particular type of CPU without
# breaking compatibility with other CPUs.
#
# -march=<cpu-type> means to take full advantage of the ABI and instructions
# for the particular CPU; this will break compatibility with older CPUs (for
# example, -march=athlon-xp code will not run on a regular Athlon, and
# -march=i686 code will not run on a Pentium Classic.
#
# CPU types supported in gcc-3.2 and higher: athlon-xp, athlon-mp,
# athlon-tbird, athlon, k6, k6-2, k6-3, i386, i486, i586 (Pentium), i686
# (PentiumPro), pentium, pentium-mmx, pentiumpro, pentium2 (Celeron), pentium3.
# Note that Gentoo Linux 1.4 and higher include at least gcc-3.2.
# 
# CPU types supported in gcc-2.95*: k6, i386, i486, i586 (Pentium), i686
# (Pentium Pro), pentium, pentiumpro Gentoo Linux 1.2 and below use gcc-2.95*
#
# CRITICAL WARNINGS: ****************************************************** #
# ATHLON-4 will generate invalid SSE  instructions; use 'athlon'   instead. #
# PENTIUM4 will generate invalid SSE2 instructions; use 'pentium3' instead. #
# K6 markings are deceptive. Avoid setting -march for them. See Bug #24379. #
# ************************************************************************* #
#
# Decent examples:
#
#CFLAGS="-mcpu=athlon-xp -O3 -pipe"

# This is what I Used to compile the bootstrap
# These CFLAGS are used on the p4 livecd. 
# (only changed from -O3 to   -O2)

CFLAGS="-O2 -mcpu=i686 -funroll-loops -pipe"


# This is what I'm using after the bootstrap!
# I'm running ~x86 , so i got gcc-3.3.1, 
# this version should fix the invalid SSE2 instructions.

#CFLAGS="-march=pentium4 -mmmx -msse -msse2 -mfpmath=sse -O2 -fomit-frame-pointer -pipe"


# If you set a CFLAGS above, then this line will set your default C++ flags to
# the same settings.
CXXFLAGS="${CFLAGS}"

# Advanced Masking
# ================
#
# Gentoo is using a new masking system to allow for easier stability testing
# on packages. KEYWORDS are used in ebuilds to mask and unmask packages based
# on the platform they are set for. A special form has been added that
# indicates packages and revisions that are expected to work, but have not yet
# been approved for the stable set. '~arch' is a superset of 'arch' which
# includes the unstable, in testing, packages. Users of the 'x86' architecture
# would add '~x86' to ACCEPT_KEYWORDS to enable unstable/testing packages.
# '~ppc', '~sparc' are the unstable KEYWORDS for their respective platforms.
# DO NOT PUT ANYTHING BUT YOUR SPECIFIC ~ARCHITECTURE IN THE LIST.
# IF YOU ARE UNSURE OF YOUR ARCH, OR THE IMPLICATIONS, DO NOT MODIFY THIS.
#
ACCEPT_KEYWORDS="~x86"

# Portage Directories
# ===================
#
# Each of these settings controls an aspect of portage's storage and file
# system usage. If you change any of these, be sure it is available when
# you try to use portage. *** DO NOT INCLUDE A TRAILING "/" ***
#
# PORTAGE_TMPDIR is the location portage will use for compilations and
#     temporary storage of data. This can get VERY large depending upon
#     the application being installed.
#PORTAGE_TMPDIR=/var/tmp
#
# PORTDIR is the location of the portage tree. This is the repository
#     for all profile information as well as all ebuilds. This directory
#     itself can reach 200M. WE DO NOT RECOMMEND that you change this.
#PORTDIR=/usr/portage
#
# DISTDIR is where all of the source code tarballs will be placed for
#     emerges. The source code is maintained here unless you delete
#     it. The entire repository of tarballs for gentoo is 9G. This is
#     considerably more than any user will ever download. 2-3G is
#     a large DISTDIR.
#DISTDIR=${PORTDIR}/distfiles
#
# PKGDIR is the location of binary packages that you can have created
#     with '--buildpkg' or '-b' while emerging a package. This can get
#     upto several hundred megs, or even a few gigs.
#PKGDIR=${PORTDIR}/packages
#
# PORT_LOGDIR is the location where portage will store all the logs it
#     creates from each individual merge. They are stored as YYMMDD-$PF.log
#     in the directory specified. This is disabled until you enable it by
#     providing a directory. Permissions will be modified as needed IF the
#     directory exists, otherwise logging will be disabled.
#PORT_LOGDIR=/var/log/portage
#
# PORTDIR_OVERLAY is a directory where local ebuilds may be stored without
#     concern that they will be deleted by rsync updates. Default is not
#     defined.
PORTDIR_OVERLAY=/usr/local/portage

# Fetching files 
# ==============
#
# If you need to set a proxy for wget or lukemftp, add the appropriate "export
# ftp_proxy=<proxy>" and "export http_proxy=<proxy>" lines to /etc/profile if
# all users on your system should use them.
#
# Portage uses wget by default. Here are some settings for some alternate
# downloaders -- note that you need to merge these programs first before they
# will be available.
#
# Default fetch command (5 tries, passive ftp for firewall compatibility)
#FETCHCOMMAND="/usr/bin/wget -t 5 --passive-ftp \${URI} -P \${DISTDIR}"
#RESUMECOMMAND="/usr/bin/wget -c -t 5 --passive-ftp \${URI} -P \${DISTDIR}"
#
# Using wget, ratelimiting downloads
#FETCHCOMMAND="/usr/bin/wget -t 5 --passive-ftp --limit-rate=200k \${URI} -P \${DISTDIR}"
#RESUMECOMMAND="/usr/bin/wget -c -t 5 --passive-ftp --limit-rate=200k \${URI} -P \${DISTDIR}"
#
# Lukemftp (BSD ftp):
#FETCHCOMMAND="/usr/bin/lukemftp -s -a -o \${DISTDIR}/\${FILE} \${URI}"
#RESUMECOMMAND="/usr/bin/lukemftp -s -a -R -o \${DISTDIR}/\${FILE} \${URI}"
#
# Prozilla (turbo downloader)
#FETCHCOMMAND='/usr/bin/proz --no-getch -s ${URI} -P ${DISTDIR}'
#
# Portage uses GENTOO_MIRRORS to specify mirrors to use for source retrieval.
# The list is a space seperated list which is read left to right. If you use
# another mirror we highly recommend leaving the default mirror at the end of
# the list so that portage will fall back to it if the files cannot be found
# on your specified mirror. We _HIGHLY_ recommend that you change this setting
# to a nearby mirror by merging and using the 'mirrorselect' tool.
#GENTOO_MIRRORS="<your_mirror_here> http://gentoo.oregonstate.edu http://www.ibiblio.org/pub/Linux/distributions/gentoo"
#
# Portage uses PORTAGE_BINHOST to specify mirrors for prebuilt-binary packages.
# The list is a single extry specifying the full address of the directory
# serving the tbz2's for your system. Running emerge with either '--getbinpkg'
# or '--getbinpkgonly' will cause portage to retrieve the metadata from all
# packages in the directory specified, and use that data to determine what will
# be downloaded and merged. '-g' or '-gK' are the recommend parameters. Please
# consult the man pages and 'emerge --help' for more information.
#PORTAGE_BINHOST="ftp://login:pass@grp.mirror.site/pub/grp/i686/athlon-xp/"
#PORTAGE_BINHOST="http://grp.mirror.site/gentoo/grp/1.4/i686/athlon-xp/"

# Synchronizing Portage
# =====================
#
# Each of these settings effects how Gentoo synchronizes your Portage tree.
# Synchronization is handled by rsync and these settings allow some control
# over how it is done.
#
#
# SYNC is the server used by rsync to retrieve a localized rsync mirror
#     rotation. This allows you to select servers that are geographically
#     close to you, yet still distribute the load over a number of servers.
#     Please do not single out specific rsync mirrors. Doing so places undue
#     stress on particular mirrors.  Instead you may use one of the following
#     continent specific rotations:
#
#   Default:       "rsync://rsync.gentoo.org/gentoo-portage"
#   North America: "rsync://rsync.namerica.gentoo.org/gentoo-portage"
#   South America: "rsync://rsync.samerica.gentoo.org/gentoo-portage"
#   Europe:        "rsync://rsync.europe.gentoo.org/gentoo-portage"
#   Asia:          "rsync://rsync.asia.gentoo.org/gentoo-portage"
#   Australia:     "rsync://rsync.au.gentoo.org/gentoo-portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
#
# RSYNC_RETRIES sets the number of times portage will attempt to retrieve
#     a current portage tree before it exits with an error. This allows
#     for a more successful retrieval without user intervention most times.
#RSYNC_RETRIES="3"
#
# RSYNC_TIMEOUT sets the length of time rsync will wait before it times out
#     on a connection. Most users will benefit from this setting as it will
#     reduce the amount of 'dead air' they experience when they run across
#     the occasional, unreachable mirror. Dialup users might want to set this
#     value up around the 300 second mark.
#RSYNC_TIMEOUT=180

# Advanced Features
# =================
#
# MAKEOPTS provides extra options that may be passed to 'make' when a
#     program is compiled. Presently the only use is for specifying
#     the number of parallel makes (-j) to perform. The suggested number
#     for parallel makes is CPUs+1.
#MAKEOPTS="-j2"
#
# PORTAGE_NICENESS provides a default increment to emerge's niceness level.
#     Note: This is an increment. Running emerge in a niced environment will
#     reduce it further. Default is unset.
#PORTAGE_NICENESS=3
#
# AUTOCLEAN enables portage to automatically clean out older or overlapping
#     packages from the system after every successful merge. This is the
#     same as running 'emerge -c' after every merge. Set with: "yes" or "no".
#     This does not affect the unpacked source. See 'noclean' below.
#AUTOCLEAN="yes"
#
# FEATURES are settings that affect the functionality of portage. Most of
#     these settings are for developer use, but some are available to non-
#     developers as well. 
#
#  'autoaddcvs'  causes portage to automatically try to add files to cvs
#                that will have to be added later. Done at generation times
#                and only has an effect when 'cvs' is also set.
#  'buildpkg'    causes binary packages to be created of all packages that 
#                are merged.
#  'ccache'      enables ccache support via CC.
#  'cvs'         feature for developers that causes portage to enable all
#                cvs features (commits, adds) and all USE flags in SRC_URI
#                will be applied for digests.
#  'digest'      autogenerate a digest for packages.
#  'distcc'      enables distcc support via CC.
#  'fixpackages' allows portage to fix binary packages that are stored in
#                PKGDIR. This can consume a lot of time. 'fixpackages' is
#                also a script that can be run at any given time to force
#                the same actions.
#  'keeptemp'    prevents the clean phase from deleting the temp files ($T) 
#                from a merge.
#  'keepwork'    prevents the clean phase from deleting the WORKDIR.
#  'noauto'      causes ebuild to perform only the action requested and 
#                not any other required actions like clean or
#  'noclean'     prevents portage from removing the source and temporary files 
#                after a merge -- for debugging purposes only. 
#  'nostrip'     prevents stripping of binaries.
#  'notitles'    disables xterm titlebar updates (which contain status info). 
#  'sandbox'     enable sandbox-ing when running emerge and ebuild
#  'strict'      causes portage to react strongly to conditions that
#                have the potential to be dangerous -- like missing or
#                incorrect Manifest files.
#  'userpriv'    allows portage to drop root privleges while it is compiling
#                as a security measure, and as a side effect this can remove 
#                sandbox access violations for users. 
#  'usersandbox' enables sandboxing while portage is running under userpriv.
#                unpack -- for debugging purposes only.
#FEATURES="sandbox buildpkg ccache distcc userpriv usersandbox notitles noclean noauto cvs keeptemp keepwork"
#
# CCACHE_SIZE sets the space use limitations for ccache. The default size is
#     2G, and will be set if not defined otherwise and ccache is in features. 
#     Portage will set the default ccache dir if it is not present in the
#     user's environment, for userpriv it sets: ${PORTAGE_TMPDIR}/ccache
#     (/var/tmp/ccache), and for regular use the default is /root/.ccache.
#     Sizes are specified with 'G' 'M' or 'K'.
#     '4G' for 4 gigabytes, '4096M' for 4 gigabytes, etc... Default is 2G
#CCACHE_SIZE="2G"
#
# RSYNC_EXCLUDEFROM is a file that portage will pass to rsync when it updates
#     the portage tree. Specific chucks of the tree may be excluded from
#     consideration. This may cause dependency failures if you are not careful.
#     The file format is one pattern per line, blanks and ';' or '#' lines are
#     comments. See 'man rsync' for more details on the exclude-from format.
#RSYNC_EXCLUDEFROM=/etc/portage/rsync_excludes

pkxl2
n00b
Posts: 15
Joined: Tue Nov 04, 2003 6:30 pm
Location: Germany

Post by pkxl2 » Wed Nov 26, 2003 4:05 pm

How about

Code: Select all

CFLAGS="-O3 -march=pentium4 -pipe" 
That will build your stuff optimized for your processor (worked fine for me).
AeonFlux - AMD 1800+XP 512 RAM - gs-2.4.21
Shakira - Sony Vaio P4M 1800 512 RAM - 2.6.6-rc3-love
bong - AMD 900 768 RAM - 2.4.26-rc1
wau - Alpha MIATA PW600AU 1,5GB RAM - 2.4.26-rc1

dabooty
Guru
Posts: 482
Joined: Thu May 15, 2003 3:21 pm
Location: Belgium

Post by dabooty » Wed Nov 26, 2003 6:16 pm

Why are you using -mcpu=i686?
It tunes the code for i686 without breaking compatibility with older CPUs. I'm pretty sure you don't need that compatibility (unless you want to share binaries with friends).
You should definitely use -march=....

Depending on your gcc version you should be using pentium3 or pentium4.
gcc 3.2 had a bug in -march=pentium4, so for that gcc you should use -march=pentium3.

If you've already upgraded to gcc 3.3 you can safely use -march=pentium4.
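
A minimal shell sketch of that version rule, for illustration only. The function name pick_march is made up here, and it assumes a plain numeric "major.minor" style version string such as the one gcc -dumpversion prints. The pentium3 fallback follows the warning in the make.conf comments earlier in the thread, and i686 is used for gcc 2.95 because that file lists no pentium3 type for it.

```shell
# Hypothetical helper: map a GCC version string to a -march value
# following the advice in this thread. Not part of Portage or GCC.
pick_march() {
    ver=$1
    major=${ver%%.*}      # text before the first dot
    rest=${ver#*.}        # text after the first dot
    minor=${rest%%.*}     # second version component
    if [ "$major" -lt 3 ]; then
        echo i686         # gcc 2.95: no pentium3/pentium4 types listed
    elif [ "$major" -eq 3 ] && [ "$minor" -lt 3 ]; then
        echo pentium3     # gcc 3.0 to 3.2: avoid -march=pentium4
    else
        echo pentium4     # gcc 3.3 and newer
    fi
}

pick_march 3.2.3   # prints pentium3
pick_march 3.3.1   # prints pentium4
```

Run it from a shell as pick_march "$(gcc -dumpversion)" and copy the result into make.conf by hand.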
registered user #284425
get yourself counted
http://counter.li.org
------
#emerge -pv solves a lot of questions beforehand

s3ntient
Guru
Posts: 304
Joined: Sun Apr 13, 2003 9:24 pm
Location: Lyon, France

Post by s3ntient » Wed Nov 26, 2003 6:40 pm

Here are my CFLAGS:

Code: Select all

CFLAGS="-march=pentium4 -O2 -funroll-loops -pipe -mfpmath=sse,387 -ffast-math -msse2 -mmmx -fomit-frame-pointer"
Works like a charm :)
http://blog.chaostrophy.org

Jazz
Guru
Posts: 543
Joined: Sun Nov 16, 2003 10:50 pm
Location: Melbourne, Australia

Post by Jazz » Wed Nov 26, 2003 8:29 pm

Man, I'm so glad I read this topic!

I also had a hard time deciding on flags, and as I read in some thread, simple is better, so I just use minimal flags.

I use "-march=pentium4 -Os -pipe -msse -msse2 -mmmx"

I used the -fomit-frame-pointer flag earlier but found that it increased the size of my binaries. For example, when compiled with -fomit-frame-pointer my AbiWord 2.0.1 binary was 4.2 MB, but after removing that flag it dropped to 3.5 MB. Now that's a whopping change, isn't it?

I also have a P4 2.4 (533) with 512 DDR (333). Would anyone like to give me some suggestions? I'm going to reinstall everything in a week, right from stage 1, and I plan to do it with gcc 3.3. Right now I'm using gcc 3.2.

Could someone guide me on the flags to use during bootstrap and while emerging the normal system?

Also, I would like to know what the flags "-mfpmath=sse,387 -ffast-math" actually do and why they are needed. Do they also increase the binary size significantly?

Please help.

Bye,
Jassi

fidler
Apprentice
Posts: 162
Joined: Wed Jul 03, 2002 6:55 pm
Location: Utah

Post by fidler » Wed Nov 26, 2003 8:36 pm

Jazz wrote: Also, I would like to know what the flags "-mfpmath=sse,387 -ffast-math" actually do and why they are needed. Do they also increase the binary size significantly?
-mfpmath=sse,387 makes programs use both the SSE unit and the 387 math coprocessor for floating-point math.

Anyhow, personally I found that the choice of kernel actually makes the biggest difference in performance. I use the GRP kernel.

Cossins
Veteran
Posts: 1135
Joined: Fri Mar 21, 2003 4:03 pm
Location: Copenhagen, Denmark

Post by Cossins » Wed Nov 26, 2003 8:55 pm

Do NOT use -mfpmath=sse,387! It is beta code and performs much worse than plain -mfpmath=sse.

My CFLAGS:

Code: Select all

-O2 -march=pentium4 -mcpu=pentium4 -pipe -fomit-frame-pointer -msse -msse2 -mmmx -mfpmath=sse -fPIC
-ffast-math causes calculations to be less precise (bad for most things), and the performance gain is minimal. I have considered trying -Os; some people have reported better performance with that than with -O2. I used to use -O3, but I realised that the extra performance gained was minimal compared to the compile time saved with -O2.

BTW, it's great to see a CFLAGS thread for Pentium 4. The CFLAGS Central thread has become very hard to find your way around, being more than 20 pages long, and it is also fairer to have separate threads for each processor. So please don't mark this as a dupe.

- Simon

dabooty
Guru
Posts: 482
Joined: Thu May 15, 2003 3:21 pm
Location: Belgium

Post by dabooty » Wed Nov 26, 2003 9:22 pm

Cossins wrote: My CFLAGS:

Code: Select all

-O2 -march=pentium4 -mcpu=pentium4 -pipe -fomit-frame-pointer -msse -msse2 -mmmx -mfpmath=sse -fPIC
Correct me if I'm wrong, but I thought -fPIC was only for libraries, and it's something the ebuild/makefile should add when appropriate
registered user #284425
get yourself counted
http://counter.li.org
------
#emerge -pv solves a lot of questions beforehand

s3ntient
Guru
Posts: 304
Joined: Sun Apr 13, 2003 9:24 pm
Location: Lyon, France

Post by s3ntient » Wed Nov 26, 2003 9:58 pm

Cossins wrote: -ffast-math causes calculations to be less precise (bad for most things), and the performance gain is minimal.
I read somewhere on this forum that the loss of precision in calculations only really affects scientific applications which require high precision. Is this not so?
http://blog.chaostrophy.org

s3ntient
Guru
Posts: 304
Joined: Sun Apr 13, 2003 9:24 pm
Location: Lyon, France

Post by s3ntient » Wed Nov 26, 2003 10:50 pm

-mfpmath=unit
Generate floating point arithmetics for selected unit unit. The choices for unit are:

387
Use the standard 387 floating point coprocessor present on the majority of chips and emulated otherwise. Code compiled with this option will run almost everywhere. The temporary results are computed in 80-bit precision instead of the precision specified by the type, resulting in slightly different results compared to most other chips. See -ffloat-store for a more detailed description.

This is the default choice for the i386 compiler.

sse
Use scalar floating point instructions present in the SSE instruction set. This instruction set is supported by Pentium3 and newer chips, and in the AMD line by Athlon-4, Athlon-xp and Athlon-mp chips. The earlier version of the SSE instruction set supports only single precision arithmetics, thus the double and extended precision arithmetics is still done using 387. The later version, present only in Pentium4 and the future AMD x86-64 chips, supports double precision arithmetics too.

For the i386 compiler, you need to use the -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default.

The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80-bit.

This is the default choice for the x86-64 compiler.

sse,387
Attempt to utilize both instruction sets at once. This effectively doubles the number of available registers and, on chips with separate execution units for 387 and SSE, the execution resources too. Use this option with care, as it is still experimental, because the gcc register allocator does not model separate functional units well, resulting in unstable performance.
(From the GCC manual)
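
The requirement in the excerpt, that -mfpmath=sse only takes effect when SSE is enabled, can be checked mechanically. What follows is a rough sketch, not a GCC or Portage tool: the helper name fpmath_sse_ok is made up, it treats only the -march values the excerpt names (pentium3, pentium4, athlon-4, athlon-xp, athlon-mp) as SSE-capable, and it ignores negative switches such as -mno-sse.

```shell
# Hypothetical helper: succeed unless a CFLAGS string asks for SSE
# floating point math without anything that enables SSE.
fpmath_sse_ok() {
    wants_sse=no
    has_sse=no
    for flag in $1; do
        case $flag in
            -mfpmath=sse|-mfpmath=sse,387) wants_sse=yes ;;
            -msse|-msse2) has_sse=yes ;;
            -march=pentium3|-march=pentium4) has_sse=yes ;;
            -march=athlon-4|-march=athlon-xp|-march=athlon-mp) has_sse=yes ;;
        esac
    done
    # Fine if SSE math was never requested, or if SSE is enabled.
    [ "$wants_sse" = no ] || [ "$has_sse" = yes ]
}

if fpmath_sse_ok "-O2 -mcpu=pentium4 -mfpmath=sse"; then
    echo "ok"
else
    echo "-mfpmath=sse is set but nothing enables SSE"
fi
```

Here -mcpu=pentium4 only affects tuning, so the example takes the second branch.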

I've had no problems with it...
I highly recommend that people who want to know a lot more about CFLAGS read through this thread, but be warned: it is long (25 pages), yet a very informative read...
http://forums.gentoo.org/viewtopic.php?t=5717
http://blog.chaostrophy.org

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Tue Jan 13, 2004 7:01 am

These are my flags for my P4:

Code: Select all

CFLAGS="-pipe -O2 -fomit-frame-pointer -mcpu=pentium4 -march=pentium4 -mfpmath=sse"
* -O2 is a happy medium between compilation time, performance and compilation reliability. -O3 is a little too hardcore for my tastes, and might even break code if the compiler trips on a bug.

* I include -fomit-frame-pointer because it's tough to debug stripped code anyway (Portage strips all binaries by default), and this gives a small performance boost to most apps.

* I include -mcpu=pentium4 even though it's redundant to -march=pentium4 because the latter is often filtered out in sensitive ebuilds. -mcpu=pentium4 gives a nice performance boost in desktop responsiveness due to better scheduling, and -march=pentium4 (which implies SSE, SSE2 and MMX) gives a small boost in certain situations (but it can cause compilation errors with some code).

* -mfpmath=sse should make scalar floating point math much, much faster according to the GCC docs, but people disagree. In any case, it's harmless.
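
The point about filtering can be illustrated with a small shell sketch. It is only a stand-in for what a sensitive ebuild does, not Portage's real implementation, and the function name strip_march is made up for this example.

```shell
# Hypothetical stand-in for an ebuild that filters -march out of CFLAGS.
strip_march() {
    out=""
    for flag in $1; do
        case $flag in
            -march=*) ;;                          # drop the architecture flag
            *) out="${out:+$out }$flag" ;;        # keep everything else
        esac
    done
    echo "$out"
}

strip_march "-pipe -O2 -fomit-frame-pointer -mcpu=pentium4 -march=pentium4 -mfpmath=sse"
# prints: -pipe -O2 -fomit-frame-pointer -mcpu=pentium4 -mfpmath=sse
```

The -mcpu=pentium4 tuning survives the filter, which is the reason for listing it separately. Note that -mfpmath=sse is then left without anything that enables SSE, which by the manual excerpt quoted earlier in the thread makes it ineffective for that package.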

Here's an interesting source:

An Evolutionary Analysis of GNU C Optimizations: Using Natural Selection to Investigate Software Complexities
http://www.coyotegulch.com/acovea/

The testing was done using specific algorithms that had scheduling priority. Compare with the GCC docs.
Personal overlay | Simple backup scheme

taskara
Advocate
Posts: 3762
Joined: Wed Apr 10, 2002 11:38 pm
Location: Australia

Post by taskara » Fri Apr 30, 2004 2:08 am

I'm just about to re-build my system with 2004.1 stage1, and was thinking of the following settings:

-march=pentium4 -O2 -pipe -fomit-frame-pointer -mfpmath=sse

on a gcc 3.4 system with nptl and 2.6 headers and reiser4

I don't bother with -msse, -msse2, -mmmx etc. because they are implied by -march=pentium4.
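
Since -march=pentium4 turns those instruction sets on implicitly, it is worth confirming that the CPU really reports them before relying on it. A small sketch, assuming a made-up helper named has_cpu_flags and using the flags line from the /proc/cpuinfo output in the first post:

```shell
# Hypothetical helper: succeed if every named feature appears in a
# /proc/cpuinfo-style "flags" string (first argument).
has_cpu_flags() {
    flags=" $1 "          # pad so whole-word matching works at both ends
    shift
    for want in "$@"; do
        case $flags in
            *" $want "*) ;;       # feature present, keep checking
            *) return 1 ;;        # feature missing
        esac
    done
    return 0
}

# Flags line copied from the first post in this thread.
p4_flags="fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm"

if has_cpu_flags "$p4_flags" mmx sse sse2; then
    echo "mmx, sse and sse2 are all present"
fi
```

On a live system the first argument could come from grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 instead of the pasted string.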
Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer!

genix
n00b
Posts: 28
Joined: Tue Mar 02, 2004 9:41 am

Post by genix » Fri Apr 30, 2004 4:23 am

Here are my CFLAGS. I'm sticking to them after playing around with various settings, and this is the best set in my opinion: fast binaries that are optimised without extra code size. They work fast, load fast and take up less RAM than the other people's CFLAGS mentioned above.

-O3 -march=pentium4 -fmove-all-movables -fforce-addr -ffast-math -fprefetch-loop-arrays -ftracer -fomit-frame-pointer -mno-align-string-ops -mno-push-args -pipe -w

I derived the above CFLAGS after much experimentation, using Acovea and consultation with other Gentoo users. -ffast-math really speeds things up but breaks compliance with some floating-point standards (don't use it if you build rockets that NASA fires into space; otherwise it is fine). -funroll-loops does not always work and the gain is minimal, so don't bother. Try my CFLAGS and leave some comments here. I am a happy user of my CFLAGS and all programs compile with no problems with them, so give them a try.

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Fri Apr 30, 2004 5:04 am

http://forums.gentoo.org/viewtopic.php?t=157108
Personal overlay | Simple backup scheme

taskara
Advocate
Posts: 3762
Joined: Wed Apr 10, 2002 11:38 pm
Location: Australia

Post by taskara » Fri Apr 30, 2004 5:26 am

Hypnos wrote:http://forums.gentoo.org/viewtopic.php?t=157108
how does Acovea work?
Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer!

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Fri Apr 30, 2004 5:29 am

taskara wrote:
Hypnos wrote:http://forums.gentoo.org/viewtopic.php?t=157108
how does Acovea work?
It's an evolutionary simulator, taking the CFLAGS as genetic traits.
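
To make "CFLAGS as genetic traits" a bit more concrete, here is a toy sketch of the encoding and the two usual genetic operators. It is not Acovea's actual code. The function names, the four-flag pool (taken from flags mentioned in this thread) and the fixed base "-O2 -march=pentium4" are all assumptions for illustration. The fitness step, where a real tool would compile a benchmark with each candidate and time it, is left out.

```sh
#!/bin/sh
# Toy illustration only: one bit per optional flag, "1" means the flag is on.
FLAG_POOL="-fomit-frame-pointer -ftracer -funroll-loops -fprefetch-loop-arrays"

# Turn a genome such as "1001" into a CFLAGS string.
genome_to_cflags() {
    genome=$1
    out="-O2 -march=pentium4"
    i=1
    for flag in $FLAG_POOL; do
        bit=$(printf '%s\n' "$genome" | cut -c "$i")
        if [ "$bit" = "1" ]; then
            out="$out $flag"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Single-point crossover: the first $3 bits come from parent $1, the rest
# from parent $2. $3 must be between 1 and (genome length - 1).
crossover() {
    left=$(printf '%s\n' "$1" | cut -c "1-$3")
    right=$(printf '%s\n' "$2" | cut -c "$(($3 + 1))-")
    printf '%s%s\n' "$left" "$right"
}

# Mutation: flip the bit at 1-based position $2 of genome $1.
mutate() {
    printf '%s\n' "$1" | awk -v pos="$2" '{
        bit = substr($0, pos, 1)
        print substr($0, 1, pos - 1) (1 - bit) substr($0, pos + 1)
    }'
}

# Breed one child from two parents, mutate it, and show its CFLAGS.
child=$(crossover 1100 0011 2)
child=$(mutate "$child" 2)
genome_to_cflags "$child"
```

A real run repeats this over many generations, keeping the candidates whose benchmark runs fastest, which is why such a search can take days.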

taskara
Advocate
Posts: 3762
Joined: Wed Apr 10, 2002 11:38 pm
Location: Australia

Post by taskara » Fri Apr 30, 2004 5:31 am

I see.. yes, I read a few articles on it.. cheers

mudrii
l33t
Posts: 789
Joined: Thu Jun 26, 2003 12:27 am
Location: Singapore

Post by mudrii » Fri Apr 30, 2004 5:32 am

Very simple: compile it and run it. It also comes with some documentation.
www.gentoo.ro

Angel666
n00b
Posts: 45
Joined: Fri Nov 14, 2003 2:21 am
Location: Palo Alto, CA

Post by Angel666 » Sun May 02, 2004 7:40 pm

I just want to add my own CFLAGS for GCC 3.4:

Code: Select all

CFLAGS="-O2 -Wall -pipe -march=pentium4 -fno-if-conversion2 -finline-functions -funit-at-a-time -fpeel-loops -mno-push-args -maccumulate-outgoing-args -falign-functions -finline-limit=600 -fno-crossjumping -fno-omit-frame-pointer -minline-all-stringops -ftracer -funswitch-loops -fmove-all-movables -fno-defer-pop -frename-registers -fno-delayed-branch -freduce-all-givs"
Yes, they look massive, but I used Acovea and they really have sped up my system noticeably!
"One World, One web, One program" - Microsoft Promo ad.
"Ein Volk, Ein Reich, Ein Fuhrer" - Adolf Hitler

taskara
Advocate
Posts: 3762
Joined: Wed Apr 10, 2002 11:38 pm
Location: Australia

Post by taskara » Mon May 03, 2004 1:02 am

Angel666 wrote:I just want to add my own CFLAGS for GCC 3.4:

Code: Select all

CFLAGS="-O2 -Wall -pipe -march=pentium4 -fno-if-conversion2 -finline-functions -funit-at-a-time -fpeel-loops -mno-push-args -maccumulate-outgoing-args -falign-functions -finline-limit=600 -fno-crossjumping -fno-omit-frame-pointer -minline-all-stringops -ftracer -funswitch-loops -fmove-all-movables -fno-defer-pop -frename-registers -fno-delayed-branch -freduce-all-givs"
Yes, they look massive, but I used Acovea and they really have sped up my system noticeably!
what does it break?!

I can't even get glibc to compile under gcc 3.4 when bootstrapping with -march=pentium4 -O2 -pipe!

Rcomian
Apprentice
Posts: 174
Joined: Sat Jan 10, 2004 11:20 pm
Location: Uk, Northwest

It's not all about the CFLAGS

Post by Rcomian » Mon May 03, 2004 11:47 am

With all this talk about CFLAGS to improve binary speed, note that you also get a large speedup in program start-up times if you run prelink on your system. You'll need to emerge it first, but it's simple to use.
I believe I read somewhere that some distros use prelinking as standard. I can't find that quote now, and it wasn't authoritative anyway, but it may explain why GNOME starts faster on Mandrake than on your system.
Also, make sure support for your chipset is compiled into your kernel. I had trouble getting my machine to keep using DMA mode for my drives until I found and compiled in support for my chipset; since then it has all gone sweet.
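
For reference, the steps above look roughly like this as commands. This is a sketch, not a tested recipe for your box. The run wrapper is a made-up helper that only prints each command unless you set DRY_RUN=0, and /dev/hda is an assumed name for the first IDE drive. The real commands need root.

```sh
#!/bin/sh
# Dry run by default: prints each command instead of executing it.
# Set DRY_RUN=0 and run as root to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Install prelink from Portage
run emerge prelink

# Prelink everything listed in prelink.conf:
#   -a  all binaries and the libraries they depend on
#   -m  conserve memory (virtual address space)
#   -R  randomise the base addresses
run prelink -amR

# Show whether DMA is currently enabled for the drive
# (/dev/hda is an assumption; use your own device name)
run hdparm -d /dev/hda
```

If hdparm reports DMA as off and refuses to switch it on, missing chipset support in the kernel is the first thing to check.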

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Mon May 03, 2004 8:04 pm

taskara wrote:what does it break?!

I can't even get glibc to compile under gcc 3.4 when bootstrapping with -march=pentium4 -O2 -pipe!
So far, the Acovea-determined CFLAGS seem very stable. I just compiled glibc with them with gcc-3.3.3.

taskara
Advocate
Posts: 3762
Joined: Wed Apr 10, 2002 11:38 pm
Location: Australia

Post by taskara » Mon May 03, 2004 10:42 pm

Ahh, cool..

You should run it against GCC 3.4 and see what flags you get ;)

Will those flags work best with any Pentium 4 machine, or are they specific to your machine for some reason?

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Mon May 03, 2004 11:35 pm

taskara wrote:Ahh, cool..

You should run it against GCC 3.4 and see what flags you get ;)
Probably will, when it goes stable.
taskara wrote:Will those flags work best with any Pentium 4 machine, or are they specific to your machine for some reason?
They're probably specific to your CPU core, cache size and bus speed.
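
Since the result depends on the exact CPU, it helps to post the model and cache size next to your flags. A small sketch for pulling single fields out of /proc/cpuinfo follows. The name cpu_field is made up for illustration, and the field names are the ones shown in the cpuinfo output at the top of this thread. Bus speed is not listed in /proc/cpuinfo, so that has to come from elsewhere.

```sh
#!/bin/sh
# cpu_field is a hypothetical helper, not a standard tool.
# Usage: cpu_field "<field name>" [file]   (file defaults to /proc/cpuinfo)
# Prints the value of the first matching field.
cpu_field() {
    key=$1
    file=${2:-/proc/cpuinfo}
    awk -v key="$key" '
    {
        pos = index($0, ":")
        if (pos == 0) next
        name = substr($0, 1, pos - 1)
        sub(/[ \t]+$/, "", name)        # trim the padding before the colon
        if (name == key) {
            value = substr($0, pos + 1)
            sub(/^[ \t]+/, "", value)   # trim the space after the colon
            print value
            exit
        }
    }' "$file"
}

# Print the two values worth posting alongside your CFLAGS
if [ -r /proc/cpuinfo ]; then
    echo "CPU:   $(cpu_field 'model name')"
    echo "Cache: $(cpu_field 'cache size')"
fi
```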

Angel666
n00b
Posts: 45
Joined: Fri Nov 14, 2003 2:21 am
Location: Palo Alto, CA

Post by Angel666 » Wed May 05, 2004 3:10 am

taskara wrote:what does it break?!

I can't even get glibc to compile under gcc 3.4 when bootstrapping with -march=pentium4 -O2 -pipe!
Actually, so far, out of about 660 packages (all that I have installed), it has only broken a few (i.e. fewer than 5).

The speed boost with GCC 3.4 plus these CFLAGS is very noticeable: apps start faster and are a lot more responsive.

The downside is that it took about 3 days of not touching the computer to find these optimizations :P
"One World, One web, One program" - Microsoft Promo ad.
"Ein Volk, Ein Reich, Ein Fuhrer" - Adolf Hitler