amd64 binary distributions optimized for 2003-era processor?

dE_logics
Advocate
Posts: 2350
Joined: Fri Jan 02, 2009 3:20 am
Location: $TERM

Post by dE_logics » Sat Mar 29, 2025 8:30 am

I was under the impression that x86 extensions let a binary fall back to 'substitute instructions' when certain instructions were not available on the running processor. However, this does not seem to be true. This essentially implies that any binary OS which advertises support for the 2003 K8 processor contains only instructions available on the K8, unless the programmer explicitly uses gcc's function multiversioning feature.
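For readers unfamiliar with multiversioning, the idea can be sketched roughly as follows. This is only a conceptual illustration in Python, not gcc's mechanism: in C, gcc's `target_clones` function attribute produces several variants of a function plus a resolver that picks one at load time. The function names and the flag sets below are hypothetical and chosen for illustration.

```python
# Conceptual sketch of function multiversioning: several variants of one
# function, and a resolver that picks the most capable variant the CPU can
# run. All names here are hypothetical; real variants would be compiled code.

def checksum_baseline(values):
    # Stand-in for code built with only baseline x86-64 instructions.
    return sum(values)


def checksum_avx2(values):
    # Stand-in for a variant built with AVX2 enabled.
    return sum(values)


def checksum_avx512(values):
    # Stand-in for a variant built with AVX-512 enabled.
    return sum(values)


# Ordered from most demanding to least demanding. The last entry requires
# nothing, so it plays the role of the "default" variant.
VARIANTS = [
    ({"avx512f"}, checksum_avx512),
    ({"avx2"}, checksum_avx2),
    (set(), checksum_baseline),
]


def resolve(variants, cpu_flags):
    """Return the first variant whose required flags are all present."""
    cpu_flags = set(cpu_flags)
    for required, func in variants:
        if set(required) <= cpu_flags:
            return func
    raise RuntimeError("no usable variant for this CPU")


if __name__ == "__main__":
    # A CPU without AVX-512 silently gets the AVX2 variant instead of a SIGILL.
    checksum = resolve(VARIANTS, {"sse2", "avx", "avx2"})
    print(checksum.__name__, checksum([1, 2, 3]))
```

Without such a resolver there is no fallback: a binary built only for the newer instruction set simply raises an illegal-instruction fault on an older CPU.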

To see how much gcc uses these 'new' instructions:

Code: Select all

/usr/x86_64-mypl-linux-gnu/usr/bin/cat --help
Usage: /usr/x86_64-mypl-linux-gnu/usr/bin/cat [OPTION]... [FILE]...
Concatenate FILE(s) to standard output.

With no FILE, or when FILE is -, read standard input.

  -A, --show-all           equivalent to -vET
  -b, --number-nonblank    number nonempty output lines, overrides -n
  -e                       equivalent to -vE
  -E, --show-ends          display $ at end of each line
  -n, --number             number all output lines
  -s, --squeeze-blank      suppress repeated empty output lines
  -t                       equivalent to -vT
  -T, --show-tabs          display TAB characters as ^I
  -u                       (ignored)
  -v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB
      --help        display this help and exit
      --version     output version information and exit

Examples:
  /usr/x86_64-mypl-linux-gnu/usr/bin/cat f - g  Output f's contents, then standard input, then g's contents.
  /usr/x86_64-mypl-linux-gnu/usr/bin/cat        Copy standard input to standard output.
Illegal instruction (core dumped)
I can't even chroot into this install. It almost feels like an ARM machine.

So is it true that 99.9% of prebuilt binary applications (even the kernel) are NOT using a processor's new instructions, in order to maintain compatibility? And so, if you buy a new processor, its performance gains in 99.9% of cases (when using prebuilt binaries) are limited to how fast legacy instructions are executed?

For the same reason, I was wondering why this benchmark works with the same binaries with avx512 disabled.
My blog
eccerr0r
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA

Post by eccerr0r » Sun Mar 30, 2025 1:04 am

I don't think it was ever the case that CPUs would emulate new instructions, though all modern CPUs trap on invalid instructions. However, these traps are extremely expensive, and the kernel may or may not be able to handle the translation. So yes, you will need a properly compiled kernel and software appropriate for the CPU. And yeah, that's why clock speed has always been king, since a lot of software writers build for the widest possible range of CPUs so that as many people as possible can run the software.

I do have a few base amd64 CPUs (AMD K8; Intel P4). Because of this I tend to build all my binaries for base amd64, just so I can shift binaries between machines at a whim.

The "x86_64_v3" fiasco lately assumes AVX, which means a fairly late-model CPU is necessary. I think a lot of distributions target v3 now, which will no longer run on first- and even second- or third-revision CPUs, so building the binaries yourself may be your only option.

IIRC:
x86_64 base (v1): All 64-bit CPUs starting with K8 and P4.
x86_64_v2: up to SSE4.2 (plus SSSE3, POPCNT)
x86_64_v3: AVX and AVX2 (plus BMI1/BMI2, FMA)
x86_64_v4: AVX512 (F, BW, CD, DQ, VL)
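
To check which of these levels a given Linux machine reaches, the feature flags in /proc/cpuinfo can be compared against each level's requirements. The following is a minimal sketch, assuming the flag spellings the Linux kernel uses in /proc/cpuinfo (for example `pni` for SSE3 and `abm` for LZCNT); the function names are hypothetical.

```python
# Sketch: derive the x86-64 microarchitecture level from /proc/cpuinfo flags.
# Flag names are the Linux /proc/cpuinfo spellings; each level also requires
# everything from the levels below it.

BASELINE = {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}

LEVEL_FLAGS = [
    ("x86-64-v2", {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    ("x86-64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    ("x86-64-v4", {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level fully covered by flags, or None if not x86-64."""
    flags = set(flags)
    if not BASELINE <= flags:
        return None
    level = "x86-64-v1"
    for name, required in LEVEL_FLAGS:
        if not required <= flags:
            break  # a gap at this level rules out all higher levels too
        level = name
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag list of the first CPU entry, or [] if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1].split()
    except OSError:
        pass
    return []


if __name__ == "__main__":
    print(x86_64_level(read_cpu_flags()))
```

A binary built for a given level will only run on machines where this check reports that level or a higher one.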

Fat binaries that have multiple code streams are the best way to optimize for old and new CPUs without SIGILLs, but gcc does not do this... And AVX tends to not really affect how fast /bin/cat runs, though there are routines like strcpy and memset which can benefit a bit from AVX, but I doubt that most people would notice. One time I noticed gcc decided to use AVX to clear a register instead of xoring it with itself (or loading an immediate 0 into the register). AVX does take fewer cycles than the other two so it is faster, but it doesn't happen very often and I think it's annoying because it breaks compatibility with my old CPUs.
Intel Core i7 2700K/Radeon Firepro W2100/24GB DDR3/800GB SSD
What am I supposed watching?

Post by dE_logics » Sun Mar 30, 2025 8:05 am

Is there any way I can build a 'fat binary'? I'm trying to cross compile, but because of toolchain bugs (the build runs the cross-compiled code), many packages are failing. There are no x64 emulators which support avx512 (the source of my problem).

I wonder how those Arch guys would react to this. They build for x64 baseline. Yeah, their packages might be latest but the instructions are 2003 era...
Zucca
Administrator
Posts: 4698
Joined: Thu Jun 14, 2007 10:31 pm
Location: Rasi, Finland

Post by Zucca » Sun Mar 30, 2025 12:44 pm

There was (is?) the FatELF project. Don't ask me how to use it or incorporate it into Portage's build/packaging process.
..: Zucca :..

Code: Select all

init=/sbin/openrc-init
-systemd -logind -elogind seatd
I am NaN! I am a man!
Hu
Administrator
Posts: 24400
Joined: Tue Mar 06, 2007 5:38 am

Post by Hu » Sun Mar 30, 2025 1:38 pm

Cross-compiling normally means that the compiler is producing output that is for a foreign architecture and therefore cannot be run locally, no matter how modern the build CPU is. I don't think a fat binary would help you, because if the offending package were even slightly well behaved, it would be respecting your existing CFLAGS that tell it not to use modern instructions, assuming you did set your flags properly. A package that is so poorly behaved that it ignores your CFLAGS will likely also ignore the CFLAGS used to tell it to produce a fat binary.

Could you provide a little more detail on what is happening here? On what generation CPU are you trying to run code? Where did you get the code that is not working, and how was it built?

Post by dE_logics » Sun Mar 30, 2025 2:51 pm

Zucca wrote: There was (is?) the FatELF project. Don't ask me how to use it or incorporate it into Portage's build/packaging process.
This is about changing the ELF format (and therefore requires patching the kernel). If ELF could be modified for this purpose by Linux, this would be extremely attractive in these mixed x64-arm days.

Post by dE_logics » Sun Mar 30, 2025 3:08 pm

Hu wrote: Cross-compiling normally means that the compiler is producing output that is for a foreign architecture and therefore cannot be run locally, no matter how modern the build CPU is. I don't think a fat binary would help you, because if the offending package were even slightly well behaved, it would be respecting your existing CFLAGS that tell it not to use modern instructions, assuming you did set your flags properly. A package that is so poorly behaved that it ignores your CFLAGS will likely also ignore the CFLAGS used to tell it to produce a fat binary.

Could you provide a little more detail on what is happening here? On what generation CPU are you trying to run code? Where did you get the code that is not working, and how was it built?
I have a Zen 1 machine and am producing binaries for Alder Lake/Raptor Lake and Ice Lake using crossdev. Certain packages (like chromium, x11-libs/gtk+:3, www-client/firefox, x11-libs/gdk-pixbuf etc.) have a broken build system, in the sense that at some stage of the build they execute binaries freshly compiled with -march=alderlake/raptorlake/icelake on the build host, which is the Zen 1 machine. Therefore, the compilation fails with errors like:

Code: Select all

traps: protoc[43437] trap invalid opcode ip:562f545d6b90 sp:7ffd03ae4660 error:0 in protoc[2e8b90,562f5438c000+351000]
traps: ocloc-24.35.1[2971303] trap invalid opcode ip:7f4be20a9720 sp:7ffd921180e0 error:0 in libocloc.so
etc...
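
Trap lines like the ones above can be picked apart to see which program faulted and where. The sketch below assumes the line layout shown in this post (`traps: NAME[PID] trap REASON ip:... sp:... error:... in OBJECT[...]`) and treats the last `base+size` pair inside the brackets as the mapping's start address and length; the meaning of any field before the comma is not interpreted. The function name is hypothetical.

```python
import re

# Matches kernel "traps:" lines as they appear in dmesg. The trailing
# " in OBJECT[...]" part is optional, and so are the brackets.
TRAP_RE = re.compile(
    r"traps: (?P<comm>[^\[\s]+)\[(?P<pid>\d+)\] trap (?P<reason>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<object>[^\[\s]+)(?:\[(?P<fields>[^\]]*)\])?)?"
)


def parse_trap(line):
    """Return a dict describing one trap line, or None if it does not match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "reason": m["reason"],
        "ip": int(m["ip"], 16),
        "object": m["object"],
        "offset_in_mapping": None,
    }
    if m["fields"]:
        # Assumption: the last comma-separated field is "base+size".
        base_hex = m["fields"].split(",")[-1].partition("+")[0]
        info["offset_in_mapping"] = info["ip"] - int(base_hex, 16)
    return info


if __name__ == "__main__":
    sample = (
        "traps: protoc[43437] trap invalid opcode ip:562f545d6b90 "
        "sp:7ffd03ae4660 error:0 in protoc[2e8b90,562f5438c000+351000]"
    )
    info = parse_trap(sample)
    print(info["comm"], info["reason"], hex(info["offset_in_mapping"]))
```

The computed offset is relative to the start of the mapping, so it still has to be translated to a file or symbol address before disassembling the binary to find the offending instruction.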

This is mostly occurring with icelake because of the avx512 BS that intel did. Of course disabling those instructions in CFLAGS resolves the issue, but I'm trying to avoid doing that.

Also, there is no emulator that supports avx512 (qemu-x86_64, for example, does not), so I have no choice other than reporting bugs and adding -mno-avx512f to CFLAGS for the time being.
NeddySeagoon
Administrator
Posts: 56094
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Sun Mar 30, 2025 3:19 pm

dE_logics,

You can make the kernel do what you want with an Illegal Instruction exception.

The last time that I recall it did any more than kill the offending process was in the days of the 386 and 486SX, both of which lacked hardware floating point.
The kernel could be built with floating point emulation, so that when a floating point instruction was trapped, instead of the process being killed, the kernel would execute the instruction in software.

It's possible to patch the kernel to do the same with any instructions, but it's probably faster to avoid them than to emulate them.
Intel have the Intel® Software Development Emulator (Intel® SDE).
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Post by dE_logics » Mon Mar 31, 2025 6:36 am

NeddySeagoon wrote:dE_logics,

You can make the kernel do what you want with an Illegal Instruction exception.

The last time that I recall it did any more than kill the offending process was in the days of the 386 and 486SX, both of which lacked hardware floating point.
The kernel could be built with floating point emulation, so that when a floating point instruction was trapped, instead of the process being killed, the kernel would execute the instruction in software.

It's possible to patch the kernel to do the same with any instructions, but it's probably faster to avoid them than to emulate them.
Intel have the Intel® Software Development Emulator (Intel® SDE).
So the kernel could execute intel's SDE with that binary in case it trapped an Illegal Instruction? Is there any framework like this?

Post by NeddySeagoon » Mon Mar 31, 2025 7:25 am

dE_logics,

The Intel SDE is a user space program that runs on top of the kernel.
It will emulate Intel instructions missing from the real hardware, for programs run under its control.
It's unlikely to emulate AMD instruction extensions :)

It's not a kernel patch or kernel option, which is what I think you would like, so that it just worked.
Regards,

NeddySeagoon

Post by dE_logics » Mon Mar 31, 2025 8:17 am

No, actually SDE failed major. You may like to see this post.

Post by eccerr0r » Mon Mar 31, 2025 9:32 am

Back up a sec. I still am not sure what you're doing here...trying to install Linux on a K8 or P4 that only supports x86_64_v1?
Did Gentoo already require v3 on stage3 or something?

Post by dE_logics » Mon Mar 31, 2025 10:26 am

In brief, I'm trying to cross compile x86-64-v4 binaries on x86-64-v3.

stage3 must be in baseline.

Post by eccerr0r » Mon Mar 31, 2025 2:41 pm

You could use the x86-64-v3 machine as a distcc host, so that the v4 machine does the remainder of the work and the v3 machine never has to run any v4 binaries. Yes, if the v4 machine is a MHz/core/RAM-limited laptop then it would be a pain, but I think most modern CPUs with a mere 8GiB should be fine.

Except if it's an Atom ...

Post by dE_logics » Mon Mar 31, 2025 3:47 pm

I'm just trying to avoid frying the laptop like it happened last time. I ran Gentoo on it from 2009 until, I think, 2010, when it died. Although I believe other CPU-intensive tasks were also to blame.

So I'd rather drop avx512 instructions.

So what happens with distcc is that compiling is done remotely, but everything else (including linking) is done locally?

Post by eccerr0r » Mon Mar 31, 2025 5:08 pm

I've run my laptops full tilt to build Gentoo frequently and it's been fine, though i5s are the most I've ever had. It's a matter of making sure the fan remains clear of dust and blockages, IMHO. I've also run my Atom laptop 5 days straight doing Gentoo upgrades, so it compiled almost 24 hours/day for 5 days. Pretty much the only breaks it gets are when portage crashes out and I have to fix something and restart.

Except for things that cannot be distributed, distcc will allow the compilation to be run on other machines. Preprocessing and linking are still done on the local machine. Yes, unfortunately LTO is linking, so it runs locally too. The Atoms are so slow that sometimes they can't keep up with the preprocessing and my helpers starve for work. The single core Atom laptop frequently does this, unfortunately. The quad core Atom server sometimes does it too, but not as pronounced as the single core. My Core2 Quads (which are at least 2x the speed of my quad core Atom) don't exhibit this issue and can keep my helpers busy for the most part. The dual core i5s are also able to keep helpers busy...

Then again I think distcc significantly helps:

- webkit-gtk
- qtwebengine
- firefox/thunderbird
- chromium
- llvm / clang
- nodejs
- vtk

There are some more that I can't remember off the top of my head at the moment. It's good seeing all the helpers churning away...

The other packages tend to build fast enough that the benefit from distcc isn't as noticeable. Well, except packages like rust and gcc, which don't distribute because they depend on themselves to build.

Post by dE_logics » Tue Apr 01, 2025 6:07 am

So what I can do is use distcc when a package cannot be cross compiled. It'll be a backup option.

Thanks for suggesting this.

As for the story of my burnt-out laptop, it was an Athlon X2. The CPU didn't have problems, but the mobo did.

Post by eccerr0r » Tue Apr 01, 2025 10:54 pm

Well, self host...the machine with the largest instruction set still needs to build but can offload compilation work to the other machines. If so desired you can just not have any compilation done on the machine. The helpers will send back object files that are tailored to what your CFLAGS dictate and they don't ever need to run the code they generated.

I do have to say one caveat of Chromium, Firefox, and Thunderbird distcc: they have a bit of rust in them and that can't be distributed. However there's a lot of C++ that can.

Nodejs, QTWebengine, and webkit-gtk all hammer the distcc helpers. However, I'm kind of surprised qtwebengine doesn't have rust in it yet?

Post by dE_logics » Wed Apr 02, 2025 8:10 am

Yup, rust is the future. It even ended up in the kernel.

sccache is like distcc for rust.

Post by dE_logics » Thu Apr 03, 2025 4:19 am

In the meantime, this cross boss script works for many packages.

Post by dE_logics » Mon Apr 28, 2025 7:16 am

Therefore I did some benchmarks (taking advantage of the current situation).

http://delogics.blogspot.com/2025/04/de ... hmark.html

Post by dE_logics » Mon Aug 11, 2025 9:30 am

So I had to migrate the crossdev setup I was using to maintain my laptop to a chroot-maintained setup on my non-avx512-capable workstation. Taking advantage of the situation, I ran some with vs without avx512 benchmarks on the same machine.

http://delogics.blogspot.com/2025/08/ge ... thout.html