kimchi_sg may like to poke fun at me for using pentium classic as the default architecture for the Stage 1/3 guide -- i think that some of this is attributable to the fact that he's a ricer at heart -- but believe it or not, i did not choose "pentium" by accident, or because i was working exclusively with old hardware.

there's a good reason to specify "pentium" as the default architecture, especially if you're interested in performing installs on a wide variety of boxes -- code built for it runs on all of the chips that have descended from the pentium architecture, and it effectively sidesteps the flag compatibility problems that you've encountered.
for example, my P3 box has a spare HD and a spare gentoo installation. it's built with flags for "pentium" even though it's a pentium 3. why? because when it's time to install gentoo on a PPro, a K6, a Pentium-MMX, or an Athlon, i'm not going to spend up to a week building each new system. all that i have to do is clone the hard disk, and stick it in the new box. at worst, i may have to make a few changes to the kernel. the new box is up and running in an hour or so.
on the other hand, if you want to trudge through the entire process of rebuilding glibc and gcc redundantly for every system you want to install, you can, but imho that amounts to doing things the hard way.
believe it or not, there's not THAT big of a difference in performance on most code between most processors running pentium code and their native code. the incremental performance that is gained by generating optimized native code is lost many times over up front as installation time. if you want to be up and running quickly, disk cloning from an ancestral platform is the way to go. if you're adamant that machine specific optimizations are necessary, it's always more expedient and convenient to recompile on the machine after the OS is installed than it is to go through the trouble of building an optimized OS for each machine from the ground up.
i think that you've caught onto the right idea after performing a couple of the installs. the reason that you weren't successful is that you were compiling for too modern an arch to make the OS portable. if you're planning on performing the Stage 1/3 install on a large number of PCs, your best approach is to determine the most recent ancestral platform that all of your machines share, perform the installation ONCE, and then clone the installation to your other PCs. then if you feel the need, rebuild each PC afterward.
just to make this clear -- the ancestral platform doesn't have to be pentium if you're 100% intel or 100% amd. it just needs to be the most recent platform that all of your boxes are derived from. i just used pentium in the guide and on my boxes because it's the lowest common denominator for me.
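
if it helps to see the "most recent common ancestor" idea spelled out, here's a rough python sketch. treat it as an illustration, not a tool: the feature table is my own simplified assumption (i586 baseline, cmov, mmx, sse, 3dnow only), the function names are made up for this post, and the -O2 -pipe part of the CFLAGS line is just a conservative placeholder. it intersects the features of every box and picks the newest gcc -march target that needs nothing outside that intersection.

```python
# Hypothetical, simplified table of which ISA features each CPU family offers.
# Real chips have more features and steppings than this; adjust for your boxes.
CPU_FEATURES = {
    "pentium":     {"i586"},
    "pentium-mmx": {"i586", "mmx"},
    "pentiumpro":  {"i586", "cmov"},
    "pentium2":    {"i586", "cmov", "mmx"},
    "pentium3":    {"i586", "cmov", "mmx", "sse"},
    "k6":          {"i586", "mmx"},
    "athlon":      {"i586", "cmov", "mmx", "3dnow"},
}

# Candidate gcc -march targets, newest first, with the features that code
# built for that target may rely on (same simplified model as above).
MARCH_CANDIDATES = [
    ("pentium3",    {"i586", "cmov", "mmx", "sse"}),
    ("pentium2",    {"i586", "cmov", "mmx"}),
    ("pentiumpro",  {"i586", "cmov"}),
    ("pentium-mmx", {"i586", "mmx"}),
    ("pentium",     {"i586"}),
]


def common_march(boxes):
    """Return the newest candidate -march that every listed box can run."""
    if not boxes:
        raise ValueError("need at least one box")
    # Features that every single box has.
    shared = set.intersection(*(CPU_FEATURES[b] for b in boxes))
    for march, needs in MARCH_CANDIDATES:
        if needs <= shared:  # target uses only features all boxes share
            return march
    raise ValueError("no candidate target fits all boxes")


def cflags_line(march):
    """Format a make.conf style CFLAGS line for the chosen target."""
    return f'CFLAGS="-march={march} -O2 -pipe"'


if __name__ == "__main__":
    mixed = ["pentium3", "pentiumpro", "k6", "pentium-mmx", "athlon"]
    print(cflags_line(common_march(mixed)))  # CFLAGS="-march=pentium -O2 -pipe"
    print(common_march(["pentium3", "pentiumpro"]))  # pentiumpro
```

with the mixed intel/amd list, the only thing every box shares is the i586 baseline, so the answer is pentium. with an all-intel P3 + PPro pair you can move up to pentiumpro, which is the point about not being stuck on pentium if your hardware is more uniform.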
