TITLE:          x86-optimization
LFS VERSION:    any
AUTHOR:         Eric Olinger

SYNOPSIS:
        How to use compiler optimization settings with GCC to optimize
        binaries for x86 systems.

HINT:

------
THANKS
------

Gerard Beekmans
        One of the authors of the original Compiler-optimization hint. I
        paraphrased some of lfs-book 2.4.3 in the introduction.

Thomas -Balu- Walter
        One of the authors of the original Compiler-optimization hint, from
        which I took some of the information for this hint.

The people at the Athlon Linux Project
        They have one of the few pages I found, besides the GCC online
        documentation, that explains optimization flags and what they mean.

------------
INTRODUCTION
------------

Most binaries are compiled with the -O2 option and few, if any, other
optimization options. This keeps the binary portable, since it is compiled
for the i386 processor by default, but it doesn't do much for speed.

There are a few ways to change the default compile options. One is to
manually edit or patch all the Makefiles in the source tree, which is
time-consuming and not very efficient. The second is to set the CFLAGS and
CXXFLAGS environment variables.

----------------
COMPILER OPTIONS
----------------

For the minimal set of optimizations, enter the following. 'unset' the
environment variables when you're done, or put the line in your .bashrc
file if you plan to use it all the time.

        export CFLAGS="-O3 -march=" && export CXXFLAGS="$CFLAGS"

Or, for the maximum optimization possible, try the following:

        export CFLAGS="-s -O3 -fomit-frame-pointer -Wall \
        -march= -malign-functions=4 \
        -funroll-loops -fexpensive-optimizations -malign-double \
        -fschedule-insns2 -mwide-multiply" && export CXXFLAGS="$CFLAGS"

The minimal optimizations will almost always work on your system, but you
won't always be able to copy the binaries to systems with an older CPU.
Some packages don't like either set of optimizations and either won't
build or segfault when you run them.
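
As a rough sketch of the "set it, then unset it when you're done" approach,
the flags can be confined to a single command by exporting them inside a
subshell. The helper name with_opt_flags and the i686 CPU type are
assumptions made for illustration; they are not part of the original hint.

```shell
# Hypothetical helper: run one command with the minimal optimization flags
# exported, without changing the environment of the calling shell.
# The function body is a subshell, so the exports vanish when it returns.
with_opt_flags() (
    cpu_type="$1"    # assumed example value: one of the -march CPU types
    shift
    export CFLAGS="-O3 -march=${cpu_type}"
    export CXXFLAGS="$CFLAGS"
    "$@"
)

# Example: show the flags exactly as a build command would see them.
with_opt_flags i686 sh -c 'echo "CFLAGS=$CFLAGS CXXFLAGS=$CXXFLAGS"'
```

In real use the command would be something like ./configure or make. Because
nothing is exported in the parent shell, there is nothing to 'unset'
afterwards.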

If you're having trouble getting a package to compile or run properly, try
turning off most, if not all, of the options; the problem probably has
something to do with your compiler options. The fact that you don't have
any problems compiling everything with -O3 doesn't mean you won't have
problems in the future.

Another problem is that the Binutils version installed on your bootstrap
system often causes compilation problems in Glibc (most noticeable on
RedHat, because RedHat often uses beta software, which isn't always very
stable). "RedHat likes living on the bleeding edge, but leaves the bleeding
up to you" (quoted from somebody on the lfs-discuss mailing list).

----------------------
DEFINITIONS FOR FLAGS
----------------------

For more information on compiler optimization flags, see the GCC Command
Options page in the online GCC 3.0 docs at:

        .gnu.org/onlinedocs/gcc-3.0/gcc_3.html

Section 3.10 deals with option flags for general compiler optimization.
Section 3.17.15 deals with compiler optimization flags specific to the x86
line.

-s
        A linker option that removes all symbol table and relocation
        information from the binary.

-O3
        Sets the optimization level for the binary.

        3   Highest level. Turns on everything in -O2 and automatically
            adds the -finline-functions and -frename-registers flags.
        2   Most Makefiles use this as their default. Performs all
            supported optimizations that do not involve a space-speed
            tradeoff. Automatically adds the -fforce-mem flag.
        1   Minimal optimizations are performed. This is what a plain -O
            with no level selects.
        0   Don't optimize. This is the compiler default if no -O option
            is given.
        s   Same as -O2, but does additional optimizations for size.

-fomit-frame-pointer
        Tells the compiler not to keep the frame pointer in a register for
        functions that don't need one. This avoids the instructions to
        save, set up and restore frame pointers; it also makes an extra
        register available in many functions. It also makes debugging
        impossible on some machines.

-Wall
        Enables most of the commonly useful warning messages. This is not
        an optimization flag and does not change the generated code.

-march=i686
        Defines the instruction set to use when compiling. -mcpu is implied
        to be the same as -march when only -march is used.

        i386        Default CPU type
        i486        Intel/AMD 486 processor
        i586        First-generation Pentium
        i686        Pentium Pro class (Pentium Pro, Pentium II, Pentium III)
        pentium     Same as i586
        pentiumpro  Same as i686
        pentium4    Intel Pentium 4 processor
        k6          K6, K6-2, K6-3
        athlon      Athlon/Duron

-mcpu=i686
        Sets the machine CPU type to use when scheduling instructions. The
        CPU types are the same as for -march.

-malign-functions=4
        This is an i386 option. Aligns the start of functions to a 2 raised
        to the 4th power (16) byte boundary. If `-malign-functions' is not
        specified, the default is 2 if optimizing for a 386, and 4 if
        optimizing for a 486.

-funroll-loops
        This is an optimization option. Performs the optimization of loop
        unrolling. This is only done for loops whose number of iterations
        can be determined at compile time or run time. `-funroll-loops'
        implies both `-fstrength-reduce' and `-frerun-cse-after-loop'.

-fexpensive-optimizations
        Another optimization option that performs a number of minor
        optimizations that are relatively expensive.

-malign-double
        This is an i386 option. Controls whether GCC aligns double, long
        double, and long long variables on a two-word boundary or a
        one-word boundary. Aligning double variables on a two-word boundary
        will produce code that runs somewhat faster on a `Pentium' at the
        expense of more memory.

        Warning: if you use the `-malign-double' switch, structures
        containing the above types will be aligned differently than the
        published application binary interface specifications for the 386.

-fschedule-insns2
        This is an optimization option. Similar to `-fschedule-insns', but
        requests an additional pass of instruction scheduling after
        register allocation has been done. This is especially useful on
        machines with a relatively small number of registers and where
        memory load instructions take more than one cycle.

-mwide-multiply
        Controls whether GCC uses the mul and imul instructions that
        produce 64-bit results in eax:edx from 32-bit operands to do long
        long multiplies and 32-bit division by constants.
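
The troubleshooting advice in the COMPILER OPTIONS section (turn off most or
all of the options when a package fails) could be automated roughly as
follows. This is only a sketch: the function name try_build, the two flag
sets, and the i686 CPU type are assumptions made for illustration, and the
FULL_FLAGS list is a shortened version of the maximum set shown earlier.

```shell
# Assumed example flag sets; adjust the CPU type and options to your system.
MIN_FLAGS="-O3 -march=i686"
FULL_FLAGS="-s -O3 -fomit-frame-pointer -march=i686 -funroll-loops"

# Hypothetical helper: run a build command with the full flag set first,
# then the minimal set, then no flags at all, stopping at the first success.
# The function body is a subshell, so the caller's CFLAGS is left untouched
# and 'exit' only ends the function.
try_build() (
    for flags in "$FULL_FLAGS" "$MIN_FLAGS" ""; do
        export CFLAGS="$flags" CXXFLAGS="$flags"
        if "$@"; then
            echo "build succeeded with CFLAGS='$flags'"
            exit 0
        fi
        echo "build failed with CFLAGS='$flags', trying fewer options" >&2
    done
    exit 1
)

# Usage (not run here): try_build make
```

A real build tree usually needs to be cleaned (for example with make clean)
between attempts, which this sketch leaves out. A package that builds but
segfaults at run time will not be caught this way either.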