Top level intermodule profiledbootstrap, for people with too much free time.

Robert Connolly robert at linuxfromscratch.org
Fri Feb 9 10:35:29 PST 2007


I wrote this a few weeks ago...

# The following is a study of possible ways to optimize the GNU toolchain.
#
# First of all, I made the assumption that using the Top Level Makefile
# system, included with the recent GCC and Binutils versions, has a greater
# potential for optimizing the applications than building them each 
# separately. I also did not study non-bootstrapped builds because a 
# non-bootstrapped toolchain has no reason to perform better than a 
# bootstrapped toolchain.

# I performed this study with all the tricks I know of to optimize the
# toolchain, including intermodule builds, profiled builds, and GNU hash
# style.

# The top level makefile system also adds the potential of profiling a pile of
# other packages, like flex and bison and gettext and bash and findutils and
# much more. But I have yet to get that to work for me.

# The --enable-intermodule option will make GCC use the -combine option
# to compile all sources in the same command line. The idea is that the
# compiler can optimize all the sources together and make better judgments.
# This option should also take better advantage of the -O3 option. The only
# other package I know of that uses -combine (and also uses -fwhole-program)
# is Busybox, because it not only increases performance but also decreases
# program size (about 1%).

# My system is a Pentium 4 Prescott, 3GHz, with 1024MB of physical memory,
# running kernel 2.6.19.2 and Glibc-2.5. The toolchain versions built here are
# the same as those on the chrooted host system (chapter 6 LFS). I have a
# 1024MB swap partition, and added a 512MB swap file, giving a total of 2.5GB
# of system memory.

# I only performed each build and test once, so your results may vary, but my
# results should be reasonably valid.

tar xf gcc-g++-4.1-20070108.tar.bz2
tar xf gcc-core-4.1-20070108.tar.bz2
tar xf gcc-testsuite-4.1-20070108.tar.bz2
mv gcc-4.1-20070108/ butterfly-toolchain
cd butterfly-toolchain/

# This patch will cause GCC to link with --hash-style=gnu. This speeds up
# dynamic symbol lookup, so programs should start faster. Get details from the
# web or the ld manual page.

patch -Np0 -i ../gcc41-hash-style-gnu.patch
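
# Once the toolchain is installed, one way to confirm the patch took effect is
# to look for a GNU_HASH entry in the dynamic section of a binary linked with
# the patched GCC. This is only a sketch: the helper name uses_gnu_hash is made
# up for illustration, and it assumes readelf(1) from Binutils is available.

```shell
# Hypothetical helper: succeeds if the "readelf -d" output on stdin
# contains a GNU_HASH dynamic tag.
uses_gnu_hash() {
    grep -q '(GNU_HASH)'
}

# Example (the path is only an illustration):
#   readelf -d /usr/bin/gcc | uses_gnu_hash && echo "GNU hash in use"
```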

# This patch has nothing to do with performance, but is needed in order to
# pass Glibc-2.5's testsuite:

patch -Np0 -i ../gcc-DW_CFA_val.patch

# This sed command is a workaround for some sort of bug. The GCOV_VERSION
# value, "0x34303170" here, depends on your GCC version and might be
# different for you. I suggest you skip this command and continue; you'll
# get an error during the build. Then find and read "gcov-iov.h" to get the
# GCOV_VERSION value, start over, and use it here:

sed -e \
 's@#include \"gcov-iov.h\"@\
#define GCOV_VERSION \(\(gcov_unsigned_t\)0x34303170\)@' -i gcc/gcov-io.h
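
# Reading the value out of "gcov-iov.h" after the failed build can be scripted.
# This is a rough sketch, not part of the original procedure: the helper name
# gcov_version is hypothetical, and it assumes the generated header contains a
# line of the form "#define GCOV_VERSION ((gcov_unsigned_t)0x...)".

```shell
# Hypothetical helper: print the hex constant from the GCOV_VERSION
# definition in the file given as $1.
gcov_version() {
    sed -n 's/^#define GCOV_VERSION .*\(0x[0-9a-fA-F]*\).*/\1/p' "$1"
}

# Example, run from the top of the source tree after the failed build:
#   gcov_version "$(find . -name gcov-iov.h | head -n 1)"
```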

sed -i 's@\./fixinc\.sh@-c true@' gcc/Makefile.in
sed -i 's/@have_mktemp_command@/yes/' gcc/gccbug.in

tar xf ../binutils-2.17.50.0.9.tar.bz2
ln -s binutils-2.17.50.0.9/{bfd,binutils,gas,gprof,ld,opcodes} .

# Which CFLAGS you should use is another story, I'm using these:

export CFLAGS="-march=prescott -mtune=prescott -O3 -fomit-frame-pointer -pipe"
export CFLAGS="$CFLAGS -fexpensive-optimizations -DNDEBUG"
export CXXFLAGS="$CFLAGS"

# Using 'BOOT_CFLAGS="$CFLAGS"' with the 'make' command won't pass BOOT_CFLAGS
# down to ld/, bfd/, and friends. We need to adjust it in the Makefiles.

# First wipe out mh-x86omitfp so its BOOT_CFLAGS doesn't play any role:

dd if=/dev/null of=config/mh-x86omitfp count=1

# Then set the BOOT_CFLAGS in the Makefiles:

sed "s/^BOOT_CFLAGS.*/BOOT_CFLAGS = $CFLAGS/" \
	-i Makefile.{in,tpl} gcc/Makefile.in
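
# Because sed exits quietly when nothing matches, it may be worth confirming
# that each file really carries the new line before spending hours on the
# build. A minimal sketch; the helper name check_boot_cflags is hypothetical.

```shell
# Hypothetical helper: $1 is the expected flags string, the remaining
# arguments are files. Fails on the first file that lacks an exact
# "BOOT_CFLAGS = <flags>" line.
check_boot_cflags() {
    expected="BOOT_CFLAGS = $1"
    shift
    for f in "$@"; do
        if ! grep -qxF -- "$expected" "$f"; then
            echo "BOOT_CFLAGS not set in $f" >&2
            return 1
        fi
    done
}

# Example:
#   check_boot_cflags "$CFLAGS" Makefile.in Makefile.tpl gcc/Makefile.in
```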

mkdir obj
cd obj

# Beware: the combination of --enable-intermodule, profiledbootstrap, and -O3
# will eat up about 2.2GB of memory. Make sure you add enough swap
# space/files for this. I suggest 2.5GB total (including physical RAM). It
# will also take a really long time (see below). I suggest you run this about
# 20 minutes before going to sleep, then watch TV etc for 20 minutes, and
# check the build is going okay, then go to sleep. Hopefully it will be
# finished when you wake.
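
# To see how much RAM plus swap is currently available, the totals can be read
# from /proc/meminfo. This is a sketch under assumptions: the helper name
# total_mem_mb is made up, and it relies on the Linux "MemTotal:" and
# "SwapTotal:" fields being reported in kB. MemTotal is a little lower than
# the installed RAM, so expect slightly less than the nominal figure.

```shell
# Hypothetical helper: print MemTotal + SwapTotal in MB.
# $1 optionally names a meminfo-style file (default: /proc/meminfo).
total_mem_mb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1024) }' \
        "${1:-/proc/meminfo}"
}

# Example:
#   echo "RAM + swap: $(total_mem_mb) MB"
```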

# The -DNDEBUG option in CFLAGS disables assert(3), which increases
# performance but also causes warnings about unused variables, so
# --disable-werror is required.

# For reasons I didn't explore I couldn't get --enable-shared to work with
# --enable-intermodule. libgcc.so will still be built, but not the Binutils
# shared libraries. This is unfortunate, but will have no adverse performance
# effects, and I'm pretty sure you won't notice it. It means libbfd.a will be
# statically linked into each Binutils application, making them slightly
# larger, upwards of 480KB larger depending on how much of the library goes
# unused.

../configure --prefix=/usr \
    --libexecdir=/usr/lib --enable-clocale=gnu \
    --enable-threads=posix --enable-__cxa_atexit \
    --disable-werror --disable-checking \
    --with-cpu=prescott \
    --enable-bootstrap --enable-intermodule

# Use nice(1) so your system isn't a snail while you're using 2GB of swap:

time { nice make tooldir=/usr profiledbootstrap 2>&1 | tee make.log ; }

# The following are build and test suite times of various configurations.
# The test suite times hopefully represent run-time performance. They depend
# highly on the host system, but the host system components don't change
# between runs, so the results should be a fair comparison. I had zero
# "unexpected failures" from all tests, but 'make CFLAGS="" CXXFLAGS="" -k check'
# needs to be used to reset the CFLAGS because -O3 causes some failures in
# Binutils.

# My SBU, for Binutils alone without any CFLAGS set, is:
# real    2m53.189s (173 seconds)
# user    2m8.530s
# sys     0m30.220s
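
# The SBU figures quoted in this post are the "real" time in seconds divided
# by this 173 second SBU, cut off after one decimal place. A small sketch of
# that arithmetic follows; the helper name to_sbu is hypothetical.

```shell
# Hypothetical helper: convert elapsed seconds ($1) to SBU, truncated to
# one decimal. $2 is the length of one SBU in seconds (default 173).
to_sbu() {
    secs=$1
    sbu=${2:-173}
    tenths=$(( secs * 10 / sbu ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

# Examples:
#   to_sbu 17917    # prints 103.5
#   to_sbu 2078     # prints 12.0
```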

# 1
# Build time of Butterfly is 103.5 SBU:
# real    298m37.739s (17917 seconds)
# user    58m41.080s
# sys     5m6.360s
#
# Time to run the test suite is 19.6 SBU:
# real    56m41.453s (3401 seconds)
# user    46m28.930s
# sys     9m37.490s

# 2
# Build time of Butterfly without '--enable-intermodule' is 16.5 SBU:
# real    47m49.584s (2869 seconds)
# user    42m40.570s
# sys     4m0.770s
#
# Time to run the test suite is 20.0 SBU:
# real    57m55.046s (3475 seconds)
# user    47m13.770s
# sys     9m54.990s

# 3
# Build time of Butterfly without '--enable-intermodule' and with 'bootstrap'
# instead of 'profiledbootstrap' is 12.0 SBU:
# real    34m38.662s (2078 seconds)
# user    30m22.580s
# sys     3m30.010s
#
# Time to run the test suite is 20.6 SBU:
# real    59m33.658s (3573 seconds)
# user    48m56.940s
# sys     9m58.580s


# The performance results of a normal bootstrap and a non-bootstrap build
# should be exactly the same, so the latter are not listed here.

# Results:
# Build number 3 is the vanilla base-line time.
#
# The formula is:
# 100 minus (comparison test suite time divided by baseline test suite time,
# multiplied by 100)
#
# Build number 2 performs 3% better compared to build number 3
# (3475 / 3573 = 97.3%).
# Build number 1 performs 5% better compared to build number 3
# (3401 / 3573 = 95.2%).
#
# So '--enable-intermodule --enable-bootstrap && make profiledbootstrap' wins,
# but only by 5%.
#
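# The percentages can be rechecked from the test suite "real" times in
# seconds. This is only a sketch of the arithmetic; the helper name
# improvement is hypothetical, and it assumes awk is available.

```shell
# Hypothetical helper: percent improvement of a comparison time ($1)
# over a baseline time ($2), both in seconds.
improvement() {
    awk -v c="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 - c / b * 100 }'
}

# Examples, using build number 3 (3573 seconds) as the baseline:
#   improvement 3475 3573    # build 2, prints 2.7
#   improvement 3401 3573    # build 1, prints 4.8
```
#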
# Install with:

make tooldir=/usr install

#

robert