-ffast-math + Mesa

Vladimir A. Pavlov pv4 at bk.ru
Thu Sep 7 11:33:52 PDT 2006


On Wednesday 06 September 2006 23:15, Kevin Day wrote:
> I also discovered a possible problem with either my system or the
> uclibc on my system.
> The code example you gave above did not have a non-zero errno value
> returned (all returned a 0) with all of the following:
>
> (gcc-4.1.1)
> # gcc a.c -lm
> # gcc a.c -lm -fmath-errno
> # gcc a.c -ffast-math
> # gcc a.c -ffast-math -fmath-errno

cat >a.c <<EOF
/* In the previous letter I forgot stdio.h and gcc4 gave me a warning 
about that. */
#include <stdio.h>
#include <math.h>
#include <errno.h>

int main ()
{
        volatile double s = sqrt (-1);
        printf ("%d\n", errno);
}
EOF
$ gcc4 -ffast-math a.c    
$ ./a.out 
0
$ gcc4 -ffast-math -fmath-errno a.c
$ ./a.out 
0
$ gcc4 -lm a.c
$ ./a.out 
33
$ gcc4 -lm -fmath-errno a.c
$ ./a.out 
33
$ gcc4 -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /home/pv/src/gcc-4.1.1/configure 
--prefix=/home/pv/pkg/gcc/4.1.1 --disable-multilib --enable-shared 
--enable-threads=posix --enable-version-specific-runtime-libs 
--enable-languages=c --disable-checking --disable-nls
Thread model: posix
gcc version 4.1.1

Everything seems OK, except that -fmath-errno doesn't work as expected: 
together with -ffast-math, errno is still 0. This gcc-4.1.1 was built 
directly from the gcc.gnu.org sources without any patches.
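
For reference, the 33 printed above is the value Linux uses for EDOM, the 
domain error that sqrt() is expected to report for a negative argument. 
A minimal sketch of decoding the value instead of reading the raw 
number (math_errno_name is a made-up helper, not part of the test 
program):

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: give a readable name to the errno values that
   math functions normally use.  EDOM is 33 and ERANGE is 34 on Linux;
   other systems may number them differently, so compare against the
   macros rather than the raw numbers. */
const char *math_errno_name(int e)
{
        if (e == 0)
                return "no error";
        if (e == EDOM)
                return "EDOM";
        if (e == ERANGE)
                return "ERANGE";
        return strerror(e);     /* anything else: let libc describe it */
}
```

In a.c one would set errno = 0 right before the sqrt() call and then 
print math_errno_name(errno) with %s instead of the number.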

> uname -p returns: athlon-4
> ( Do you have a nice patch for uname that better handles the uname -p
> flags? I remember one of our patches already does that, but is not as
> "informative" as yours is)
I ran the test on a Gentoo machine. The Gentoo patch makes uname simply 
read the processor name from /proc/cpuinfo, so the Gentoo version of 
uname depends on /proc, which is what I don't like.

It seems that the attached patch is the one responsible for this 
behavior (extracted from 
coreutils-5.94-patches-1.2.tar.bz2/patch/generic), though I'm not sure 
(I haven't had time to check). You can also search for something like 
coreutils-VERSION-patches... at http://distfiles.gentoo.org/distfiles/.

> As a rather quick'n'dirty performance test, I tested the following:
> #include <math.h>
> #include <errno.h>
> 
> int main ()
> {
>        volatile double s = sqrt (-1);
>        long i;
> 
>        for (i=0; i < 100000; i++){
>                s = sqrt(-1);
>        }
> }
>
> Against -lm, it took ~0.005s, whereas -ffast-math took ~0.001s.

For my part, I get the following:

$ cat a.c 
#include <math.h>
#include <stdio.h>
#include <errno.h>

int main ()
{
        volatile double s = sqrt (-1);
        long i;

        for (i=0; i < 1000000; i++)
                s = sqrt(-1);

        printf ("%d\n", errno);
}
$ gcc4 a.c -lm
$ time ./a.out 
33

real    0m0.056s
user    0m0.060s
sys     0m0.000s
$ gcc4 a.c -ffast-math
$ time ./a.out 
0

real    0m0.005s
user    0m0.000s
sys     0m0.000s
$ gcc4 a.c -ffast-math -fmath-errno
$ time ./a.out 
0

real    0m0.020s
user    0m0.020s
sys     0m0.000s
$ gcc3 a.c -ffast-math -fmath-errno
/tmp/ccOv3mc0.o: In function `main':
a.c:(.text+0x26): undefined reference to `sqrt'
a.c:(.text+0x47): undefined reference to `sqrt'
collect2: ld returned 1 exit status
$ gcc3 a.c -ffast-math -fmath-errno -lm
$ time ./a.out 
33

real    0m0.057s
user    0m0.060s
sys     0m0.000s
$ gcc3 -v
Reading specs 
from /home/pv/pkg/gcc/3.4.6/lib/gcc/i686-pc-linux-gnu/3.4.6/specs
Configured with: /home/pv/src/gcc-3.4.6/configure 
--prefix=/home/pv/pkg/gcc/3.4.6 --disable-multilib --enable-shared 
--enable-threads=posix --enable-version-specific-runtime-libs 
--enable-languages=c --disable-checking --disable-nls
Thread model: posix
gcc version 3.4.6

Note that "gcc4 -ffast-math -fmath-errno" produces slower code than 
"gcc4 -ffast-math" and faster code than "gcc4 -lm", though that 
executable still doesn't print the correct errno value.

What's also interesting is that gcc3 and gcc4 handle the -fmath-errno 
flag differently. gcc4 just skips it, while gcc3 uses a "usual" 
function call instead of a processor instruction when the flag is used. 
It's this behavior that makes gcc3 give the "undefined reference" 
message and produce code as slow as without the -ffast-math flag.
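
If a program needs to know whether it can rely on errno after a math 
call, it may be possible to ask the compiler and the C library at build 
time. A rough sketch, assuming the compiler predefines __FAST_MATH__ 
under -ffast-math (the GCC manual mentions this macro, but I haven't 
checked which versions define it) and that math.h provides the C99 
math_errhandling macro; errno_reporting_expected is a made-up name:

```c
#include <math.h>

/* Hypothetical helper: returns 1 if errno reporting from math
   functions can be expected in this build, 0 if not. */
int errno_reporting_expected(void)
{
#if defined(__FAST_MATH__)
        /* Built with -ffast-math: calls may be replaced by plain
           processor instructions that never touch errno. */
        return 0;
#elif defined(math_errhandling) && defined(MATH_ERRNO)
        /* C99: the implementation states how it reports math errors. */
        return (math_errhandling & MATH_ERRNO) != 0;
#else
        /* Older headers: no way to ask, assume the traditional
           behavior of setting errno. */
        return 1;
#endif
}
```

This only reflects what the headers and the compiler claim. As the gcc4 
results above show, the claim and the actual behavior can differ, so a 
run-time test like a.c is still worth keeping.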

-- 
Nothing but perfection
pv
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 003_all_coreutils-gentoo-uname.patch
Type: text/x-diff
Size: 3886 bytes
Desc: not available
URL: <http://lists.linuxfromscratch.org/pipermail/hlfs-dev/attachments/20060907/15920aff/attachment.patch>

