[Ffmpeg-devel] int vs. float: Hard Numbers
Tuukka Toivonen
tuukkat
Mon May 23 10:00:57 CEST 2005
On Fri, 20 May 2005, matthieu castet wrote:
> I have try to do an 'portable' test and I have strange result :
> gcc t1.c -O3
>
> r1+=r2 11
> r1*=r2 44
> f1+=f2 4652
That is nonsensical. My first guess: you don't initialize the float value
that you add, and it ends up as something invalid like a NaN or QNaN, which
takes the FPU very long to process and in the worst case might raise a
signal (but then the program would crash...).
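
One way to rule that guess in or out is to check the operands before timing
anything. The following is only a minimal sketch, not the code from t1.c
(which is not shown in this thread); the names operand_is_valid and
checked_add_loop are hypothetical, and it assumes a C99 compiler so that
isfinite() from <math.h> is available.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: returns 1 if x is an ordinary finite value,
 * 0 if it is a NaN or an infinity. */
static int operand_is_valid(float x)
{
    return isfinite(x) ? 1 : 0;
}

/* Hypothetical benchmark kernel: performs n additions f1 += f2,
 * but refuses to run (in debug builds) if either operand is invalid. */
static float checked_add_loop(float f1, float f2, int n)
{
    /* Explicitly initialized operands should always pass this check. */
    assert(operand_is_valid(f1) && operand_is_valid(f2));

    for (int i = 0; i < n; i++)
        f1 += f2;

    return f1;
}
```

If the check fails for the values the test actually feeds into f1 += f2,
the 4652 figure would be measuring NaN handling rather than ordinary float
addition.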
A problem with the standard x86 FPU is that you can't use its registers as
freely as with integer opcodes, because it works like a stack. Intel sped
up the fxch instruction so that it can execute in parallel with the
previous FPU instruction, which makes the stack work more like a register
file, but it is still a cumbersome kludge and at the very least increases
code size, which then fits worse in the cache. And you can't mix FPU code
with MMX, of course, since the MMX registers are aliased onto the FPU
registers.
If you do X integer adds, it shouldn't take more than X/2 clocks even on a
plain Pentium... IF the code is in cache and the results don't depend on
each other. So 36 cycles for 10 adds sounds like a lot. Of course, with
practical (gcc-generated) code you don't hit the optimum, but it still
should be better than 36... at least on AMD/Intel CPUs.
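
To illustrate what "the results don't depend on each other" means, here is
a rough sketch with two hypothetical functions, sum_dependent and
sum_paired. It is not taken from the test under discussion. In the first,
every add needs the result of the previous one; in the second, the two
accumulators are independent, so a CPU that can issue two integer adds per
clock has a chance to pair them. An optimizing compiler may well transform
either loop, so treat this as an illustration of the idea, not as a
benchmark.

```c
#include <stdint.h>

/* Dependent chain: each add must wait for the previous value of r. */
static uint32_t sum_dependent(const uint32_t *v, int n)
{
    uint32_t r = 0;

    for (int i = 0; i < n; i++)
        r += v[i];

    return r;
}

/* Two independent accumulators: the adds into a and b do not depend
 * on each other, so they can in principle execute in parallel. */
static uint32_t sum_paired(const uint32_t *v, int n)
{
    uint32_t a = 0, b = 0;
    int i;

    for (i = 0; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)          /* odd element count: add the leftover value */
        a += v[i];

    return a + b;
}
```

Both functions return the same sum; only the dependency structure of the
adds differs.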