[Ffmpeg-devel] int vs. float: Hard Numbers

Mon May 23 10:00:57 CEST 2005

On Fri, 20 May 2005, matthieu castet wrote:

> I have try to do an 'portable' test and I have strange result :
> gcc t1.c -O3
>
> r1+=r2 11
> r1*=r2 44

> f1+=f2 4652

That is nonsensical. My first guess: you don't initialize the float value 
that you add, so it ends up being something invalid like a NaN or QNaN, which 
the FPU takes very long to process and which in the worst case might raise a 
signal (but then the program would crash...).
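
A quick way to rule that guess in or out is to pass the operands in 
explicitly and classify them before timing anything. This is only a sketch: 
the names is_slow_operand and add_loop are made up for illustration and are 
not taken from t1.c. The subnormal check is my own addition, since denormals 
are another commonly cited slow case on x86 FPUs.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if x is a NaN or a subnormal (denormal),
   i.e. a value class that an FPU may handle on a slow path. */
int is_slow_operand(float x)
{
    int c = fpclassify(x);
    return c == FP_NAN || c == FP_SUBNORMAL;
}

/* Hypothetical stand-in for the "f1+=f2" test: adds f2 to f1 n times.
   Both operands are parameters, so neither can be left uninitialized. */
float add_loop(float f1, float f2, int n)
{
    int i;
    for (i = 0; i < n; i++)
        f1 += f2;
    return f1;
}
```

If is_slow_operand() returns 1 for either value right before the timed 
loop, the 4652 figure says more about the input than about float adds.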

A problem with the standard x86 FPU is that you can't use the registers 
freely as you can with integer opcodes, because it works like a stack. Intel 
sped up the fxch instruction so that it can execute in parallel with the 
previous FPU instruction, which makes the stack work more like a register 
file, but it is still a cumbersome kludge and at the very least makes for 
larger code, which fits worse in the cache. And you can't use the FPU 
together with MMX, of course, since the MMX registers alias the FPU stack.
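
To illustrate the stack behaviour, here is a tiny function with, in the 
comment, one way it could be written by hand for the x87. The function name 
dot2 is invented, and the listing is an illustration, not actual gcc output; 
what a given compiler emits will differ.

```c
/* Hypothetical example: a*b + c*d on floats.
 *
 * A hand-written x87 sequence (illustrative, memory operands assumed):
 *
 *   fld   dword [a]      ; st0 = a
 *   fmul  dword [b]      ; st0 = a*b
 *   fld   dword [c]      ; st0 = c,   st1 = a*b
 *   fmul  dword [d]      ; st0 = c*d, st1 = a*b
 *   faddp st1, st0       ; st1 = a*b + c*d, then pop -> result in st0
 *
 * Every arithmetic instruction works on the top of the stack, so keeping
 * several independent results in flight means inserting fxch to bring the
 * wanted value to st0 first.
 */
float dot2(float a, float b, float c, float d)
{
    return a * b + c * d;
}
```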

If you do X integer adds, it shouldn't take more than X/2 clocks 
even on a plain Pentium... IF the code is in cache and the results don't 
depend on each other. So 36 cycles for 10 adds sounds like a lot. Of 
course with practical (gcc-generated) code you don't hit the optimum, but 
it should still be better than 36... at least on AMD/Intel CPUs.
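
The "results don't depend on each other" condition can be sketched like 
this. Both functions are hypothetical (they are not from t1.c) and compute 
the same sum; only the dependency structure differs. Note that at -O3 the 
compiler may restructure either loop, so check the generated assembly 
before drawing conclusions from timings.

```c
#include <stddef.h>

/* One accumulator: every add needs the result of the previous one,
   so the adds form a single dependency chain. */
unsigned sum_dependent(const unsigned *v, size_t n)
{
    unsigned s = 0;
    size_t i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two accumulators: the two adds in each iteration are independent of
   each other, which is what lets a dual-issue CPU pair them. */
unsigned sum_independent(const unsigned *v, size_t n)
{
    unsigned s0 = 0, s1 = 0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        s0 += v[i];
    return s0 + s1;
}
```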