[Ffmpeg-devel] int vs. float: Hard Numbers
Michael Niedermayer
michaelni
Sat May 21 01:04:05 CEST 2005
Hi
On Friday 20 May 2005 23:35, Attila Kinali wrote:
> Heyo,
>
> On Fri, 20 May 2005 12:13:24 -0600
>
> Mike Melanson <mike at multimedia.cx> wrote:
> > integer_adder() (10 adds) returned 50, 36 cycles used
> > float_adder() (10 adds) returned 50.000000, 36 cycles used
> > integer_mult() (10 mults) returned 9765625, 115 cycles used
> > float_mult() (10 mults) returned 9765625.000000, 36 cycles used
>
> integer_adder() (10 adds) returned 50, 46 cycles used
> integer_adder_nl() (10 adds) returned 20, 47 cycles used
> float_adder() (10 adds) returned 50.000000, 65 cycles used
> integer_mult() (10 mults) returned 9765625, 87 cycles used
> float_mult() (10 mults) returned 9765625.000000, 85 cycles used
>
> This was performed on a PentiumM:
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 13
> model name : Intel(R) Pentium(R) M processor 1.70GHz
> stepping : 6
> cpu MHz : 1699.061
> cache size : 2048 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat
> clflush dts acpi mmx fxsr sse sse2 ss tm pbe est tm2
> bogomips : 3358.72
>
> integer_adder_nl is a small hack to see how much the result
> depends on the register stall, i.e. I replaced the addition
> loop with:
> ---
> add ecx, 5
> add eax, 5
> add edx, 5
> add ecx, 5
> add eax, 5
> add edx, 5
> add ecx, 5
> add eax, 5
> add edx, 5
> add ecx, 5
> ----
>
> For another test I added a for loop in front of each test function
> with 1000 iterations to check for cache dependencies. Interestingly,
> I got exactly the same numbers as before.
I guess the function was inlined into the loop, so the copy you timed after
the loop wasn't in the code cache; only the inlined copy inside the loop was.
But that's just a guess.
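
One way to rule the inlining guess in or out is to keep the compiler from inlining the test function, so that the warm-up loop and the later call run the same out-of-line code. A minimal sketch, assuming GCC or Clang (the noinline attribute is compiler-specific); this integer_adder() is a hypothetical stand-in for the benchmark function, which was not posted in this thread, and warm_then_call() is an illustrative helper:

```c
#include <assert.h>

/* Hypothetical stand-in for the benchmark's integer_adder():
 * ten additions of 5. noinline asks GCC/Clang to keep one
 * out-of-line copy, so every caller executes the same code. */
__attribute__((noinline)) int integer_adder(int start)
{
    int sum = start;
    for (int i = 0; i < 10; i++)
        sum += 5;
    return sum;
}

/* Illustrative helper: call the function 1000 times as a warm-up,
 * then once more. The last call is the one a benchmark would time. */
int warm_then_call(void)
{
    volatile int sink = 0; /* volatile keeps the warm-up calls alive */

    for (int i = 0; i < 1000; i++)
        sink = integer_adder(0);

    int result = integer_adder(0);
    assert(sink == result); /* same code, same answer */
    return result;
}
```

The optimizer may still simplify the body of integer_adder(), so this only controls where the code lives, not how many adds are actually executed.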
> I assume that the 2MB L2
> cache still contains the content of the program from the loading
> operation of the OS.
Yes, probably, but L2 != L1, and L1 is normally split between data and code,
so the code must have been executed to be in the L1 instruction cache.
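
To see whether the L1 code cache matters for these numbers, one could time the very first call and then a call made after the warm-up. A rough sketch under assumptions: read_ticks(), measure_cold_and_warm() and this integer_mult() are hypothetical names, not the code from the original benchmark; on x86 the timer is the __rdtsc() intrinsic from GCC/Clang's <x86intrin.h>, elsewhere it falls back to clock().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
#endif

/* Timestamp source: TSC ticks on x86, clock() ticks elsewhere. */
static uint64_t read_ticks(void)
{
#if defined(__i386__) || defined(__x86_64__)
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Hypothetical stand-in for integer_mult(): ten multiplications by 5. */
__attribute__((noinline)) int integer_mult(int start)
{
    int prod = start;
    for (int i = 0; i < 10; i++)
        prod *= 5;
    return prod;
}

/* Time the first call to fn, run it `warmup` more times, then time
 * another call. Calling through a pointer to a noinline function
 * makes all calls execute the same machine code. */
void measure_cold_and_warm(int (*fn)(int), int warmup,
                           int *result, uint64_t *cold, uint64_t *warm)
{
    assert(fn != NULL && result != NULL && cold != NULL && warm != NULL);

    uint64_t t0 = read_ticks();
    volatile int sink = fn(1); /* first execution of fn in this run */
    uint64_t t1 = read_ticks();
    *cold = t1 - t0;

    for (int i = 0; i < warmup; i++)
        sink = fn(1);
    (void)sink;

    t0 = read_ticks();
    *result = fn(1); /* executed after the warm-up */
    t1 = read_ticks();
    *warm = t1 - t0;
}
```

Calling measure_cold_and_warm(integer_mult, 1000, &result, &cold, &warm) and printing cold and warm side by side should show how much of a single measurement is cache miss rather than arithmetic. The tick counts include the overhead of the timer itself and will vary from run to run.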
[...]
--
Michael