[Ffmpeg-devel] yet another silly int vs. float benchmark
Måns Rullgård
mru
Sat May 21 19:42:00 CEST 2005
Michael Niedermayer <michaelni at gmx.at> writes:
> Hi
>
> here's another benchmark proggy, advantages over the others
> 1. pure C
> 2. ~40 lines of code, can easily be done in less, I know ...
> 3. tries to test both the case where each instruction depends upon the
> previous one and where the instructions are a little more independent
I tried it on my Alpha PCA56, but I keep getting a SIGFPE on the first
float operation. I disabled the first float add test, and got these
numbers:
100 ; needed 3 cycles -> 3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed 229 cycles -> 114 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed 1827 cycles -> 913 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed 804 cycles -> 402 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed 282 cycles -> 56 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed 2032 cycles -> 406 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed 511 cycles -> 102 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed 511 cycles -> 102 cycles per operation
Can someone (Falk?) explain the FPE? It goes away with -mieee, but
doing so slows things down a little:
100 ; needed 3 cycles -> 3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed 205 cycles -> 102 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed 1804 cycles -> 902 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[0]; needed 13892 cycles -> 6946 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed 822 cycles -> 411 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed 262 cycles -> 52 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed 2011 cycles -> 402 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed 676 cycles -> 135 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed 676 cycles -> 135 cycles per operation
--
Måns Rullgård
mru at inprovide.com