[FFmpeg-devel] FASTDIV macro
Diego Biurrun
diego
Wed Nov 12 17:58:21 CET 2008
On Sun, Nov 09, 2008 at 02:24:19PM +0000, Måns Rullgård wrote:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> > Attached is a very crude synthetic benchmark program. Of course, it
> > does not account for possible cache misses on the table access, nor
> > for the fact that we may sometimes need to use expressions like
> > "b==1 ? a : FASTDIV(a, b)".
> >
> > The results are the following:
> >
> > --- Pentium-M, gcc 4.3.2 (-O2) ---
> > normaldiv(-1896828497) : time=2.195s
> > fastdiv_c(-1896828497) : time=0.564s
> > fastdiv_asm_x86(-1896828497) : time=0.416s
> >
> > --- Core2 (64-bit), gcc 4.1.2 (-O2) ---
> > normaldiv(-1896828497) : time=0.681s
> > fastdiv_c(-1896828497) : time=0.183s
> > fastdiv_asm_x86(-1896828497) : time=0.222s
>
> So plain C is faster than asm on Core2? Did you look at the generated
> code?
>
> > --- ARM11, gcc 4.3.1 (-O2) ---
> > normaldiv(-1896828497) : time=43.910s
> > fastdiv_c(-1896828497) : time=5.480s
> > fastdiv_asm_armv4(-1896828497) : time=5.049s
> > fastdiv_asm_armv6(-1896828497) : time=4.629s
>
> I ran a very similar test on Cortex-A8, and although I don't remember
> the exact figures, the order came out the same.
>
> I suspect that anything with a half-decent D-cache will benefit from
> the table trick. Cache-starved machines might suffer from the extra
> cache pollution the table causes, at least if they have a reasonably
> fast divide instruction. Some MIPS incarnations fall in the second
> category.
>
> Could someone please run the test on a PPC G4 and/or G5?
Here is the output from 6 runs on my G4:
normaldiv(-1896828497) : time=4.699s
fastdiv_c(-1896828497) : time=3.583s
normaldiv(-1896828497) : time=4.441s
fastdiv_c(-1896828497) : time=3.585s
normaldiv(-1896828497) : time=4.500s
fastdiv_c(-1896828497) : time=3.585s
normaldiv(-1896828497) : time=4.863s
fastdiv_c(-1896828497) : time=3.584s
normaldiv(-1896828497) : time=4.925s
fastdiv_c(-1896828497) : time=3.583s
normaldiv(-1896828497) : time=4.482s
fastdiv_c(-1896828497) : time=3.584s
Diego
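
For readers without the attachment, the "table trick" discussed above can
be sketched roughly as follows. This is not the benchmark program from
this thread and not FFmpeg's actual FASTDIV macro. The names inv_table,
init_inv_table, fastdiv_sketch and div_with_guard are hypothetical, and
the table layout (ceil(2^32 / b) for each divisor b below 256) is an
assumption made for illustration. Under that assumption the result equals
a / b for 2 <= b < 256 and a < 2^24.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table size; divisors must be smaller than this. */
#define INV_TABLE_SIZE 256

/* inv_table[b] holds ceil(2^32 / b), a fixed-point reciprocal of b. */
static uint32_t inv_table[INV_TABLE_SIZE];

static void init_inv_table(void)
{
    inv_table[0] = 0;          /* division by zero is not supported */
    inv_table[1] = UINT32_MAX; /* 2^32 itself does not fit in 32 bits */
    for (uint32_t b = 2; b < INV_TABLE_SIZE; b++)
        inv_table[b] = (uint32_t)(((UINT64_C(1) << 32) + b - 1) / b);
}

/* Replace the divide by a table load, a 32x32->64 multiply and a shift. */
static inline uint32_t fastdiv_sketch(uint32_t a, uint32_t b)
{
    assert(b < INV_TABLE_SIZE);
    return (uint32_t)(((uint64_t)a * inv_table[b]) >> 32);
}

/* With this table, b == 1 would yield a - 1, so it is handled separately. */
static inline uint32_t div_with_guard(uint32_t a, uint32_t b)
{
    return b == 1 ? a : fastdiv_sketch(a, b);
}
```

Call init_inv_table() once before use. In this sketch the entry for b == 1
is clamped to 2^32 - 1, so fastdiv_sketch(a, 1) returns a - 1 for nonzero
a. That is one plausible reason for a guard of the form
"b==1 ? a : FASTDIV(a, b)" as quoted above. The table costs 1 KiB of
D-cache here, which is the pollution concern raised for cache-starved
machines.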