[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions
Michael Niedermayer
michaelni
Mon Jan 14 04:13:44 CET 2008
On Sun, Jan 13, 2008 at 06:05:48PM +0100, Vadim Lebedev wrote:
> I'm running your program as follows:
>
>
> gcc 4.1.2 -O3 -fomit-frame-pointer -msse -o vector_fmul_test vector_fmul_test.c
> ./vector_fmul_test 2000
>
> And the output is:
>
> Function: 'vector_fmul_c', time=73.910 (cycles/element=288.713)
> Function: 'vector_fmul_c_unrolled', time=73.010 (cycles/element=285.195)
> Function: 'vector_fmul_c_other_unrolled', time=72.999 (cycles/element=285.152)
> Function: 'vector_fmul_c_simd', time=0.141 (cycles/element=0.552)
>
> Any idea why it is so slow (except simd case)?
If I had to guess, it is something to do with overflows and
floating-point exceptions. Try setting the arrays to 1.0.
There is also another flaw in the test: the arrays should be global or
volatile or so, otherwise gcc could optimize the functions out
completely (yeah, if it had a microscopic speck of intelligence it would
realize they are never read ...).
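
A rough sketch of both suggestions, not the actual vector_fmul_test.c
(which is not shown in this thread). LEN, init_arrays(), run_benchmark()
and sink are hypothetical names, and the body of vector_fmul_c is only
assumed to be an in-place multiply. Filling both arrays with 1.0 keeps
every product at exactly 1.0, so repeated passes cannot overflow or
drift into denormals; the global arrays and the volatile sink give the
compiler a visible use of the results.

```c
#include <stddef.h>

#define LEN 1024   /* hypothetical array length */

/* Global arrays: they are visible outside this translation unit, so the
 * compiler cannot prove that the results are never read. */
float dst[LEN];
float src[LEN];

/* Volatile sink: a store to it has to be performed as written. */
volatile float sink;

/* Plain C loop, assumed to resemble the vector_fmul_c being timed. */
void vector_fmul_c(float *d, const float *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        d[i] *= s[i];
}

/* Set both arrays to 1.0 so that repeated multiplication stays at 1.0. */
void init_arrays(void)
{
    size_t i;
    for (i = 0; i < LEN; i++) {
        dst[i] = 1.0f;
        src[i] = 1.0f;
    }
}

/* Run the kernel 'iterations' times, then consume the output so the
 * work cannot be discarded as dead code. Timing code is left out. */
float run_benchmark(int iterations)
{
    int n;
    size_t i;
    float sum = 0.0f;

    init_arrays();
    for (n = 0; n < iterations; n++)
        vector_fmul_c(dst, src, LEN);
    for (i = 0; i < LEN; i++)
        sum += dst[i];
    sink = sum;
    return sum;
}
```

Calling run_benchmark(2000) from main() between two timer reads would
mirror the "./vector_fmul_test 2000" run quoted above. If the C timings
drop sharply with these inputs, that would point at the input values
rather than at the loops themselves.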
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato