[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions

Vadim Lebedev vadim
Sun Jan 13 18:05:48 CET 2008


I'm building and running your program as follows (gcc 4.1.2):


gcc -O3 -fomit-frame-pointer -msse -o vector_fmul_test vector_fmul_test.c
./vector_fmul_test 2000

And the output is:

Function: 'vector_fmul_c', time=73.910 (cycles/element=288.713)
Function: 'vector_fmul_c_unrolled', time=73.010 (cycles/element=285.195)
Function: 'vector_fmul_c_other_unrolled', time=72.999 (cycles/element=285.152)
Function: 'vector_fmul_c_simd', time=0.141 (cycles/element=0.552)

Any idea why the C versions are so slow (everything except the SIMD case)?
P.S.
This is my /proc/cpuinfo:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz
stepping        : 6
cpu MHz         : 1000.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 3995.28
clflush size    : 64

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz
stepping        : 6
cpu MHz         : 1000.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 3990.15
clflush size    : 64




