[FFmpeg-devel] Anybody has a Core 2? [PATCH] Small SSSE3 optimization
Zuxy Meng
zuxy.meng
Wed May 9 12:16:36 CEST 2007
Hi,
2007/5/9, Guillaume POIRIER <poirierg at gmail.com>:
> Hi,
>
> On 5/9/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:
> > Hi,
> >
> > 2007/5/8, Zuxy Meng <zuxy.meng at gmail.com>:
> > > Hi,
> > >
> > > Attached patch makes use of SSSE3 instruction pabsw to calculate the
> > > absolute value of packed words. Just for fun. And I don't have an
> > > SSSE3-capable CPU, so hopefully someone with a Core 2 can help test it to
> > > ensure it doesn't break anything (better with benchmarks of course:-)
> > > ).
> >
> >
> > Updated patch against current SVN HEAD. Full test passed on MMX2. Of
> > course it still needs testing under Core 2.
>
> cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 15
> model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
> stepping : 6
> cpu MHz : 2000.055
> cache size : 4096 KB
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 2
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm constant_tsc pni monitor ds_cpl vmx tm2 cx16 xtpr lahf_lm
> bogomips : 4003.24
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
>
> [...]
>
> make codectest passes, make test passes, make fulltest passes.
>
> \o/ !!
Cool! Could you run a small unit test comparing the MMX2 and SSSE3
versions of hadamard8_diff? Intel doesn't give the latency of pabsw in
its manuals (while AMD always gives latency & throughput for ALL
instructions), but I guess it should be smaller than the sum of pxor,
psubw and pmaxsw :-)
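
For anyone who wants to check the equivalence without a Core 2 at hand, here is a rough scalar sketch in plain C of what one 16-bit lane computes in each case. It assumes the MMX2 path is the pxor/psubw/pmaxsw sequence mentioned above; the function names are made up for illustration and are not taken from the patch or from FFmpeg. It only checks that the results match and says nothing about speed.

```c
#include <stdint.h>

/* Wrapping 16-bit negate, modelling psubw from a zeroed register
 * (psubw wraps around, it does not saturate). The final conversion
 * to int16_t assumes the usual two's-complement behaviour. */
int16_t neg16_wrap(int16_t x)
{
    return (int16_t)(uint16_t)(0u - (uint16_t)x);
}

/* Hypothetical model of one lane of the MMX2 sequence:
 *   pxor   tmp, tmp   ; tmp = 0
 *   psubw  tmp, x     ; tmp = 0 - x
 *   pmaxsw x, tmp     ; x = signed max(x, -x)
 */
int16_t abs_mmx2_lane(int16_t x)
{
    int16_t neg = neg16_wrap(x);
    return x > neg ? x : neg;
}

/* Hypothetical model of one lane of pabsw: negate if negative.
 * Note that -32768 stays -32768 (bit pattern 0x8000) in both models. */
int16_t abs_pabsw_lane(int16_t x)
{
    return x < 0 ? neg16_wrap(x) : x;
}

/* Exhaustively compare both models over all 65536 word values.
 * Returns 1 if they agree everywhere, 0 otherwise. */
int lanes_agree(void)
{
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
        if (abs_mmx2_lane((int16_t)v) != abs_pabsw_lane((int16_t)v))
            return 0;
    return 1;
}
```

Calling lanes_agree() from a small main() and checking that it returns 1 is enough to confirm the two lane models give identical words, including the -32768 corner case. A real comparison of the two hadamard8_diff versions would still have to run on SSSE3 hardware.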
--
Zuxy
Beauty is truth,
While truth is beauty.
PGP KeyID: E8555ED6