[Ffmpeg-devel] [RFC] svq1 very slow encoding
Trent Piepho
xyzzy
Sat Mar 31 02:57:35 CEST 2007
On Thu, 29 Mar 2007, Loren Merritt wrote:
>
> 65% of the cpu time was spent on one line. Clearly a candidate for simd.
>
> Patch makes the encode 2.3x faster on an athlon64. Additional speedups I
> tried but didn't include here: using inline instead of dsp adds another
> 10%, and 3dnow adds 3%.
> static int ssd_int8_vs_int16_mmx(int8_t *pix1, int16_t *pix2, int size){
> + int sum;
> + long i=size;
> + asm volatile(
> ...
> + "movd %%mm4, %1 \n"
> + :"+r"(i), "=r"(sum)
> + :"r"(pix1), "r"(pix2)
> + );
> + return sum;
Shouldn't that be "+&r"(i)?
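For anyone unsure what the "&" buys you: it marks the operand as earlyclobber, so gcc may not place it in a register that also holds one of the inputs. The loop body of the patch is elided above, so the sketch below does not reproduce it. add_pair is a made-up helper that writes its output before it has read its last input, which is the textbook case where the modifier matters. On non-x86 targets it falls back to plain C.

```c
/* Hypothetical helper, not part of the patch: returns a + b. */
static int add_pair(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int out;
    __asm__ volatile(
        "movl %1, %0 \n\t"   /* out is written here ...            */
        "addl %2, %0 \n\t"   /* ... but b is only read here, later */
        : "=&r"(out)         /* '&' keeps out away from a and b    */
        : "r"(a), "r"(b)
        : "cc");
    return out;
#else
    /* Plain C fallback for targets without x86 inline asm. */
    return a + b;
#endif
}
```

Without the "&", gcc would be free to give out and b the same register, and the function could then return a + a.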
On x86-64, could "int sum" be put in a 64-bit register? That would
generate something like "movd %mm4, %rax". I don't have a 64-bit system, but
can you use movd with a 64-bit general-purpose register? If you can, isn't
it still wrong, since %rax will have garbage in the top 32 bits?
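
A minimal sketch of the top-32-bits question, assuming gcc or clang on x86-64 (other targets take a plain C fallback). write_low32 is a hypothetical helper, and it uses movl rather than movd so that it leaves the MMX state alone. It only shows what happens to the upper half of a 64-bit register when an instruction writes it through its 32-bit name. The "k" operand modifier forces that 32-bit name for a 64-bit operand; an operand declared as int gets the 32-bit name by default.

```c
#include <stdint.h>

/* Hypothetical helper: fills a 64-bit register with ones, stores v
 * through the register's 32-bit name, and returns the full 64 bits. */
static uint64_t write_low32(uint32_t v)
{
    uint64_t wide = UINT64_MAX;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ volatile(
        "movl %1, %k0 \n\t"  /* %k0 prints e.g. %eax instead of %rax */
        : "+r"(wide)
        : "r"(v));
#else
    /* Fallback that mirrors the x86-64 behaviour in plain C. */
    wide = v;
#endif
    return wide;
}
```

In 64-bit mode a write to a 32-bit register zero-extends into the full register, so the helper returns v with the top 32 bits clear.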