[FFmpeg-devel] [PATCH] SSE2 Xvid idct
Michael Niedermayer
michaelni
Sun Apr 13 12:26:41 CEST 2008
On Sun, Apr 13, 2008 at 05:35:01AM -0400, Alexander Strange wrote:
>
> On Apr 12, 2008, at 8:15 AM, Michael Niedermayer wrote:
[...]
>>> "psubsw %%xmm6, %%xmm5 \n\t" \
>>> "movdqa "ROW0", %%xmm4 \n\t" \
>>> "movdqa "ROW4", %%xmm6 \n\t" \
>>> "movdqa %%xmm2, "spill" \n\t" \
>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>> "psubsw %%xmm6, %%xmm4 \n\t" \
>>> "paddsw %%xmm2, %%xmm6 \n\t" \
>>> "movdqa %%xmm6, %%xmm2 \n\t" \
>>> "psubsw %%xmm7, %%xmm6 \n\t" \
>>> "paddsw %%xmm2, %%xmm7 \n\t" \
>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>> "psubsw %%xmm5, %%xmm4 \n\t" \
>>> "paddsw %%xmm2, %%xmm5 \n\t" \
>>> "movdqa %%xmm5, %%xmm2 \n\t" \
>>> "psubsw %%xmm0, %%xmm5 \n\t" \
>>> "paddsw %%xmm2, %%xmm0 \n\t" \
>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>> "psubsw %%xmm3, %%xmm4 \n\t" \
>>> "paddsw %%xmm2, %%xmm3 \n\t" \
>>> "movdqa "spill", %%xmm2 \n\t" \
>>
>> #ifdef ARCH_X86_64
>> # define XMMS "%%xmm12"
>> #else
>> # define XMMS "%%xmm2"
>> #endif
>> s/%%xmm2/XMMS/
>>
>> #ifndef ARCH_X86_64
>> "movdqa %%xmm2, "spill" \n\t" \
>> #endif
>> ...
>> #ifndef ARCH_X86_64
>> "movdqa "spill", %%xmm2 \n\t" \
>> #endif
>>
>> or a
>> MOV_ONLY_ON32" %%xmm2, ...
>>
>>
>> And I think something similar can be done with ROW*
>
> Done. The row part is already optimal on 64 since pshufhw handles it.
I meant the
> "movdqa "ROW2", %%xmm4 \n\t" \
> "movdqa "ROW6", %%xmm6 \n\t" \
[...]
> "movdqa "ROW0", %%xmm4 \n\t" \
> "movdqa "ROW4", %%xmm6 \n\t" \
Those loads are unneeded on x86-64.
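
As a rough illustration of the XMMS / MOV_ONLY_ON32 idea quoted above, here is a minimal sketch that only builds the asm template string and does not execute it. It assumes the compiler's own __x86_64__ macro as a stand-in for ARCH_X86_64, and SPILL, butterfly_text() and count_instructions() are hypothetical names that are not part of the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: __x86_64__ stands in for FFmpeg's ARCH_X86_64 here. */
#if defined(__x86_64__)
#  define XMMS "%%xmm12"               /* spare register, so no spill is needed */
#  define MOV_ONLY_ON32(src, dst) ""   /* contributes nothing to the template */
#else
#  define XMMS "%%xmm2"                /* xmm2 doubles as the temporary */
#  define MOV_ONLY_ON32(src, dst) "movdqa " src ", " dst " \n\t"
#endif

/* Hypothetical placeholder for the memory operand used as the spill slot. */
#define SPILL "spill_slot"

/* One butterfly step from the quoted code, written once for both targets. */
static const char *butterfly_text(void)
{
    return
        MOV_ONLY_ON32("%%xmm2", SPILL)      /* save xmm2 (32-bit only) */
        "movdqa %%xmm4, " XMMS " \n\t"
        "psubsw %%xmm6, %%xmm4 \n\t"
        "paddsw " XMMS ", %%xmm6 \n\t"
        MOV_ONLY_ON32(SPILL, "%%xmm2");     /* restore xmm2 (32-bit only) */
}

/* Count the instructions in a template by counting its newlines. */
static int count_instructions(const char *s)
{
    int n = 0;
    assert(s != NULL);
    for (; *s; s++)
        n += (*s == '\n');
    return n;
}
```

Printing butterfly_text() on each target shows the difference: the 32-bit build keeps the two spill moves around the butterfly, while the 64-bit build drops them and uses xmm12 as the temporary.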
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The worst form of inequality is to try to make unequal things equal.
-- Aristotle