[FFmpeg-devel] [PATCH] SSE2 Xvid idct
Alexander Strange
astrange
Sun Apr 13 23:25:26 CEST 2008
On Apr 13, 2008, at 6:26 AM, Michael Niedermayer wrote:
> On Sun, Apr 13, 2008 at 05:35:01AM -0400, Alexander Strange wrote:
>>
>> On Apr 12, 2008, at 8:15 AM, Michael Niedermayer wrote:
> [...]
>>>> "psubsw %%xmm6, %%xmm5 \n\t" \
>>>> "movdqa "ROW0", %%xmm4 \n\t" \
>>>> "movdqa "ROW4", %%xmm6 \n\t" \
>>>> "movdqa %%xmm2, "spill" \n\t" \
>>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>>> "psubsw %%xmm6, %%xmm4 \n\t" \
>>>> "paddsw %%xmm2, %%xmm6 \n\t" \
>>>> "movdqa %%xmm6, %%xmm2 \n\t" \
>>>> "psubsw %%xmm7, %%xmm6 \n\t" \
>>>> "paddsw %%xmm2, %%xmm7 \n\t" \
>>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>>> "psubsw %%xmm5, %%xmm4 \n\t" \
>>>> "paddsw %%xmm2, %%xmm5 \n\t" \
>>>> "movdqa %%xmm5, %%xmm2 \n\t" \
>>>> "psubsw %%xmm0, %%xmm5 \n\t" \
>>>> "paddsw %%xmm2, %%xmm0 \n\t" \
>>>> "movdqa %%xmm4, %%xmm2 \n\t" \
>>>> "psubsw %%xmm3, %%xmm4 \n\t" \
>>>> "paddsw %%xmm2, %%xmm3 \n\t" \
>>>> "movdqa "spill", %%xmm2 \n\t" \
>>>
>>> #ifdef ARCH_X86_64
>>> # define XMMS "%%xmm12"
>>> #else
>>> # define XMMS "%%xmm2"
>>> #endif
>>> s/%%xmm2/XMMS/
>>>
>>> #ifndef ARCH_X86_64
>>> "movdqa %%xmm2, "spill" \n\t" \
>>> #endif
>>> ...
>>> #ifndef ARCH_X86_64
>>> "movdqa "spill", %%xmm2 \n\t" \
>>> #endif
>>>
>>> or a
>>> MOV_ONLY_ON32" %%xmm2, ...
>>>
>>>
>>> And I think something similar can be done with ROW*
>>
>> Done. The row part is already optimal on 64 since pshufhw handles it.
>
> I meant the
>> "movdqa "ROW2", %%xmm4 \n\t" \
>> "movdqa "ROW6", %%xmm6 \n\t" \
> [...]
>> "movdqa "ROW0", %%xmm4 \n\t" \
>> "movdqa "ROW4", %%xmm6 \n\t" \
>
> they are unneeded on 64.
Oh, that. Done:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: idct_sse2_xvid.c
Type: application/octet-stream
Size: 15252 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080413/e76d35a9/attachment.obj>
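For readers without the attachment, the spill suggestion quoted above can be sketched roughly as follows. This is only an illustration of the macro technique, not the code in idct_sse2_xvid.c: SKETCH_X86_64 stands in for FFmpeg's ARCH_X86_64, SPILL is a hypothetical memory operand, and the MOV_ONLY_ON32 signature is a guess, since the review only names the macro. The asm template is kept as a plain string so the sketch builds on any target.

```c
#include <assert.h>
#include <string.h>

/* Assumption: stand-in for the ARCH_X86_64 define set by FFmpeg's configure. */
#if defined(__x86_64__) || defined(_M_X64)
#  define SKETCH_X86_64 1
#else
#  define SKETCH_X86_64 0
#endif

#if SKETCH_X86_64
/* 16 XMM registers: use xmm12 as the temporary, no spill needed. */
#  define XMMS "%%xmm12"
#  define MOV_ONLY_ON32(src, dst) ""
#else
/* 8 XMM registers: reuse xmm2 and save/restore it around the block. */
#  define XMMS "%%xmm2"
#  define MOV_ONLY_ON32(src, dst) "movdqa " src ", " dst " \n\t"
#endif

/* Hypothetical spill slot; the real patch passes its own operand. */
#define SPILL "(%0)"

/* One butterfly step from the quoted asm, with the temporary abstracted. */
static const char butterfly_template[] =
    MOV_ONLY_ON32("%%xmm2", SPILL)       /* save xmm2 (32-bit only)    */
    "movdqa %%xmm4, " XMMS " \n\t"       /* tmp = xmm4                 */
    "psubsw %%xmm6, %%xmm4 \n\t"         /* xmm4 -= xmm6               */
    "paddsw " XMMS ", %%xmm6 \n\t"       /* xmm6 += tmp                */
    MOV_ONLY_ON32(SPILL, "%%xmm2");      /* restore xmm2 (32-bit only) */

/* Count non-overlapping occurrences of needle in hay. */
static int count_occurrences(const char *hay, const char *needle)
{
    size_t len = strlen(needle);
    int n = 0;
    assert(len > 0);
    for (const char *p = strstr(hay, needle); p; p = strstr(p + len, needle))
        n++;
    return n;
}
```

On a 64-bit build the template contains a single movdqa (the copy into the temporary); on a 32-bit build the two spill moves are added around it. A real asm statement using xmm12 would also have to account for that register being clobbered, which this sketch leaves out.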