[Ffmpeg-devel] a little optim for a SSE version of H263_LOOP_FILTER
skal
skal65535
Mon Nov 6 07:35:24 CET 2006
Hi everybody,
> Message of 05/11/06 16:50
> >
> > in case it helps, it seems to me an SSE version of
> > H263_LOOP_FILTER is possible by replacing
> > "psubusb %%mm4, %%mm2 \n\t"\
> > "movq %%mm2, %%mm3 \n\t"\
> > "psubusb %%mm4, %%mm3 \n\t"\
> > "psubb %%mm3, %%mm2 \n\t"\
> > at dsputil_mmx.c:587 (fresh cvs), by:
> > "psubusb %%mm4, %%mm2 \n\t"\
> > "pminub %%mm4, %%mm2 \n\t"\
> >
> > + maybe a little re-org of the loop (mm3 is no longer needed).
>
> Please send patch, I'll try to benchmark the speed change.
>
> Note that movq is very slow on P4, so any code that removes
> mov(q|dqu|..) provides an interesting speed-up.
>
>
> > Well, this is just for the fun of it, since the speed-up
> > (if any) might not be worth a special version...
>
> Once I have a patch to play with, I can benchmark it on P4, PM, and K8... :)
Sure, attached is the diff (for testing only!)
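For anyone who wants to check the substitution before benchmarking, here is a small scalar sketch of the two sequences, one byte lane at a time. It is not the attached diff: the names sat_sub_u8, ramp_mmx, ramp_sse and check_sequences_agree are made up for this illustration, and the model assumes each instruction acts independently on unsigned bytes, with x standing for a byte of mm2 and s for the matching byte of mm4.

```c
#include <assert.h>
#include <stdint.h>

/* Per-byte model of psubusb: unsigned subtract that saturates at 0. */
static uint8_t sat_sub_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Hypothetical model of the current four-instruction sequence. */
static uint8_t ramp_mmx(uint8_t x, uint8_t s)
{
    uint8_t m2 = sat_sub_u8(x, s);  /* psubusb %mm4, %mm2 */
    uint8_t m3 = m2;                /* movq    %mm2, %mm3 */
    m3 = sat_sub_u8(m3, s);         /* psubusb %mm4, %mm3 */
    return (uint8_t)(m2 - m3);      /* psubb   %mm3, %mm2 */
}

/* Hypothetical model of the proposed two-instruction sequence. */
static uint8_t ramp_sse(uint8_t x, uint8_t s)
{
    uint8_t m2 = sat_sub_u8(x, s);  /* psubusb %mm4, %mm2 */
    return m2 < s ? m2 : s;         /* pminub  %mm4, %mm2 */
}

/* Exhaustively compare both models over every pair of byte values. */
static void check_sequences_agree(void)
{
    for (int x = 0; x < 256; x++)
        for (int s = 0; s < 256; s++)
            assert(ramp_mmx((uint8_t)x, (uint8_t)s) ==
                   ramp_sse((uint8_t)x, (uint8_t)s));
}
```

The reason it works: with a = sat_sub(x, s), the old code computes a - sat_sub(a, s), which is a when a <= s and s otherwise, i.e. min(a, s). That is exactly what pminub gives in one step, so the movq and the second psubusb go away.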
>
> > (gotta love these saturated instructions. All of h263's
> > UpDownRamp() with 2 instructions is quite fun)
>
> Mmmm... grep -r "UpDownRamp" libav* doesn't return anything here, and
> neither does Google Code Search.
> What kind of code are you referring to?
It's the name used in the H.263 spec (an ITU-T Recommendation).
( e.g. : http://nova.postech.ac.kr/~dkim/course/cs703a/h263.pdf ,
says Google)
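In scalar form, UpDownRamp looks roughly like the sketch below. This is written from memory of the deblocking filter annex rather than copied from the spec, so treat the exact formula as an assumption and check it against the document; the C names up_down_ramp and strength are mine.

```c
#include <stdlib.h>

/*
 * Scalar sketch of UpDownRamp(x, STRENGTH), as recalled from the H.263
 * deblocking filter description (assumption: verify against the spec):
 *   SIGN(x) * MAX(0, abs(x) - MAX(0, 2 * (abs(x) - STRENGTH)))
 */
static int up_down_ramp(int x, int strength)
{
    int a = abs(x);
    int excess = 2 * (a - strength);  /* how far |x| overshoots, doubled */
    if (excess < 0)
        excess = 0;
    int r = a - excess;               /* ramps back down once |x| > strength */
    if (r < 0)
        r = 0;
    return x < 0 ? -r : r;            /* restore the sign of x */
}
```

So the output follows |x| up to strength, ramps back down to 0 at 2*strength, and stays at 0 beyond that. The max(0, ...) clamps are what map naturally onto the saturating byte instructions.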
bye!
Skal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: /home/massimin/h263_loopfilter_sse_test_only.diff
Type: application/octet-stream
Size: 655 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20061106/1996760f/attachment.obj>