[Ffmpeg-devel] [PATCH] put_mpeg4_qpel16_h_lowpass altivec implementation
Brian Foley
bfoley
Mon Nov 20 02:58:55 CET 2006
On Mon, Nov 20, 2006 at 02:11:56AM +0100, Luca Barbato wrote:
> I like your plan but:
>
> - please attach patches w/out compressing them so it is easier to comment
> on them from email.
>
> - create a separate file for everything, a name could be mpeg4_altivec.c
> or qpel_altivec.c and make it have a init function like the others.
>
> - try to stay on 79cols
Absolutely. Will do in future.
> - benchmark if calling the c version instead of duplicating them is faster.
OK. The duplication is needed here though, as the existing C functions
such as put_qpel16_mc10_altivec call the mpeg4_qpel16*_c functions
directly rather than using function pointers, presumably for speed
reasons. Also, when calling Altivec code these functions now have
stricter alignment requirements on their local variables, and I didn't
really want to clutter dsputil.c with changes that are only relevant
to PPC.
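
To illustrate the two points above, here is a minimal sketch in plain C.
The names demo_h_lowpass and demo_qpel16_mc10 are hypothetical stand-ins,
not the functions from the patch, and the "lowpass" is just a copy so the
example stays short. It shows a wrapper that calls its helper directly
(no function pointer) and keeps its temporary buffer 16-byte aligned,
which matters because AltiVec vec_ld/vec_st ignore the low 4 address bits.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for a lowpass kernel: it only copies a 16-wide
 * block. A real AltiVec kernel would filter using vec_ld/vec_st. */
static void demo_h_lowpass(uint8_t *dst, const uint8_t *src,
                           int dst_stride, int src_stride, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < 16; x++)
            dst[y * dst_stride + x] = src[y * src_stride + x];
}

/* Hypothetical mc10-style wrapper: filter into an aligned temporary,
 * then average the result with the source, rounding up. */
static void demo_qpel16_mc10(uint8_t *dst, const uint8_t *src, int stride)
{
    alignas(16) uint8_t half[16 * 16];  /* local must be 16-byte aligned */

    assert(((uintptr_t)half & 15) == 0);
    demo_h_lowpass(half, src, 16, stride, 16);  /* direct call */

    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            dst[y * stride + x] =
                (uint8_t)((src[y * stride + x] + half[y * 16 + x] + 1) >> 1);
}
```

Because the helper is called by name, switching it to an Altivec version
means having a second copy of the wrapper, which is the duplication
discussed above.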
> - if you can produce a constant using a combination of at most 4 ops
> (like vec_splat_{u,s}{8,16,32}() and vec_{add,sr,sl,...}), check if that
> results in better performance (it should). [gcc-4 may do that for you when
> it is simple, like for AVV(16, 16, 16, 16, 16, 16, 16, 16);]
Yes. I'd like to experiment a bit more with this later. I'm sure we
can gain a few more % speedup with tricks like this.
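
As a rough sketch of the kind of trick meant here: vec_splat_u16 takes a
5-bit signed literal (-16..15), so a vector of 16s cannot come from a
single splat, but it can be built as 8 + 8 with one extra op instead of
being loaded from memory. The function name make_sixteens is hypothetical,
and the non-PPC branch is only a scalar model of the same two ops so the
example builds anywhere. Whether this actually beats a constant load is
something that would need benchmarking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Fill out[0..7] with 16, built from a splat immediate plus one add. */
static void make_sixteens(uint16_t out[8])
{
    assert(out != NULL);
#ifdef __ALTIVEC__
    /* 16 is out of range for the splat literal, so use 8 + 8. */
    __vector unsigned short eight = vec_splat_u16(8);
    union { __vector unsigned short v; uint16_t s[8]; } u;

    u.v = vec_add(eight, eight);
    for (int i = 0; i < 8; i++)
        out[i] = u.s[i];
#else
    /* Portable scalar model of the same splat-then-add sequence. */
    for (int i = 0; i < 8; i++) {
        uint16_t eight = 8;
        out[i] = (uint16_t)(eight + eight);
    }
#endif
}
```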
> - put_mpeg4_qpel16_v_lowpass_altivec ?
This is just a de-macro-fied version of a similar function in dsputil.c.
Copying it was the simplest way of getting all the put_qpel16_mc*_altivec
functions to work properly, and I hope to replace it with an Altivec
version soon.
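
For reference, a scalar sketch of what such a vertical lowpass computes.
This is not the code from the patch or from dsputil.c: the names are
hypothetical, and the tap weights (20, -6, 3, -1, rounded with +16 and
shifted by 5) and the edge mirroring are written from my reading of the
C version, so check them against the real source before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror a row index into 0..n, repeating the edge sample. */
static int mirror(int i, int n)
{
    if (i < 0) return -i - 1;
    if (i > n) return 2 * n + 1 - i;
    return i;
}

/* Fetch the sample in column x at (mirrored) row y. */
static int row(const uint8_t *src, int stride, int x, int y)
{
    return src[mirror(y, 16) * stride + x];
}

/* Hypothetical scalar 16x16 vertical lowpass; reads 17 source rows. */
static void demo_qpel16_v_lowpass(uint8_t *dst, const uint8_t *src,
                                  int dst_stride, int src_stride)
{
    assert(dst != NULL && src != NULL);
    for (int x = 0; x < 16; x++) {
        for (int y = 0; y < 16; y++) {
            int v = 20 * (row(src, src_stride, x, y)
                        + row(src, src_stride, x, y + 1))
                  -  6 * (row(src, src_stride, x, y - 1)
                        + row(src, src_stride, x, y + 2))
                  +  3 * (row(src, src_stride, x, y - 2)
                        + row(src, src_stride, x, y + 3))
                  -      (row(src, src_stride, x, y - 3)
                        + row(src, src_stride, x, y + 4))
                  + 16;                      /* rounding term */

            v = v < 0 ? 0 : v >> 5;          /* clamp low, then scale */
            dst[y * dst_stride + x] = (uint8_t)(v > 255 ? 255 : v);
        }
    }
}
```

The weights sum to 32, so a flat input block comes out unchanged, which
makes a handy sanity check for any vectorised replacement.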
> tomorrow I'll try with some samples
Great.
Cheers,
Brian.