[Ffmpeg-devel] [PATCH] put_mpeg4_qpel16_h_lowpass altivec, take 2
Michael Niedermayer
michaelni
Sun Nov 26 19:23:33 CET 2006
Hi
On Sun, Nov 26, 2006 at 05:23:35PM +0100, Luca Barbato wrote:
[...]
> > +
> > +static void put_pixels16_l2_altivec(uint8_t *dst, const uint8_t *src1,
> > + const uint8_t *src2, int dst_stride, int src_stride1,
> > + int src_stride2, int h)
> > +{
> > + register vector unsigned char src1v, src2v, dstv;
> > + register vector unsigned char tmp1, tmp2, mask, edges, align;
> > + int i;
> > +
> > + for(i=0; i<h; i++) {
> > + /* Unaligned load */
> > + src1v = vec_perm(
> > + vec_ld(0, src1), vec_ld(15, src1), vec_lvsl(0, src1));
> > + src2v = vec_perm(
> > + vec_ld(0, src2), vec_ld(15, src2), vec_lvsl(0, src2));
>
> if the stride is a multiple of 16 you could move vec_lvsl out of the loop
all strides should in general be multiples of 16 on architectures which benefit
from it (yes, there are exceptions in obscure codecs ... but i think this one is
ok, without looking at the code ...)
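
To make the reasoning concrete: vec_lvsl(0, p) builds its permute mask from the
low 4 bits of the address only, so if every row starts at the same offset
within a 16-byte line, a mask computed once before the loop stays valid for all
rows. Below is a rough, portable scalar sketch of that invariant. It is not
AltiVec code and not part of the patch; the helper names align_offset and
rows_share_alignment are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the low 4 bits of an address, i.e. the offset
 * inside a 16-byte line. This is the only input the vec_lvsl permute
 * mask depends on. */
static unsigned align_offset(uintptr_t addr)
{
    return (unsigned)(addr & 15);
}

/* Hypothetical helper: returns 1 if all h row starts (base + i*stride)
 * share the alignment offset of row 0, so that a mask computed once
 * before the loop could be reused for every row; returns 0 otherwise. */
static int rows_share_alignment(uintptr_t base, int stride, int h)
{
    const unsigned first = align_offset(base);
    int i;

    assert(h > 0);
    for (i = 0; i < h; i++) {
        /* Unsigned arithmetic keeps the low 4 bits exact, including
         * for negative strides. */
        uintptr_t row = base + (uintptr_t)i * (uintptr_t)stride;
        if (align_offset(row) != first)
            return 0;
    }
    return 1;
}
```

With a stride that is a multiple of 16 the check holds for any base address;
with a stride such as 24 the offset changes from row to row and the mask would
have to be recomputed inside the loop.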
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is