[FFmpeg-devel] [PATCH 3/3] x86/vp9lpf: use fewer instructions in SPLATB_MIX
Ronald S. Bultje
rsbultje at gmail.com
Mon Aug 4 18:20:37 CEST 2014
Hi,
On Mon, Aug 4, 2014 at 12:17 PM, James Almer <jamrial at gmail.com> wrote:
> On 04/08/14 10:27 AM, Ronald S. Bultje wrote:
> > Hi,
> >
> >
> > On Sun, Aug 3, 2014 at 10:53 PM, James Almer <jamrial at gmail.com> wrote:
> >
> >> Signed-off-by: James Almer <jamrial at gmail.com>
> >> ---
> >> libavcodec/x86/vp9lpf.asm | 5 ++---
> >> 1 file changed, 2 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libavcodec/x86/vp9lpf.asm b/libavcodec/x86/vp9lpf.asm
> >> index c5db0ca..def7d5a 100644
> >> --- a/libavcodec/x86/vp9lpf.asm
> >> +++ b/libavcodec/x86/vp9lpf.asm
> >> @@ -302,9 +302,8 @@ SECTION .text
> >> pshufb %1, %2
> >> %else
> >> punpcklbw %1, %1
> >> - punpcklqdq %1, %1
> >> - pshuflw %1, %1, 0
> >> - pshufhw %1, %1, 0x55
> >> + punpcklwd %1, %1
> >> + punpckldq %1, %1
> >
> >
> > Doesn't this miss the upper half of the register?
> >
> > Ronald
>
> Using the example above the macro
>
> ..............AB (start value)
> punpcklbw
> ............AABB
> punpcklwd
> ........AAAABBBB
> punpckldq
> AAAAAAAABBBBBBBB
Oh I see, it's not a byte-splat. My bad, sorry; please ignore my comment.
Ronald
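
For anyone who wants to double-check the trace quoted above, here is a minimal sketch in Python that models the SSE2 instructions involved at byte level and compares the old tail (punpcklqdq, pshuflw, pshufhw) with the new one (punpcklwd, punpckldq). It is not part of the patch. The helper names (punpckl, pshufw_half, old_sequence, new_sequence, show) are made up for illustration, and the model only covers the non-SSSE3 branch shown in the diff, with %1 used as both source and destination.

```python
# Byte-level model of an XMM register: a list of 16 ints, index 0 = lowest byte.
# All helper names below are hypothetical; they are not FFmpeg or x86inc APIs.

def punpckl(dst, src, width):
    """Model punpckl{bw,wd,dq,qdq}: interleave the low 8 bytes of dst and src
    in units of `width` bytes (1, 2, 4 or 8)."""
    out = []
    for i in range(0, 8, width):
        out += dst[i:i + width] + src[i:i + width]
    return out

def pshufw_half(src, imm, base):
    """Model pshuflw (base=0) / pshufhw (base=8): shuffle the four words of one
    64-bit half according to the 2-bit fields of imm, copy the other half."""
    out = list(src)
    for i in range(4):
        sel = (imm >> (2 * i)) & 3
        out[base + 2 * i:base + 2 * i + 2] = src[base + 2 * sel:base + 2 * sel + 2]
    return out

def old_sequence(r):
    r = punpckl(r, r, 1)            # punpcklbw  %1, %1
    r = punpckl(r, r, 8)            # punpcklqdq %1, %1
    r = pshufw_half(r, 0x00, 0)     # pshuflw    %1, %1, 0
    return pshufw_half(r, 0x55, 8)  # pshufhw    %1, %1, 0x55

def new_sequence(r):
    r = punpckl(r, r, 1)            # punpcklbw  %1, %1
    r = punpckl(r, r, 2)            # punpcklwd  %1, %1
    return punpckl(r, r, 4)         # punpckldq  %1, %1

def show(r):
    """Print most-significant byte first, '.' for zero, as in the trace above."""
    return "".join(chr(b) if b else "." for b in reversed(r))

start = [ord("B"), ord("A")] + [0] * 14   # ..............AB
print(show(start))
print(show(old_sequence(start)))          # AAAAAAAABBBBBBBB
print(show(new_sequence(start)))          # AAAAAAAABBBBBBBB
```

In this model both sequences depend only on the two lowest input bytes, so they agree regardless of what the rest of the register holds.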