[FFmpeg-devel] [PATCH] libavcodec/h264pred: Remove pred8x8_horizontal_8_mmxext

Henrik Gramner henrik at gramner.com
Sun Mar 3 01:31:29 EET 2024


On Sat, Mar 2, 2024 at 10:13 PM Kieran Kunhya <kierank at obe.tv> wrote:
>      SPLATB_LOAD m0, r0+r1*0-1, m2
>      SPLATB_LOAD m1, r0+r1*1-1, m2

This adds an unnecessary shuffle in the SSE2 code, since SPLATB_LOAD
splats to a full register. The easiest fix would probably be to unroll
the macro and remove the extra shuffle by hand.

On x86-64, though, it might be faster to do the 1->8 byte splat with a
GPR multiply by 0x0101010101010101.
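
For illustration, the multiply trick can be sketched in C. This is only a
rough model of the idea, not the asm I would expect in the patch: the helper
names splat_byte_u64 and pred8_row_from_left are made up for this example,
and the row layout (the left neighbour at row[-1]) is an assumption taken
from the r0+r1*N-1 loads quoted above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate one byte into all 8 bytes of a 64-bit
 * word. Each 0x01 byte of the constant contributes one copy of b, and
 * since b < 256 no carries cross byte boundaries. */
static uint64_t splat_byte_u64(uint8_t b)
{
    return (uint64_t)b * UINT64_C(0x0101010101010101);
}

/* Hypothetical helper: fill one 8-pixel row with the pixel to its left,
 * assuming that neighbour is stored at row[-1]. */
static void pred8_row_from_left(uint8_t *row)
{
    uint64_t v = splat_byte_u64(row[-1]);
    memcpy(row, &v, sizeof(v)); /* single 8-byte store, alignment-safe */
}
```

In asm this would roughly correspond to a zero-extending byte load, one imul
with the constant held in a register, and a 64-bit store per row, with no
vector shuffle at all. Whether that actually beats the SIMD version would
need to be benchmarked.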

