[FFmpeg-devel] [PATCH] lavc/pixblockdsp: specialise aligned 16-bit get_pixels
James Almer
jamrial at gmail.com
Thu Jul 25 19:16:21 EEST 2024
On 7/25/2024 12:53 PM, Rémi Denis-Courmont wrote:
> The current code assumes that we have unaligned rows, which hurts on
> platforms with slower unaligned accesses. (Also, this leaves the unrolling
> to the compiler, which it seems to do in practice.)
> ---
> libavcodec/pixblockdsp.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/pixblockdsp.c b/libavcodec/pixblockdsp.c
> index bbbeca1618..1fff244511 100644
> --- a/libavcodec/pixblockdsp.c
> +++ b/libavcodec/pixblockdsp.c
> @@ -26,6 +26,13 @@
>
>  static void get_pixels_16_c(int16_t *restrict block, const uint8_t *pixels,
>                              ptrdiff_t stride)
Is there a way to hint to the compiler that block is 16-byte aligned? GCC
14, at least, emits unaligned loads and stores for these.
> +{
> +    for (int i = 0; i < 8; i++)
> +        AV_COPY128(block + i * 8, pixels + i * stride);
> +}
> +
> +static void get_pixels_unaligned_16_c(int16_t *restrict block,
> +                                      const uint8_t *pixels, ptrdiff_t stride)
>  {
>      AV_COPY128U(block + 0 * 8, pixels + 0 * stride);
>      AV_COPY128U(block + 1 * 8, pixels + 1 * stride);
> @@ -90,7 +97,7 @@ av_cold void ff_pixblockdsp_init(PixblockDSPContext *c, AVCodecContext *avctx)
>      case 10:
>      case 12:
>      case 14:
> -        c->get_pixels_unaligned =
> +        c->get_pixels_unaligned = get_pixels_unaligned_16_c;
>          c->get_pixels = get_pixels_16_c;
>          break;
>      default:
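
On the alignment-hint question above: one option with GCC and Clang is
__builtin_assume_aligned(). What follows is only a minimal stand-alone
sketch, not FFmpeg code. The helper name copy_rows_aligned16 is
hypothetical, plain memcpy() stands in for AV_COPY128, and it assumes that
block and every pixel row really are 16-byte aligned (so stride must be a
multiple of 16). Whether the compiler then emits aligned loads and stores
depends on the target and optimisation level.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of FFmpeg): copy 8 rows of 16 bytes each
 * from pixels into block, promising the compiler that both sides are
 * 16-byte aligned. __builtin_assume_aligned() is a GCC/Clang builtin;
 * passing a pointer that is not actually aligned is undefined behaviour.
 */
static void copy_rows_aligned16(int16_t *restrict block,
                                const uint8_t *pixels, ptrdiff_t stride)
{
    /* Assumption: the caller guarantees block is 16-byte aligned. */
    int16_t *dst = __builtin_assume_aligned(block, 16);

    for (int i = 0; i < 8; i++) {
        /* Assumption: pixels and stride keep every row 16-byte aligned. */
        const uint8_t *src = __builtin_assume_aligned(pixels + i * stride, 16);

        /* A fixed-size memcpy is normally lowered to vector moves. */
        memcpy(dst + i * 8, src, 16);
    }
}
```

If only block can be guaranteed aligned, the hint on src can simply be
dropped; the stores may then be aligned while the loads stay unaligned.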