[FFmpeg-devel] [PATCH] swscale/x86/rgb2rgb: add AVX512ICL versions of shuffle_bytes
Ronald S. Bultje
rsbultje at gmail.com
Sun Jan 26 00:27:33 EET 2025
Hi,
On Sat, Jan 25, 2025 at 9:26 AM Shreesh Adiga <16567adigashreesh at gmail.com> wrote:
> @@ -64,6 +64,18 @@ cglobal shuffle_bytes_%1%2%3%4, 3, 5, 2, src, dst, w, tmp, x
> add dstq, wq
> neg wq
>
> +%if mmsize == 64
> + and xq, mmsize-4
> + shr xq, 2
> + mov tmpd, -1
> + shlx tmpd, tmpd, xd
> + not tmpd
> + kmovw k7, tmpw
> + vmovdqu32 m1{k7}{z}, [srcq + wq]
> + pshufb m1, m0
> + vmovdqu32 [dstq + wq]{k7}, m1
> + lea wq, [wq + 4 * xq]
> +%else
> ;calc scalar loop
> and xq, mmsize-4
> je .loop_simd
> @@ -80,6 +92,7 @@ cglobal shuffle_bytes_%1%2%3%4, 3, 5, 2, src, dst, w, tmp, x
> add wq, 4
> sub xq, 4
> jg .loop_scalar
> +%endif
Would it be possible to fix up the indentation a little bit? This is quite
ugly.
Ronald
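
For readers following the quoted hunk, here is a rough C sketch of the mask arithmetic in the mmsize == 64 branch. It assumes that x holds the byte width on entry, which the quoted context does not show. The names tail_dwords and tail_mask are hypothetical and appear nowhere in the patch. The sketch models only the and/shr/shlx/not/kmovw sequence, not the masked load, the pshufb, or the masked store.

```c
#include <stdint.h>

/* Assumed vector width in bytes for the AVX512 path (mmsize == 64). */
enum { MMSIZE = 64 };

/* Hypothetical helper: number of 4-byte pixels that do not fill a whole
 * 64-byte vector. Mirrors "and xq, mmsize-4" followed by "shr xq, 2". */
static unsigned tail_dwords(uint64_t width_bytes)
{
    return (unsigned)((width_bytes & (MMSIZE - 4)) >> 2); /* 0..15 */
}

/* Hypothetical helper: the 16-bit value moved into k7. Mirrors
 * "mov tmpd, -1", "shlx tmpd, tmpd, xd", "not tmpd", "kmovw k7, tmpw".
 * The low tail_dwords() bits are set, one bit per dword lane. */
static uint16_t tail_mask(uint64_t width_bytes)
{
    unsigned x = tail_dwords(width_bytes);
    uint32_t m = ~(UINT32_MAX << x); /* x < 32, so the shift is well defined */
    return (uint16_t)m;
}
```

Under these assumptions a 20-byte tail enables the low five dword lanes (mask 0x1F), and a width that is a multiple of 64 gives an empty mask, so the masked load and store touch no memory. The final "lea wq, [wq + 4 * xq]" then advances w by 4 * tail_dwords() bytes.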