[FFmpeg-devel] [PATCH 08/11] avcodec/v210enc: add AVX-512 10-bit line pack function
Martin Vignali
martin.vignali at gmail.com
Thu Nov 9 21:42:29 EET 2017
2017-11-09 12:58 GMT+01:00 James Darnley <jdarnley at obe.tv>:
> ---
> libavcodec/x86/v210enc.asm | 5 +++++
> libavcodec/x86/v210enc_init.c | 7 +++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/libavcodec/x86/v210enc.asm b/libavcodec/x86/v210enc.asm
> index 965f2bea3c..5068af27f8 100644
> --- a/libavcodec/x86/v210enc.asm
> +++ b/libavcodec/x86/v210enc.asm
> @@ -103,6 +103,11 @@ INIT_YMM avx2
> v210_planar_pack_10
> %endif
>
> +%if HAVE_AVX512_EXTERNAL
> +INIT_YMM avx512
> +v210_planar_pack_10
> +%endif
> +
> %macro v210_planar_pack_8 0
>
> ; v210_planar_pack_8(const uint8_t *y, const uint8_t *u, const uint8_t *v, uint8_t *dst, ptrdiff_t width)
> diff --git a/libavcodec/x86/v210enc_init.c b/libavcodec/x86/v210enc_init.c
> index e997b4b67a..e8aac373a0 100644
> --- a/libavcodec/x86/v210enc_init.c
> +++ b/libavcodec/x86/v210enc_init.c
> @@ -32,6 +32,9 @@ void ff_v210_planar_pack_10_ssse3(const uint16_t *y, const uint16_t *u,
> void ff_v210_planar_pack_10_avx2(const uint16_t *y, const uint16_t *u,
> const uint16_t *v, uint8_t *dst,
> ptrdiff_t width);
> +void ff_v210_planar_pack_10_avx512(const uint16_t *y, const uint16_t *u,
> + const uint16_t *v, uint8_t *dst,
> + ptrdiff_t width);
>
> av_cold void ff_v210enc_init_x86(V210EncContext *s)
> {
> @@ -51,4 +54,8 @@ av_cold void ff_v210enc_init_x86(V210EncContext *s)
> s->sample_factor_10 = 2;
> s->pack_line_10 = ff_v210_planar_pack_10_avx2;
> }
> +
> + if (EXTERNAL_AVX512(cpu_flags)) {
> + s->pack_line_10 = ff_v210_planar_pack_10_avx512;
> + }
> }
> --
>
>
I don't want to block this patch, but since, as you said for your previous
version, this version is not faster, I'm not sure it's worth applying.
You have already written "real" AVX-512 versions of other functions, and
those are enough to check the rest of your patches.
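
To make the concern concrete, here is a minimal sketch of the dispatch order in the init function, assuming made-up flag bits and stub functions (CPU_*, pack_*, PackContext and pack_init are hypothetical names, not FFmpeg's). Because the AVX-512 check runs last, it replaces the AVX2 pointer whenever the flag is set, even though the function is built with INIT_YMM and so uses the same register width as the AVX2 one:

```c
/* Hypothetical CPU feature bits, for illustration only
 * (these are not FFmpeg's AV_CPU_FLAG_* values). */
enum { CPU_SSSE3 = 1 << 0, CPU_AVX2 = 1 << 1, CPU_AVX512 = 1 << 2 };

typedef int (*pack_fn)(void);

/* Stub "implementations": each one only returns an id so the
 * selected variant can be inspected. */
static int pack_c(void)      { return 0; }
static int pack_ssse3(void)  { return 1; }
static int pack_avx2(void)   { return 2; }
static int pack_avx512(void) { return 3; }

typedef struct PackContext {
    pack_fn pack_line_10;
} PackContext;

/* Later checks overwrite earlier ones, so the last matching
 * flag decides which function ends up in the context. */
static void pack_init(PackContext *s, int cpu_flags)
{
    s->pack_line_10 = pack_c;
    if (cpu_flags & CPU_SSSE3)
        s->pack_line_10 = pack_ssse3;
    if (cpu_flags & CPU_AVX2)
        s->pack_line_10 = pack_avx2;
    if (cpu_flags & CPU_AVX512)
        s->pack_line_10 = pack_avx512;
}
```

With all three flags set, pack_line_10 ends up pointing at pack_avx512; with only SSSE3 and AVX2 set, it stays on pack_avx2. So on AVX-512 machines the new function is always chosen, whether or not it is faster.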
Martin