[FFmpeg-devel] [PATCH v2 1/3] aarch64/vvc: Add w_avg

Zhao Zhili quinkblack at foxmail.com
Thu Sep 26 15:00:40 EEST 2024



> On Sep 26, 2024, at 19:25, Martin Storsjö <martin at martin.st> wrote:
> 
> On Mon, 23 Sep 2024, Zhao Zhili wrote:
> 
>> From: Zhao Zhili <zhilizhao at tencent.com>
>> 
>> w_avg_8_2x2_c:                                           0.0 ( 0.00x)
>> w_avg_8_2x2_neon:                                        0.0 ( 0.00x)
>> w_avg_8_4x4_c:                                           0.2 ( 1.00x)
>> w_avg_8_4x4_neon:                                        0.0 ( 0.00x)
>> w_avg_8_8x8_c:                                           1.2 ( 1.00x)
>> w_avg_8_8x8_neon:                                        0.2 ( 5.00x)
>> w_avg_8_16x16_c:                                         4.2 ( 1.00x)
>> w_avg_8_16x16_neon:                                      0.8 ( 5.67x)
>> w_avg_8_32x32_c:                                        16.2 ( 1.00x)
>> w_avg_8_32x32_neon:                                      2.5 ( 6.50x)
>> w_avg_8_64x64_c:                                        64.5 ( 1.00x)
>> w_avg_8_64x64_neon:                                      9.0 ( 7.17x)
>> w_avg_8_128x128_c:                                     269.5 ( 1.00x)
>> w_avg_8_128x128_neon:                                   35.5 ( 7.59x)
>> w_avg_10_2x2_c:                                          0.2 ( 1.00x)
>> w_avg_10_2x2_neon:                                       0.2 ( 1.00x)
>> w_avg_10_4x4_c:                                          0.2 ( 1.00x)
>> w_avg_10_4x4_neon:                                       0.2 ( 1.00x)
>> w_avg_10_8x8_c:                                          1.0 ( 1.00x)
>> w_avg_10_8x8_neon:                                       0.2 ( 4.00x)
>> w_avg_10_16x16_c:                                        4.2 ( 1.00x)
>> w_avg_10_16x16_neon:                                     0.8 ( 5.67x)
>> w_avg_10_32x32_c:                                       16.2 ( 1.00x)
>> w_avg_10_32x32_neon:                                     2.5 ( 6.50x)
>> w_avg_10_64x64_c:                                       66.2 ( 1.00x)
>> w_avg_10_64x64_neon:                                    10.0 ( 6.62x)
>> w_avg_10_128x128_c:                                    277.8 ( 1.00x)
>> w_avg_10_128x128_neon:                                  39.8 ( 6.99x)
>> w_avg_12_2x2_c:                                          0.0 ( 0.00x)
>> w_avg_12_2x2_neon:                                       0.2 ( 0.00x)
>> w_avg_12_4x4_c:                                          0.2 ( 1.00x)
>> w_avg_12_4x4_neon:                                       0.0 ( 0.00x)
>> w_avg_12_8x8_c:                                          1.2 ( 1.00x)
>> w_avg_12_8x8_neon:                                       0.5 ( 2.50x)
>> w_avg_12_16x16_c:                                        4.8 ( 1.00x)
>> w_avg_12_16x16_neon:                                     0.8 ( 6.33x)
>> w_avg_12_32x32_c:                                       17.0 ( 1.00x)
>> w_avg_12_32x32_neon:                                     2.8 ( 6.18x)
>> w_avg_12_64x64_c:                                       64.0 ( 1.00x)
>> w_avg_12_64x64_neon:                                    10.0 ( 6.40x)
>> w_avg_12_128x128_c:                                    269.2 ( 1.00x)
>> w_avg_12_128x128_neon:                                  42.0 ( 6.41x)
>> ---
>> libavcodec/aarch64/vvc/dsp_init.c | 34 +++++++++++
>> libavcodec/aarch64/vvc/inter.S    | 99 +++++++++++++++++++++++++------
>> 2 files changed, 116 insertions(+), 17 deletions(-)
>> 
>> diff --git a/libavcodec/aarch64/vvc/dsp_init.c b/libavcodec/aarch64/vvc/dsp_init.c
>> index ad767d17e2..b39ebb83fc 100644
>> --- a/libavcodec/aarch64/vvc/dsp_init.c
>> +++ b/libavcodec/aarch64/vvc/dsp_init.c
>> @@ -52,6 +52,37 @@ void ff_vvc_avg_12_neon(uint8_t *dst, ptrdiff_t dst_stride,
>>                        const int16_t *src0, const int16_t *src1, int width,
>>                        int height);
>> 
>> +void ff_vvc_w_avg_8_neon(uint8_t *_dst, const ptrdiff_t _dst_stride,
>> +                         const int16_t *src0, const int16_t *src1,
>> +                         const int width, const int height,
>> +                         uintptr_t w0_w1, uintptr_t offset_shift);
> 
> Including "const" on scalar parameters is entirely redundant, and we don't prescribe use of that elsewhere in ffmpeg, and just makes the whole declaration more noisy.

I see these “const” make clang-tidy not happy. They are here to keep consistent with the prototypes
in vvc/dsp.h. There are three options:

1. Keep “const” as current state
2. Drop “const” only for these new functions
3. Remove “const” from vvc/dsp.h and all implementations

I can’t decide which way to go.

> 
>> +void ff_vvc_w_avg_10_neon(uint8_t *_dst, const ptrdiff_t _dst_stride,
>> +                         const int16_t *src0, const int16_t *src1,
>> +                         const int width, const int height,
>> +                         uintptr_t w0_w1, uintptr_t offset_shift);
>> +void ff_vvc_w_avg_12_neon(uint8_t *_dst, const ptrdiff_t _dst_stride,
>> +                          const int16_t *src0, const int16_t *src1,
>> +                          const int width, const int height,
>> +                          uintptr_t w0_w1, uintptr_t offset_shift);
>> +/* When passing arguments to functions, Apple platforms diverge from the ARM64
>> + * standard ABI, that we can't implement the function directly in asm.
>> + */
> 
> It's fully possible to implement that in assembly, but it usually requires ugly ifdefs.
> 
> That said, I'm ok with this kind of wrapper, as it avoids the problem kinda neatly, but ifdefs in the assembly can also be needed at times.
> 
>> +#define W_AVG_FUN(bit_depth) \
>> +static void vvc_w_avg_ ## bit_depth(uint8_t *dst, const ptrdiff_t dst_stride, \
>> +    const int16_t *src0, const int16_t *src1, const int width, const int height, \
>> +    const int denom, const int w0, const int w1, const int o0, const int o1) \
>> +{ \
>> +    const int shift = denom + FFMAX(3, 15 - bit_depth); \
>> +    const int offset = ((o0 + o1) * (1 << (bit_depth - 8)) + 1) * (1 << (shift - 1)); \
> 
> Same about the superfluous "const" everywhere. For local variables, I guess it can be argued that marking them as const can aid readability in some way, but I don't think we generally prescribe doing that.
> 
> The rest of the patch seems fine, thanks!
> 
> // Martin
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org <mailto:ffmpeg-devel at ffmpeg.org>
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org <mailto:ffmpeg-devel-request at ffmpeg.org> with subject "unsubscribe".



More information about the ffmpeg-devel mailing list