[FFmpeg-devel] [PATCH v3 1/3] checkasm/diracdsp: test add_dirac_obmc

Thu Nov 14 18:29:59 EET 2024

Hi,

On Thu, Nov 14, 2024 at 10:18 AM James Almer <jamrial at gmail.com> wrote:

> On 11/14/2024 11:30 AM, Kyosuke Kawakami wrote:
> > Signed-off-by: Kyosuke Kawakami <kawakami150708 at gmail.com>
> > ---
> >   tests/checkasm/Makefile   |  1 +
> >   tests/checkasm/checkasm.c |  3 ++
> >   tests/checkasm/checkasm.h |  1 +
> >   tests/checkasm/diracdsp.c | 86 +++++++++++++++++++++++++++++++++++++++
> >   tests/fate/checkasm.mak   |  1 +
> >   5 files changed, 92 insertions(+)
> >   create mode 100644 tests/checkasm/diracdsp.c
>
> [...]
>
> > diff --git a/tests/checkasm/diracdsp.c b/tests/checkasm/diracdsp.c
> > new file mode 100644
> > index 0000000000..8833c2d223
> > --- /dev/null
> > +++ b/tests/checkasm/diracdsp.c
> > @@ -0,0 +1,86 @@
> > +/*
> > + * Copyright (c) 2024 Kyosuke Kawakami
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> along
> > + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> > + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> > + */
> > +
> > +#include "checkasm.h"
> > +
> > +#include "libavcodec/diracdsp.h"
> > +
> > +#include "libavutil/intreadwrite.h"
> > +#include "libavutil/mem_internal.h"
> > +
> > +#define RANDOMIZE_DESTS(name, size)             \
> > +    do {                                        \
> > +        int i;                                  \
> > +        for (i = 0; i < size; ++i) {            \
> > +            uint16_t r = rnd();                 \
> > +            AV_WN16A(name##0 + i, r);           \
> > +            AV_WN16A(name##1 + i, r);           \
> > +        }                                       \
> > +    } while (0)
> > +
> > +#define RANDOMIZE_BUFFER8(name, size)         \
> > +    do {                                      \
> > +        int i;                                \
> > +        for (i = 0; i < size; ++i) {          \
> > +            uint8_t r = rnd();                \
> > +            name[i] = r;                      \
> > +        }                                     \
> > +    } while (0)
> > +
> > +#define OBMC_STRIDE 32
> > +#define XBLEN_MAX 32
> > +#define YBLEN_MAX 64
> > +
> > +static void check_add_obmc(size_t func_index, int xblen)
> > +{
> > +    LOCAL_ALIGNED_8(uint8_t, src, [XBLEN_MAX * YBLEN_MAX]);
> > +    LOCAL_ALIGNED_16(uint16_t, dst0, [XBLEN_MAX * YBLEN_MAX]);
> > +    LOCAL_ALIGNED_16(uint16_t, dst1, [XBLEN_MAX * YBLEN_MAX]);
>
> The loads in the asm functions use movdqu, so i assume the buffers in
> the decoder are not 16 byte aligned. To ensure future implementations
> don't mistakenly use aligned loads, you could make this be:
>
> LOCAL_ALIGNED_16(uint16_t, _dst0, [XBLEN_MAX * YBLEN_MAX + 4]);
> LOCAL_ALIGNED_16(uint16_t, _dst1, [XBLEN_MAX * YBLEN_MAX + 4]);
> uint16_t *dst0 = _dst0 + 4, *dst1 = _dst1 + 4;
>
> Using LOCAL_ALIGNED_8() could also end up with a 16 byte aligned buffer,
> so the above will make sure the buffer is 8 byte aligned.
>
> > +    LOCAL_ALIGNED_8(uint8_t, obmc_weight, [XBLEN_MAX * YBLEN_MAX]);
> > +
> > +    int yblen;
> > +    DiracDSPContext h;
> > +
> > +    ff_diracdsp_init(&h);
> > +
> > +    if (check_func(h.add_dirac_obmc[func_index],
> "diracdsp.add_dirac_obmc_%d", xblen)) {
> > +        declare_func(void, uint16_t*, const uint8_t*, int, const
> uint8_t *, int);
> > +
> > +        yblen = 1 + (rnd() % YBLEN_MAX);
>
> Use YBLEN_MAX directly. No real gain in using randomized height, and
> this way every --bench run will give wildly different results.
>

The bench should use max_height, but the test should use a randomized
height, IMO.

Ronald