[FFmpeg-devel] [PATCH] v2: lavu/pixdesc: favour formats where depth and subsampling exactly match

Michael Niedermayer michael at niedermayer.cc
Thu Sep 15 00:08:16 EEST 2022


On Sat, Sep 10, 2022 at 12:23:30PM -0700, Philip Langdale wrote:
> Since introducing the various packed formats used by VAAPI (and p012),
> we've noticed that there's actually a gap in how
> av_find_best_pix_fmt_of_2 works. It doesn't actually assign any value
> to having the same bit depth as the source format, when comparing
> against formats with a higher bit depth. This usually doesn't matter,
> because av_get_padded_bits_per_pixel() will account for it.
> 
> However, as many of these formats use padding internally, we find that
> av_get_padded_bits_per_pixel() actually returns the same value for the
> 10 bit, 12 bit, 16 bit flavours, etc. In these tied situations, we end
> up just picking the first of the two provided formats, even if the
> second one should be preferred because it matches the actual bit depth.
> 
> This bug already existed if you tried to compare yuv420p10 against p016
> and p010, for example, but it simply hadn't come up before so we never
> noticed.
> 
> But now, we actually got a situation in the VAAPI VP9 decoder where it
> offers both p010 and p012 because Profile 3 could be either depth and
> ends up picking p012 for 10 bit content due to the ordering of the
> testing.
> 
> In addition, in the process of testing the fix, I realised we have the
> same gap when it comes to chroma subsampling - we do not favour a
> format that has exactly the same subsampling vs one with less
> subsampling when all else is equal.
> 
> To fix this, I'm introducing a small score penalty if the bit depth or
> subsampling doesn't exactly match the source format. This will break
> the tie in favour of the format with the exact match, but not offset
> any of the other scoring penalties we already have.
> 
> I have added a set of tests around these formats which will fail
> without this fix.
> 
> v2: Rework penalty system to scale penalty based on how different the
>     two formats are, and add new loss categories for them.
> 
> Signed-off-by: Philip Langdale <philipl at overt.org>
> ---
>  libavutil/pixdesc.c           |  38 +++++++++++-
>  libavutil/pixdesc.h           |  15 +++--
>  libavutil/tests/pixfmt_best.c | 111 ++++++++++++++++++++++++++++------
>  tests/ref/fate/pixfmt_best    |   2 +-
>  4 files changed, 139 insertions(+), 27 deletions(-)
> 
> diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> index d7c6ebfdc4..143c17e64a 100644
> --- a/libavutil/pixdesc.c
> +++ b/libavutil/pixdesc.c

> @@ -3004,6 +3004,11 @@ static int get_pix_fmt_score(enum AVPixelFormat dst_pix_fmt,
>      if ((ret = get_pix_fmt_depth(&dst_min_depth, &dst_max_depth, dst_pix_fmt)) < 0)
>          return -3;
>  
> +    // Favour formats where bit depth exactly matches. If all other scoring is
> +    // equal, we'd rather use a lower bit depth that matches the source.
> +    if (src_min_depth != dst_min_depth || src_max_depth != dst_max_depth)
> +        score -= 64;

why 64 ?



> +
>      src_color = get_color_type(src_desc);
>      dst_color = get_color_type(dst_desc);
>      if (dst_pix_fmt == AV_PIX_FMT_PAL8)
> @@ -3013,9 +3018,16 @@ static int get_pix_fmt_score(enum AVPixelFormat dst_pix_fmt,
>  
>      for (i = 0; i < nb_components; i++) {
>          int depth_minus1 = (dst_pix_fmt == AV_PIX_FMT_PAL8) ? 7/nb_components : (dst_desc->comp[i].depth - 1);
> -        if (src_desc->comp[i].depth - 1 > depth_minus1 && (consider & FF_LOSS_DEPTH)) {
> +        int depth_delta = src_desc->comp[i].depth - 1 - depth_minus1;
> +        if (depth_delta > 0 && (consider & FF_LOSS_DEPTH)) {
>              loss |= FF_LOSS_DEPTH;
>              score -= 65536 >> depth_minus1;
> +        } else if (depth_delta < 0 && (consider & FF_LOSS_EXCESS_DEPTH)) {

> +            // Favour formats where bit depth exactly matches. If all other
> +            // scoring is equal, we'd rather use the bit depth that most closely
> +            // matches the source.

ok


> +            loss |= FF_LOSS_EXCESS_DEPTH;
> +            score -= 1 << -depth_delta;

but does that do that ?
a 1bpp -> 16bpp has a considerable -depth_delta

do we need the << at all ?


>          }
>      }
>  
> @@ -3024,10 +3036,12 @@ static int get_pix_fmt_score(enum AVPixelFormat dst_pix_fmt,
>              loss |= FF_LOSS_RESOLUTION;
>              score -= 256 << dst_desc->log2_chroma_w;
>          }
> +
>          if (dst_desc->log2_chroma_h > src_desc->log2_chroma_h) {
>              loss |= FF_LOSS_RESOLUTION;
>              score -= 256 << dst_desc->log2_chroma_h;
>          }
> +
>          // don't favor 422 over 420 if downsampling is needed, because 420 has much better support on the decoder side
>          if (dst_desc->log2_chroma_w == 1 && src_desc->log2_chroma_w == 0 &&
>              dst_desc->log2_chroma_h == 1 && src_desc->log2_chroma_h == 0 ) {
> @@ -3035,6 +3049,28 @@ static int get_pix_fmt_score(enum AVPixelFormat dst_pix_fmt,
>          }
>      }
>  
> +    if (consider & FF_LOSS_EXCESS_RESOLUTION) {
> +        // Favour formats where chroma subsampling exactly matches. If all other
> +        // scoring is equal, we'd rather use the subsampling that most closely
> +        // matches the source.
> +        if (dst_desc->log2_chroma_w < src_desc->log2_chroma_w) {
> +            loss |= FF_LOSS_EXCESS_RESOLUTION;
> +            score -= 32 << (src_desc->log2_chroma_w - dst_desc->log2_chroma_w);
> +        }
> +
> +        if (dst_desc->log2_chroma_h < src_desc->log2_chroma_h) {
> +            loss |= FF_LOSS_EXCESS_RESOLUTION;
> +            score -= 32 << (src_desc->log2_chroma_h - dst_desc->log2_chroma_h);
> +        }
> +

> +        // don't favour 411 over 420, because 420 has much better support on the
> +        // decoder side.
> +        if (dst_desc->log2_chroma_w == 1 && src_desc->log2_chroma_w == 2 &&
> +            dst_desc->log2_chroma_h == 1 && src_desc->log2_chroma_h == 2) {
> +                score += 128;
> +            }

I was thinking more generically that more symmetric resolution (difference)
may be preferrable

but this is fine too
theres an infinite number of ways to do this
its probably best to move forward and wait for some bug report to provide
real key points the heuristic can be tuned toward than inventing cases

also keeping the heuristic simple should be a goal

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

"I am not trying to be anyone's saviour, I'm trying to think about the
 future and not be sad" - Elon Musk

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20220914/6515d2b3/attachment.sig>


More information about the ffmpeg-devel mailing list