[FFmpeg-devel] [PATCH 3/6] lavc/vvc: Store MIP information over entire CU area

Tue Dec 3 20:09:06 EET 2024

Hi,

Thanks for your review.

On 03/12/2024 10:04, Nuo Mi wrote:
> Hi Frank,
> Thank you for the patch
> 
> On Sat, Nov 30, 2024 at 8:13 PM Frank Plowman <post at frankplowman.com> wrote:
> 
>> On 30/11/2024 06:46, Nuo Mi wrote:
>>> On Fri, Nov 29, 2024 at 6:19 AM Frank Plowman <post at frankplowman.com>
>> wrote:
>>>
>>>> Previously, the code only stored the MIP mode and transpose flag in the
>>>> relevant tables at the top-left corner of the CU.  This information ends
>>>> up being retrieved in ff_vvc_intra_pred_* not based on the CU position
>>>> but instead the transform unit position (specifically, using the x0 and
>>>> y0 from get_luma_predict_unit).  There might be multiple transform units
>>>> in a CU, hence the top-left corner of the transform unit might not
>>>> coincide with the top-left corner of the CU.  Consequently, we need to
>>>> store the MIP information at all positions in the CU, not only its
>>>> top-left corner, as we already do for the MIP flag.
>>>>
>>>> Signed-off-by: Frank Plowman <post at frankplowman.com>
>>>> ---
>>>>  libavcodec/vvc/ctu.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/libavcodec/vvc/ctu.c b/libavcodec/vvc/ctu.c
>>>> index 1e06119cfd..0030938cf5 100644
>>>> --- a/libavcodec/vvc/ctu.c
>>>> +++ b/libavcodec/vvc/ctu.c
>>>> @@ -975,8 +975,8 @@ static void intra_luma_pred_modes(VVCLocalContext
>> *lc)
>>>>              for (int y = 0; y < (cb_height>>log2_min_cb_size); y++) {
>>>>                  int width = cb_width>>log2_min_cb_size;
>>>>                  memset(&fc->tab.imf[x],  cu->intra_mip_flag, width);
>>>> -                fc->tab.imtf[x] = intra_mip_transposed_flag;
>>>> -                fc->tab.imm[x]  = intra_mip_mode;
>>>> +                memset(&fc->tab.imtf[x], intra_mip_transposed_flag,
>>>> width);
>>>> +                memset(&fc->tab.imm[x], intra_mip_mode, width);
>>>
>>>  intra_mip_mode is 4 bits, 2 flags are 2 bits. maybe we can use a uint8_t
>>> for 3 fields,
>>> We only need 1 memset and save 2/3 memory.
>>
>> I've implemented this (patch attached, to be applied atop the set), but
>> it's not as straightforward as it may seem.  In particular, because the
>> tables are read directly from when determining which CABAC context to
>> use for these flags, we have to add quite a lot of extra code in cabac.c
>> to support this special case where the MIP information is a bit field.
>> In my implementation, this was done by adding this coerce_to_bool
>> parameter to get_inc and get_top.  This does actually save a moderate
>> amount of memory though, ~1MB for 4K and ~256kB for 1080p.
>>
> A coding block can be as small as 4x4, so for a 1080p frame, the memory
> required is approximately  2*1920*1080/(4*4) ~= 256 kB.
> However, with a maximum delay of 16 frames, the total memory usage will be
> 256kB * 16=4 MB.
> 
> In your patch, coerce_to_bool is set to 1 only when we are in
> ff_vvc_intra_mip_flag.
> How about defining get_mip_inc like this? This way, we can avoid changing
> get_inc
> 
> static av_always_inline
> uint8_t get_mip_inc (VVCLocalContext *lc, const uint8_t *ctx)
> {
>     uint8_t left = 0, top = 0;
>     get_left_top(lc, &left, &top, lc->cu->x0, lc->cu->y0, ctx, ctx);
>     return !!left + !!top;
> }

What about instead taking a function pointer as an argument to
get_inc/get_left_top which can be used to modify the value read from the
table before using it, or NULL to apply no such modification?  At the
moment, this would only be used for the MIP flag, but if we wanted to
pack other tables in order to save more memory, we could then reuse that
argument in other ways and avoid duplicating the functions for every
case where the in-memory representation differs.  It's not that bad for
get_inc, but if we ever wanted to change the representation of something
which uses get_left_top directly, we would have to duplicate a fair
amount of code.

> 
> BTW: Using pack_mip_info/unpack_mip_info might be shorter and more concise
> than structure_mip_info/destructure_mip_info.

Agreed, will do.

> 
>>
>>>
>>>>
>>>
>>>                  x += pps->min_cb_width;
>>>>              }
>>>>              cu->intra_pred_mode_y = intra_mip_mode;
>>>> --
>>>> 2.47.0
>>>>
>>
>>
>> --
>> Frank
>>

-- 
Frank