[FFmpeg-devel] [PATCH] Update HDR10+ metadata structure.

Mark Thompson sw at jkqxz.net
Thu Feb 20 01:50:47 EET 2020


On 10/02/2020 19:38, Mohammad Izadi wrote:
> On Sun, Feb 9, 2020 at 8:31 AM Mark Thompson <sw at jkqxz.net> wrote:
> 
>> On 07/02/2020 17:46, Mohammad Izadi wrote:
>>> From: Mohammad Izadi <izadi at google.com>
>>>
>>> Trying to read HDR10+ metadata from HEVC/SEI and pass it to packet side
>> data in the follow-up CLs.
>>> ---
>>>  libavutil/hdr_dynamic_metadata.c | 165 +++++++++++++++++++++++++++++++
>>>  libavutil/hdr_dynamic_metadata.h |  12 ++-
>>>  libavutil/version.h              |   2 +-
>>>  3 files changed, 177 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/libavutil/hdr_dynamic_metadata.c
>> b/libavutil/hdr_dynamic_metadata.c
>>> index 0fa1ee82de..a988bcd2d5 100644
>>> --- a/libavutil/hdr_dynamic_metadata.c
>>> +++ b/libavutil/hdr_dynamic_metadata.c
>>> ...
>>> +static const int32_t peak_luminance_den = 15;
>>> +static const int64_t rgb_den = 100000;
>>> +static const int32_t fraction_pixel_den = 1000;
>>> +static const int32_t knee_point_den = 4095;
>>> +static const int32_t bezier_anchor_den = 1023;
>>> +static const int32_t saturation_weight_den = 8;
>>
>> It would probably be clearer just to put these constants inline; there
>> isn't really any use to having the values standalone.
>>
> You are right. Actually, I am going to push the encode function in my next
> CL and the static vars will be shared between both encode and decode
> function.

My point is that separating the constants is actively unhelpful in matching the standard to the code.

>From the standard you can read in one paragraph:

"knee_point_x[ w ] –   specifies the x coordinate of the separation point between the linear part and thecurved part of the tone mapping function. The value of knee_point_x[ w ] shall be in the range of 0 to 4,095, inclusive, where 0 maps to 0 cd/m2 and the full range of 4,095 maps to the maximum of the scene maximum luminance and the target peak luminance in cd/m2."

But you've split that into two distinct locations by putting knee_point_den as a constant here and then the code below which refers to it when it would be clearer to just put the constant in the code in the same way that the standard does.

>>> +
>>>  AVDynamicHDRPlus *av_dynamic_hdr_plus_alloc(size_t *size)
>>>  {
>>>      AVDynamicHDRPlus *hdr_plus = av_mallocz(sizeof(AVDynamicHDRPlus));
>>> @@ -45,3 +54,159 @@ AVDynamicHDRPlus
>> *av_dynamic_hdr_plus_create_side_data(AVFrame *frame)
>>>
>>>      return (AVDynamicHDRPlus *)side_data->data;
>>>  }
>>> +
>>> +int av_dynamic_hdr_plus_decode(const uint8_t *data, size_t size,
>>> +                            AVDynamicHDRPlus *s)
>>
>> Why is this function being added to libavutil?  It looks like it's meant
>> for decoding UDR SEI messages only, so it should probably be in the
>> relevant place in libavcodec.
>>
> I have used this function in my local code to decode HDR10+ of SEI message
> (libavcodec) and also HDR10+ in matroska container (ibavformat). I will
> push them in my next CLs.

Um, so?  The parsing code can still go next to its uses in libavcodec.

>>> +{
>>> +    int w, i, j;
>>> +    GetBitContext gb;
>>> +    if (!data)
>>> +        return AVERROR_INVALIDDATA;
>>> +
>>> +    int ret = init_get_bits8(&gb, data, size);
>>> +    if (ret < 0)
>>> +        return AVERROR_INVALIDDATA;
>>> +
>>> +    if (get_bits_left(&gb) < 2)
>>> +        return AVERROR_INVALIDDATA;
>>> +    s->num_windows = get_bits(&gb, 2);
>>> +
>>> +    if (s->num_windows < 1 || s->num_windows > 3)
>>> +        return AVERROR_INVALIDDATA;
>>> +
>>> +    if (get_bits_left(&gb) < ((19 * 8 + 1) * (s->num_windows - 1)))
>>> +        return AVERROR_INVALIDDATA;
>>> +    for (w = 1; w < s->num_windows; w++) {
>>> +        s->params[w].window_upper_left_corner_x.num = get_bits(&gb, 16);
>>> +        s->params[w].window_upper_left_corner_y.num = get_bits(&gb, 16);
>>> +        s->params[w].window_lower_right_corner_x.num = get_bits(&gb,
>> 16);
>>> +        s->params[w].window_lower_right_corner_y.num = get_bits(&gb,
>> 16);
>>> +        // The corners are set to absolute coordinates here. They
>> should be
>>> +        // converted to the relative coordinates (in [0, 1]) in the
>> decoder.
>>> +        s->params[w].window_upper_left_corner_x.den = 1;
>>> +        s->params[w].window_upper_left_corner_y.den = 1;
>>> +        s->params[w].window_lower_right_corner_x.den = 1;
>>> +        s->params[w].window_lower_right_corner_y.den = 1;
>>> +
>>> +        s->params[w].center_of_ellipse_x = get_bits(&gb, 16);
>>> +        s->params[w].center_of_ellipse_y = get_bits(&gb, 16);
>>> +        s->params[w].rotation_angle = get_bits(&gb, 8);
>>
>> You're range-checking some fields here but not others.  Should everything
>> be verified against the constraints so that the comments on the structure
>> in the header file are actually true?
>>
> The intention is to only pass through HDR10+ from src file to dst file. The
> interpretation of the data is on the user/caller. I think it is better not
> to drop the metadata by verification. I did some range checking like row or
> col in order to simply avoid accessing out of memory (avoid indices that
> are out of array size).

If this is the intent then you probably need to modify all of the documentation on the AVDynamicHDRPlus to make clear that invalid values are acceptable in that structure and that other code will need to be able to handle them at least to the level of not-UB.

>>> +        s->params[w].semimajor_axis_internal_ellipse = get_bits(&gb,
>> 16);
>>> +        s->params[w].semimajor_axis_external_ellipse = get_bits(&gb,
>> 16);
>>> +        s->params[w].semiminor_axis_external_ellipse = get_bits(&gb,
>> 16);
>>> +        s->params[w].overlap_process_option = get_bits1(&gb);
>>> +    }
>>> +
>>> ...
>>> +
>>> +        if (get_bits_left(&gb) < 1)
>>> +            return AVERROR(EINVAL);
>>> +        s->params[w].color_saturation_mapping_flag = get_bits1(&gb);
>>> +        if (s->params[w].color_saturation_mapping_flag) {
>>> +            if (get_bits_left(&gb) < 6)
>>> +                return AVERROR(EINVAL);
>>> +            s->params[w].color_saturation_weight.num = get_bits(&gb, 6);
>>> +            s->params[w].color_saturation_weight.den =
>> saturation_weight_den;
>>> +        }
>>> +    }
>>> +
>>
>> It might be sensible to set the unused members of the structure to
>> something fixed rather than leaving them uninitialised, so that attempting
>> to copy the structure isn't UB.
>>
> If some members are not set by user and therefore they dont get a value in
> the decode function, they will never be accessed in the encode function or
> by user based on the HDR10+ logic. So it really doesn't matter to
> initialize them. Based on the logic, there is no way to interpret them.> Please note that country_code is always unused and should be removed, but
> we cannot as it breaks API. It would great if you allow to remove it. I
> wrote the code and I am sure no one used the code yet. I am the only user
> and you can check it in github.

You could mark it deprecated, and then maybe it could be removed on the next major version bump if there are no users.  (It can't actually be removed before then because it would change the layout of the structure.)

>>> +    return 0;
>>> +}
>>> diff --git a/libavutil/hdr_dynamic_metadata.h
>> b/libavutil/hdr_dynamic_metadata.h
>>> index 2d72de56ae..28c657481f 100644
>>> --- a/libavutil/hdr_dynamic_metadata.h
>>> +++ b/libavutil/hdr_dynamic_metadata.h
>>> @@ -265,7 +265,7 @@ typedef struct AVDynamicHDRPlus {
>>>
>>>      /**
>>>       * The nominal maximum display luminance of the targeted system
>> display,
>>> -     * in units of 0.0001 candelas per square metre. The value shall be
>> in
>>> +     * in units of 1 candelas per square metre. The value shall be in
>>
>> This was a error in the spec itself, I think?  It should probably be
>> isolated into its own commit with suitable references explaining what
>> happened.
>>
> I think it's wrong. It more makes sense now. The spec is updated in March
> 2019,
> https://www.atsc.org/wp-content/uploads/2018/02/A341S34-1-582r4-A341-Amendment-2094-40.pdf.
> I changed it based on the updated version.

Please turn this into a separate commit including the explanation that it was a typo in the standard.  (That can be applied separately without needing anything else using it.)

>>> ...>>> + * @param size The size of the user data registered ITU-T T.35 (SMPTE
>> ST 2094-40).
>>> + * @param s The decoded AVDynamicHDRPlus structure.
>>> + *
>>> + * @return 0 if succeed. Otherwise, returns the appropriate AVERROR.
>>> + */
>>> +int av_dynamic_hdr_plus_decode(const uint8_t *data, size_t size,
>> AVDynamicHDRPlus *s);
>>> +
>>> ...
>>
>> It would help to review if this were included in the decoder so that it
>> can actually be tested.
>>
> Can you please explain a bit more? :)
It is very hard to run a function to observe its behaviour in-use when nothing calls it.

Thanks,

- Mark


More information about the ffmpeg-devel mailing list