[FFmpeg-devel] [PATCH v6 4/9] avcodec: add cbs for h266/vvc
Mark Thompson
sw at jkqxz.net
Thu Feb 18 23:53:01 EET 2021
On 18/02/2021 15:14, Nuo Mi wrote:
> On Thu, Feb 18, 2021 at 8:04 AM Mark Thompson <sw at jkqxz.net> wrote:
>
>> On 17/02/2021 01:51, Nuo Mi wrote:
>>> ---
>>> configure | 2 +
>>> libavcodec/Makefile | 1 +
>>> libavcodec/cbs.c | 6 +
>>> libavcodec/cbs_h2645.c | 541 ++++-
>>> libavcodec/cbs_h266.h | 817 +++++++
>>> libavcodec/cbs_h266_syntax_template.c | 3006 +++++++++++++++++++++++++
>>> libavcodec/cbs_internal.h | 3 +-
>>> libavcodec/cbs_sei.c | 29 +
>>> 8 files changed, 4397 insertions(+), 8 deletions(-)
>>> create mode 100644 libavcodec/cbs_h266.h
>>> create mode 100644 libavcodec/cbs_h266_syntax_template.c
>>>
>>> ...
>>> @@ -1327,7 +1717,7 @@ static int
>> cbs_h2645_assemble_fragment(CodedBitstreamContext *ctx,
>>> {
>>> uint8_t *data;
>>> size_t max_size, dp, sp;
>>> - int err, i, zero_run;
>>> + int err, i, zero_run, h266_au_start = 0;
>>>
>>> for (i = 0; i < frag->nb_units; i++) {
>>> // Data should already all have been written when we get here.
>>> @@ -1343,7 +1733,10 @@ static int
>> cbs_h2645_assemble_fragment(CodedBitstreamContext *ctx,
>>> data = av_realloc(NULL, max_size + AV_INPUT_BUFFER_PADDING_SIZE);
>>> if (!data)
>>> return AVERROR(ENOMEM);
>>> -
>>> + if (ctx->codec->codec_id == AV_CODEC_ID_VVC) {
>>> + CodedBitstreamH266Context *h266 = ctx->priv_data;
>>> + h266_au_start = h266_is_au_start(h266, frag, ctx->log_ctx) > 0;
>>> + }
>>
>> It is unclear to me why this wants a special case.
>>
>> The current logic is that we are writing an AU, so the first NAL unit in
>> it is necessarily an AU start and subsequent NAL units are not?
>>
> H.266 AU contains one or more PU(3.105). One PU contains one coded picture.
> Only the first NAL unit of an AU needs the zero_byte(B.2.2)
> H.265 has no PU, each AU has one coded picture(3.1)
> H.266's PU is the H.265's AU. In vvc_parser, we split bitstream to PU.
Er, I don't think this is right: an AVPacket should contain all of the pictures for a single timestamp - a decoder can then output at most one picture from it, for the selected spatial layer.
Though I don't know how the container formats are defined here, for other codecs having multiple packets with the same timestamps causes ugly problems in muxing/demuxing which we avoid by saying you aren't allowed to do that.
(See AV1, which has the same features.)
>>> dp = 0;
>>> for (i = 0; i < frag->nb_units; i++) {
>>> CodedBitstreamUnit *unit = &frag->units[i];
>>> @@ -1356,7 +1749,7 @@ static int
>> cbs_h2645_assemble_fragment(CodedBitstreamContext *ctx,
>>> frag->data_bit_padding = unit->data_bit_padding;
>>> }
>>>
>>> - if (cbs_h2645_unit_requires_zero_byte(ctx->codec->codec_id,
>> unit->type, i)) {
>>> + if (cbs_h2645_unit_requires_zero_byte(ctx->codec->codec_id,
>> unit->type, i, h266_au_start)) {
>>> // zero_byte
>>> data[dp++] = 0;
>>> }
>>> @@ -1470,6 +1863,37 @@ static void cbs_h265_close(CodedBitstreamContext
>> *ctx)
>>> av_buffer_unref(&h265->pps_ref[i]);
>>> }
>>>
>>> ...
>>> +static int FUNC(ph)(CodedBitstreamContext *ctx, RWContext *rw,
>> H266RawPH *current)
>>> +{
>>> + int err;
>>> +
>>> + HEADER("Picture Header");
>>> +
>>> + CHECK(FUNC(nal_unit_header)(ctx, rw, ¤t->nal_unit_header,
>> VVC_PH_NUT));
>>> + CHECK(FUNC(picture_header)(ctx, rw, current));
>>> + CHECK(FUNC(rbsp_trailing_bits)(ctx, rw));
>>> + return 0;
>>> +}
>>> +
>>> +static int FUNC(slice_header)(CodedBitstreamContext *ctx, RWContext *rw,
>>> + H266RawSliceHeader *current)
>>> +{
>>> + CodedBitstreamH266Context *h266 = ctx->priv_data;
>>> + const H266RawSPS *sps;
>>> + const H266RawPPS *pps;
>>> + const H266RawPH *ph;
>>> + const H266RefPicLists *ref_pic_lists;
>>> + int err, i;
>>> + uint8_t nal_unit_type, qp_bd_offset;
>>> + uint16_t curr_subpic_idx;
>>> + uint16_t num_slices_in_subpic;
>>> +
>>> + HEADER("Slice Header");
>>> +
>>> + CHECK(FUNC(nal_unit_header)(ctx, rw, ¤t->nal_unit_header,
>> -1));
>>> +
>>> + flag(sh_picture_header_in_slice_header_flag);
>>> + if (current->sh_picture_header_in_slice_header_flag){
>>> + CHECK(FUNC(picture_header)(ctx, rw,
>> ¤t->sh_picture_header));
>>> + if (!h266->ph_ref) {
>>> + h266->ph_ref = av_buffer_allocz(sizeof(H266RawPH));
>>> + if (!h266->ph_ref)
>>> + return AVERROR(ENOMEM);
>>> + }
>>> + h266->ph = (H266RawPH*)h266->ph_ref->data;
>>> + memcpy(h266->ph, ¤t->sh_picture_header,
>> sizeof(H266RawPH));
>>
>> This memcpy() is naughty - if that was a ref to a previous picture header
>> then you've just overwritten the reference while the other unit still holds
>> it.
>>
>> If the slice is unit->content then unit->content_ref is a reference to a
>> superstructure of the picture header, avoiding the copy, but I'm not sure
>> it's the right way to do it.
>>
>> Is it right that the two possible options here are:
>> * Store a ref, which might be to a newly-allocated buffer or might be to a
>> previous unit.
>>
> * Store the whole structure in the private context.
>>
> ?
>>
>> If the structure isn't too large, then the second option feels much
>> simpler.
>>
>> If it is large such that holding references and avoiding copies is
>> beneficial (which is quite large, every ref entails a malloc/free
>> call-pair), then we probably want that ref to always be to the unit?
>>
> If we store the whole structure, it's hard to detect the picture header
> missing case.
I don't see how it's different? You test a pointer here, you would have a flag if the structure is stored.
>>> + }
>>> +
>>> + ph = h266->ph;
>>> + if (!ph) {
>>> + av_log(ctx->log_ctx, AV_LOG_ERROR, "Picture header not
>> available.\n");
>>> + return AVERROR_INVALIDDATA;
>>> + }
>>> ...
- Mark
More information about the ffmpeg-devel
mailing list