[FFmpeg-devel] [PATCH 02/10] hw_base_encode: move VAAPI SPS/PPS constructors to a shared file
Lynne
dev at lynne.ee
Fri Sep 6 16:29:07 EEST 2024
On 06/09/2024 11:48, Tong Wu wrote:
> Lynne:
>> To: Tong Wu <wutong1208 at outlook.com>; FFmpeg development discussions and
>> patches <ffmpeg-devel at ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH 02/10] hw_base_encode: move VAAPI
>> SPS/PPS constructors to a shared file
>>
>> On 04/09/2024 16:09, Tong Wu wrote:
>>> Lynne:
>>>> Subject: [FFmpeg-devel] [PATCH 02/10] hw_base_encode: move VAAPI
>>>> SPS/PPS constructors to a shared file
>>>>
>>>> ---
>>>> libavcodec/Makefile | 2 +-
>>>> libavcodec/hw_base_encode_h264.c | 265 +++++++++++++++++++++++++++++++
>>>> libavcodec/hw_base_encode_h264.h |  53 +++++++
>>>> libavcodec/vaapi_encode_h264.c   | 262 +++---------------------------
>>>> 4 files changed, 341 insertions(+), 241 deletions(-)
>>>> create mode 100644 libavcodec/hw_base_encode_h264.c
>>>> create mode 100644 libavcodec/hw_base_encode_h264.h
>>>>
>>>> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>>>> index 3b4b8681f5..2e53dd723a 100644
>>>> --- a/libavcodec/Makefile
>>>> +++ b/libavcodec/Makefile
>>>> @@ -166,7 +166,7 @@ OBJS-$(CONFIG_STARTCODE) += startcode.o
>>>> OBJS-$(CONFIG_TEXTUREDSP) += texturedsp.o
>>>> OBJS-$(CONFIG_TEXTUREDSPENC) += texturedspenc.o
>>>> OBJS-$(CONFIG_TPELDSP) += tpeldsp.o
>>>> -OBJS-$(CONFIG_VAAPI_ENCODE) += vaapi_encode.o hw_base_encode.o
>>>> +OBJS-$(CONFIG_VAAPI_ENCODE) += vaapi_encode.o hw_base_encode.o hw_base_encode_h264.o
>>>> OBJS-$(CONFIG_AV1_AMF_ENCODER) += amfenc_av1.o
>>>> OBJS-$(CONFIG_VC1DSP) += vc1dsp.o
>>>> OBJS-$(CONFIG_VIDEODSP) += videodsp.o
>>>> diff --git a/libavcodec/hw_base_encode_h264.c b/libavcodec/hw_base_encode_h264.c
>>>> new file mode 100644
>>>> index 0000000000..5c3957cddb
>>>> --- /dev/null
>>>> +++ b/libavcodec/hw_base_encode_h264.c
>>>> @@ -0,0 +1,265 @@
>>>> +/*
>>>> + * This file is part of FFmpeg.
>>>> + *
>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation; either
>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>> + *
>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>>> + */
>>>> +
>>>> +#include "hw_base_encode_h264.h"
>>>> +
>>>> +#include "h2645data.h"
>>>> +#include "h264_levels.h"
>>>> +
>>>> +#include "libavutil/pixdesc.h"
>>>> +
>>>> +int ff_hw_base_encode_init_params_h264(FFHWBaseEncodeContext *base_ctx,
>>>> + AVCodecContext *avctx,
>>>> + FFHWBaseEncodeH264 *common,
>>>> + FFHWBaseEncodeH264Opts *opts)
>>>> +{
>>>> + H264RawSPS *sps = &common->raw_sps;
>>>> + H264RawPPS *pps = &common->raw_pps;
>>>> + const AVPixFmtDescriptor *desc;
>>>> + int bit_depth;
>>>> +
>>>> + memset(sps, 0, sizeof(*sps));
>>>> + memset(pps, 0, sizeof(*pps));
>>>> +
>>>> + desc = av_pix_fmt_desc_get(base_ctx->input_frames->sw_format);
>>>> + av_assert0(desc);
>>>> + if (desc->nb_components == 1 || desc->log2_chroma_w != 1 ||
>>>> + desc->log2_chroma_h != 1) {
>>>> + av_log(avctx, AV_LOG_ERROR, "Chroma format of input pixel format "
>>>> + "%s is not supported.\n", desc->name);
>>>> + return AVERROR(EINVAL);
>>>> + }
>>>> + bit_depth = desc->comp[0].depth;
>>>> +
>>>> + sps->nal_unit_header.nal_ref_idc = 3;
>>>> + sps->nal_unit_header.nal_unit_type = H264_NAL_SPS;
>>>> +
>>>> + sps->profile_idc = avctx->profile & 0xff;
>>>> +
>>>> + if (avctx->profile == AV_PROFILE_H264_CONSTRAINED_BASELINE ||
>>>> + avctx->profile == AV_PROFILE_H264_MAIN)
>>>> + sps->constraint_set1_flag = 1;
>>>> +
>>>> + if (avctx->profile == AV_PROFILE_H264_HIGH ||
>>>> + avctx->profile == AV_PROFILE_H264_HIGH_10)
>>>> + sps->constraint_set3_flag = base_ctx->gop_size == 1;
>>>> +
>>>> + if (avctx->profile == AV_PROFILE_H264_MAIN ||
>>>> + avctx->profile == AV_PROFILE_H264_HIGH ||
>>>> + avctx->profile == AV_PROFILE_H264_HIGH_10) {
>>>> + sps->constraint_set4_flag = 1;
>>>> + sps->constraint_set5_flag = base_ctx->b_per_p == 0;
>>>> + }
>>>> +
>>>> + if (base_ctx->gop_size == 1)
>>>> + common->dpb_frames = 0;
>>>> + else
>>>> + common->dpb_frames = 1 + base_ctx->max_b_depth;
>>>> +
>>>> + if (avctx->level != AV_LEVEL_UNKNOWN) {
>>>> + sps->level_idc = avctx->level;
>>>> + } else {
>>>> + const H264LevelDescriptor *level;
>>>> + int framerate;
>>>> +
>>>> + if (avctx->framerate.num > 0 && avctx->framerate.den > 0)
>>>> + framerate = avctx->framerate.num / avctx->framerate.den;
>>>> + else
>>>> + framerate = 0;
>>>> +
>>>> + level = ff_h264_guess_level(sps->profile_idc,
>>>> + opts->bit_rate,
>>>> + framerate,
>>>> + common->mb_width * 16,
>>>> + common->mb_height * 16,
>>>> + common->dpb_frames);
>>>> + if (level) {
>>>> + av_log(avctx, AV_LOG_VERBOSE, "Using level %s.\n", level->name);
>>>> + if (level->constraint_set3_flag)
>>>> + sps->constraint_set3_flag = 1;
>>>> + sps->level_idc = level->level_idc;
>>>> + } else {
>>>> + av_log(avctx, AV_LOG_WARNING, "Stream will not conform "
>>>> + "to any level: using level 6.2.\n");
>>>> + sps->level_idc = 62;
>>>> + }
>>>> + }
>>>> +
>>>> + sps->seq_parameter_set_id = 0;
>>>> + sps->chroma_format_idc = 1;
>>>> + sps->bit_depth_luma_minus8 = bit_depth - 8;
>>>> + sps->bit_depth_chroma_minus8 = bit_depth - 8;
>>>> +
>>>> + sps->log2_max_frame_num_minus4 = 4;
>>>> + sps->pic_order_cnt_type = base_ctx->max_b_depth ? 0 : 2;
>>>> + if (sps->pic_order_cnt_type == 0) {
>>>> + sps->log2_max_pic_order_cnt_lsb_minus4 = 4;
>>>> + }
>>>> +
>>>> + sps->max_num_ref_frames = common->dpb_frames;
>>>> +
>>>> + sps->pic_width_in_mbs_minus1 = common->mb_width - 1;
>>>> + sps->pic_height_in_map_units_minus1 = common->mb_height - 1;
>>>> +
>>>> + sps->frame_mbs_only_flag = 1;
>>>> + sps->direct_8x8_inference_flag = 1;
>>>> +
>>>> + if (avctx->width != 16 * common->mb_width ||
>>>> + avctx->height != 16 * common->mb_height) {
>>>> + sps->frame_cropping_flag = 1;
>>>> +
>>>> + sps->frame_crop_left_offset = 0;
>>>> + sps->frame_crop_right_offset =
>>>> + (16 * common->mb_width - avctx->width) / 2;
>>>> + sps->frame_crop_top_offset = 0;
>>>> + sps->frame_crop_bottom_offset =
>>>> + (16 * common->mb_height - avctx->height) / 2;
>>>> + } else {
>>>> + sps->frame_cropping_flag = 0;
>>>> + }
>>>> +
>>>> + sps->vui_parameters_present_flag = 1;
>>>> +
>>>> + if (avctx->sample_aspect_ratio.num != 0 &&
>>>> + avctx->sample_aspect_ratio.den != 0) {
>>>> + int num, den, i;
>>>> + av_reduce(&num, &den, avctx->sample_aspect_ratio.num,
>>>> + avctx->sample_aspect_ratio.den, 65535);
>>>> + for (i = 0; i < FF_ARRAY_ELEMS(ff_h2645_pixel_aspect); i++) {
>>>> + if (num == ff_h2645_pixel_aspect[i].num &&
>>>> + den == ff_h2645_pixel_aspect[i].den) {
>>>> + sps->vui.aspect_ratio_idc = i;
>>>> + break;
>>>> + }
>>>> + }
>>>> + if (i >= FF_ARRAY_ELEMS(ff_h2645_pixel_aspect)) {
>>>> + sps->vui.aspect_ratio_idc = 255;
>>>> + sps->vui.sar_width = num;
>>>> + sps->vui.sar_height = den;
>>>> + }
>>>> + sps->vui.aspect_ratio_info_present_flag = 1;
>>>> + }
>>>> +
>>>> + // Unspecified video format, from table E-2.
>>>> + sps->vui.video_format = 5;
>>>> + sps->vui.video_full_range_flag =
>>>> + avctx->color_range == AVCOL_RANGE_JPEG;
>>>> + sps->vui.colour_primaries = avctx->color_primaries;
>>>> + sps->vui.transfer_characteristics = avctx->color_trc;
>>>> + sps->vui.matrix_coefficients = avctx->colorspace;
>>>> + if (avctx->color_primaries != AVCOL_PRI_UNSPECIFIED ||
>>>> + avctx->color_trc != AVCOL_TRC_UNSPECIFIED ||
>>>> + avctx->colorspace != AVCOL_SPC_UNSPECIFIED)
>>>> + sps->vui.colour_description_present_flag = 1;
>>>> + if (avctx->color_range != AVCOL_RANGE_UNSPECIFIED ||
>>>> + sps->vui.colour_description_present_flag)
>>>> + sps->vui.video_signal_type_present_flag = 1;
>>>> +
>>>> + if (avctx->chroma_sample_location != AVCHROMA_LOC_UNSPECIFIED) {
>>>> + sps->vui.chroma_loc_info_present_flag = 1;
>>>> + sps->vui.chroma_sample_loc_type_top_field =
>>>> + sps->vui.chroma_sample_loc_type_bottom_field =
>>>> + avctx->chroma_sample_location - 1;
>>>> + }
>>>> +
>>>> + sps->vui.timing_info_present_flag = 1;
>>>> + if (avctx->framerate.num > 0 && avctx->framerate.den > 0) {
>>>> + sps->vui.num_units_in_tick = avctx->framerate.den;
>>>> + sps->vui.time_scale = 2 * avctx->framerate.num;
>>>> + sps->vui.fixed_frame_rate_flag = 1;
>>>> + } else {
>>>> + sps->vui.num_units_in_tick = avctx->time_base.num;
>>>> + sps->vui.time_scale = 2 * avctx->time_base.den;
>>>> + sps->vui.fixed_frame_rate_flag = 0;
>>>> + }
>>>> +
>>>> + if (opts->flags & FF_HW_H264_SEI_TIMING) {
>>>> + H264RawHRD *hrd = &sps->vui.nal_hrd_parameters;
>>>> + H264RawSEIBufferingPeriod *bp =
>>>> + &common->sei_buffering_period;
>>>> +
>>>> + sps->vui.nal_hrd_parameters_present_flag = 1;
>>>> +
>>>> + hrd->cpb_cnt_minus1 = 0;
>>>> +
>>>> + // Try to scale these to a sensible range so that the
>>>> + // golomb encode of the value is not overlong.
>>>> + hrd->bit_rate_scale =
>>>> + av_clip_uintp2(av_log2(opts->bit_rate) - 15 - 6, 4);
>>>> + hrd->bit_rate_value_minus1[0] =
>>>> + (opts->bit_rate >> hrd->bit_rate_scale + 6) - 1;
>>>> +
>>>> + hrd->cpb_size_scale =
>>>> + av_clip_uintp2(av_log2(opts->hrd_buffer_size) - 15 - 4, 4);
>>>> + hrd->cpb_size_value_minus1[0] =
>>>> + (opts->hrd_buffer_size >> hrd->cpb_size_scale + 4) - 1;
>>>> +
>>>> + // CBR mode as defined for the HRD cannot be achieved without filler
>>>> + // data, so this flag cannot be set even with VAAPI CBR modes.
>>>> + hrd->cbr_flag[0] = 0;
>>>> +
>>>> + hrd->initial_cpb_removal_delay_length_minus1 = 23;
>>>> + hrd->cpb_removal_delay_length_minus1 = 23;
>>>> + hrd->dpb_output_delay_length_minus1 = 7;
>>>> + hrd->time_offset_length = 0;
>>>> +
>>>> + bp->seq_parameter_set_id = sps->seq_parameter_set_id;
>>>> +
>>>> + // This calculation can easily overflow 32 bits.
>>>> + bp->nal.initial_cpb_removal_delay[0] = 90000 *
>>>> + (uint64_t)opts->initial_buffer_fullness /
>>>> + opts->hrd_buffer_size;
>>>> + bp->nal.initial_cpb_removal_delay_offset[0] = 0;
>>>> + } else {
>>>> + sps->vui.nal_hrd_parameters_present_flag = 0;
>>>> + sps->vui.low_delay_hrd_flag = 1 - sps->vui.fixed_frame_rate_flag;
>>>> + }
>>>> +
>>>> + sps->vui.bitstream_restriction_flag = 1;
>>>> + sps->vui.motion_vectors_over_pic_boundaries_flag = 1;
>>>> + sps->vui.log2_max_mv_length_horizontal = 15;
>>>> + sps->vui.log2_max_mv_length_vertical = 15;
>>>> + sps->vui.max_num_reorder_frames = base_ctx->max_b_depth;
>>>> + sps->vui.max_dec_frame_buffering = base_ctx->max_b_depth + 1;
>>>> +
>>>> + pps->nal_unit_header.nal_ref_idc = 3;
>>>> + pps->nal_unit_header.nal_unit_type = H264_NAL_PPS;
>>>> +
>>>> + pps->pic_parameter_set_id = 0;
>>>> + pps->seq_parameter_set_id = 0;
>>>> +
>>>> + pps->entropy_coding_mode_flag =
>>>> + !(sps->profile_idc == AV_PROFILE_H264_BASELINE ||
>>>> + sps->profile_idc == AV_PROFILE_H264_EXTENDED ||
>>>> + sps->profile_idc == AV_PROFILE_H264_CAVLC_444);
>>>> + if (!opts->cabac && pps->entropy_coding_mode_flag)
>>>> + pps->entropy_coding_mode_flag = 0;
>>>> +
>>>> + pps->num_ref_idx_l0_default_active_minus1 = 0;
>>>> + pps->num_ref_idx_l1_default_active_minus1 = 0;
>>>> +
>>>> + pps->pic_init_qp_minus26 = opts->fixed_qp_idr - 26;
>>>> +
>>>> + if (sps->profile_idc == AV_PROFILE_H264_BASELINE ||
>>>> + sps->profile_idc == AV_PROFILE_H264_EXTENDED ||
>>>> + sps->profile_idc == AV_PROFILE_H264_MAIN) {
>>>> + pps->more_rbsp_data = 0;
>>>> + } else {
>>>> + pps->more_rbsp_data = 1;
>>>> +
>>>> + pps->transform_8x8_mode_flag = 1;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> diff --git a/libavcodec/hw_base_encode_h264.h b/libavcodec/hw_base_encode_h264.h
>>>> new file mode 100644
>>>> index 0000000000..d1bae8f36f
>>>> --- /dev/null
>>>> +++ b/libavcodec/hw_base_encode_h264.h
>>>> @@ -0,0 +1,53 @@
>>>> +/*
>>>> + * This file is part of FFmpeg.
>>>> + *
>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation; either
>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>> + *
>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>>> + */
>>>> +
>>>> +#ifndef AVCODEC_HW_BASE_ENCODE_H264_H
>>>> +#define AVCODEC_HW_BASE_ENCODE_H264_H
>>>> +
>>>> +#include "hw_base_encode.h"
>>>> +#include "cbs_h264.h"
>>>> +
>>>> +typedef struct FFHWBaseEncodeH264 {
>>>> + H264RawSPS raw_sps;
>>>> + H264RawPPS raw_pps;
>>>> + H264RawAUD raw_aud;
>>>> +
>>>> + H264RawSEIBufferingPeriod sei_buffering_period;
>>>> +
>>>> + int dpb_frames;
>>>> + int mb_width;
>>>> + int mb_height;
>>>> +} FFHWBaseEncodeH264;
>>>> +
>>>> +typedef struct FFHWBaseEncodeH264Opts {
>>>> + int flags;
>>>> +#define FF_HW_H264_SEI_TIMING (1 << 0)
>>>> +
>>>> + int64_t bit_rate;
>>>> + int cabac;
>>>> + int fixed_qp_idr;
>>>> + uint64_t hrd_buffer_size;
>>>> + uint64_t initial_buffer_fullness;
>>>> +} FFHWBaseEncodeH264Opts;
>>>> +
>>>> +int ff_hw_base_encode_init_params_h264(FFHWBaseEncodeContext *base_ctx,
>>>> + AVCodecContext *avctx,
>>>> + FFHWBaseEncodeH264 *common,
>>>> + FFHWBaseEncodeH264Opts *opts);
>>>> +
>>>> +#endif /* AVCODEC_HW_BASE_ENCODE_H264_H */
>>>> diff --git a/libavcodec/vaapi_encode_h264.c b/libavcodec/vaapi_encode_h264.c
>>>> index 35d3967766..443a569104 100644
>>>> --- a/libavcodec/vaapi_encode_h264.c
>>>> +++ b/libavcodec/vaapi_encode_h264.c
>>>> @@ -33,6 +33,7 @@
>>>> #include "cbs_h264.h"
>>>> #include "codec_internal.h"
>>>> #include "h264.h"
>>>> +#include "hw_base_encode_h264.h"
>>>> #include "h264_levels.h"
>>>> #include "h2645data.h"
>>>> #include "vaapi_encode.h"
>>>> @@ -67,6 +68,7 @@ typedef struct VAAPIEncodeH264Picture {
>>>>
>>>> typedef struct VAAPIEncodeH264Context {
>>>> VAAPIEncodeContext common;
>>>> + FFHWBaseEncodeH264 units;
>>>>
>>>> // User options.
>>>> int qp;
>>>> @@ -85,15 +87,11 @@ typedef struct VAAPIEncodeH264Context {
>>>> int fixed_qp_p;
>>>> int fixed_qp_b;
>>>>
>>>> - int dpb_frames;
>>>> -
>>>> // Writer structures.
>>>> CodedBitstreamContext *cbc;
>>>> CodedBitstreamFragment current_access_unit;
>>>>
>>>> H264RawAUD raw_aud;
>>>> - H264RawSPS raw_sps;
>>>> - H264RawPPS raw_pps;
>>>> H264RawSlice raw_slice;
>>>>
>>>> H264RawSEIBufferingPeriod sei_buffering_period;
>>>> @@ -168,11 +166,11 @@ static int vaapi_encode_h264_write_sequence_header(AVCodecContext *avctx,
>>>> priv->aud_needed = 0;
>>>> }
>>>>
>>>> - err = vaapi_encode_h264_add_nal(avctx, au, &priv->raw_sps);
>>>> + err = vaapi_encode_h264_add_nal(avctx, au, &priv->units.raw_sps);
>>>> if (err < 0)
>>>> goto fail;
>>>>
>>>> - err = vaapi_encode_h264_add_nal(avctx, au, &priv->raw_pps);
>>>> + err = vaapi_encode_h264_add_nal(avctx, au, &priv->units.raw_pps);
>>>> if (err < 0)
>>>> goto fail;
>>>>
>>>> @@ -298,240 +296,24 @@ static int vaapi_encode_h264_init_sequence_params(AVCodecContext *avctx)
>>>> FFHWBaseEncodeContext *base_ctx = avctx->priv_data;
>>>> VAAPIEncodeContext *ctx = avctx->priv_data;
>>>> VAAPIEncodeH264Context *priv = avctx->priv_data;
>>>> - H264RawSPS *sps = &priv->raw_sps;
>>>> - H264RawPPS *pps = &priv->raw_pps;
>>>> + H264RawSPS *sps = &priv->units.raw_sps;
>>>> + H264RawPPS *pps = &priv->units.raw_pps;
>>>> VAEncSequenceParameterBufferH264 *vseq = ctx->codec_sequence_params;
>>>> VAEncPictureParameterBufferH264 *vpic = ctx->codec_picture_params;
>>>> - const AVPixFmtDescriptor *desc;
>>>> - int bit_depth;
>>>> -
>>>> - memset(sps, 0, sizeof(*sps));
>>>> - memset(pps, 0, sizeof(*pps));
>>>> -
>>>> - desc = av_pix_fmt_desc_get(base_ctx->input_frames->sw_format);
>>>> - av_assert0(desc);
>>>> - if (desc->nb_components == 1 || desc->log2_chroma_w != 1 ||
>>>> - desc->log2_chroma_h != 1) {
>>>> - av_log(avctx, AV_LOG_ERROR, "Chroma format of input pixel format "
>>>> - "%s is not supported.\n", desc->name);
>>>> - return AVERROR(EINVAL);
>>>> - }
>>>> - bit_depth = desc->comp[0].depth;
>>>> -
>>>> - sps->nal_unit_header.nal_ref_idc = 3;
>>>> - sps->nal_unit_header.nal_unit_type = H264_NAL_SPS;
>>>> -
>>>> - sps->profile_idc = avctx->profile & 0xff;
>>>> -
>>>> - if (avctx->profile == AV_PROFILE_H264_CONSTRAINED_BASELINE ||
>>>> - avctx->profile == AV_PROFILE_H264_MAIN)
>>>> - sps->constraint_set1_flag = 1;
>>>> -
>>>> - if (avctx->profile == AV_PROFILE_H264_HIGH ||
>>>> - avctx->profile == AV_PROFILE_H264_HIGH_10)
>>>> - sps->constraint_set3_flag = base_ctx->gop_size == 1;
>>>> -
>>>> - if (avctx->profile == AV_PROFILE_H264_MAIN ||
>>>> - avctx->profile == AV_PROFILE_H264_HIGH ||
>>>> - avctx->profile == AV_PROFILE_H264_HIGH_10) {
>>>> - sps->constraint_set4_flag = 1;
>>>> - sps->constraint_set5_flag = base_ctx->b_per_p == 0;
>>>> - }
>>>> -
>>>> - if (base_ctx->gop_size == 1)
>>>> - priv->dpb_frames = 0;
>>>> - else
>>>> - priv->dpb_frames = 1 + base_ctx->max_b_depth;
>>>> -
>>>> - if (avctx->level != AV_LEVEL_UNKNOWN) {
>>>> - sps->level_idc = avctx->level;
>>>> - } else {
>>>> - const H264LevelDescriptor *level;
>>>> - int framerate;
>>>> -
>>>> - if (avctx->framerate.num > 0 && avctx->framerate.den > 0)
>>>> - framerate = avctx->framerate.num / avctx->framerate.den;
>>>> - else
>>>> - framerate = 0;
>>>> -
>>>> - level = ff_h264_guess_level(sps->profile_idc,
>>>> - avctx->bit_rate,
>>>> - framerate,
>>>> - priv->mb_width * 16,
>>>> - priv->mb_height * 16,
>>>> - priv->dpb_frames);
>>>> - if (level) {
>>>> - av_log(avctx, AV_LOG_VERBOSE, "Using level %s.\n", level->name);
>>>> - if (level->constraint_set3_flag)
>>>> - sps->constraint_set3_flag = 1;
>>>> - sps->level_idc = level->level_idc;
>>>> - } else {
>>>> - av_log(avctx, AV_LOG_WARNING, "Stream will not conform "
>>>> - "to any level: using level 6.2.\n");
>>>> - sps->level_idc = 62;
>>>> - }
>>>> - }
>>>> -
>>>> - sps->seq_parameter_set_id = 0;
>>>> - sps->chroma_format_idc = 1;
>>>> - sps->bit_depth_luma_minus8 = bit_depth - 8;
>>>> - sps->bit_depth_chroma_minus8 = bit_depth - 8;
>>>> -
>>>> - sps->log2_max_frame_num_minus4 = 4;
>>>> - sps->pic_order_cnt_type = base_ctx->max_b_depth ? 0 : 2;
>>>> - if (sps->pic_order_cnt_type == 0) {
>>>> - sps->log2_max_pic_order_cnt_lsb_minus4 = 4;
>>>> - }
>>>> -
>>>> - sps->max_num_ref_frames = priv->dpb_frames;
>>>> -
>>>> - sps->pic_width_in_mbs_minus1 = priv->mb_width - 1;
>>>> - sps->pic_height_in_map_units_minus1 = priv->mb_height - 1;
>>>> -
>>>> - sps->frame_mbs_only_flag = 1;
>>>> - sps->direct_8x8_inference_flag = 1;
>>>> -
>>>> - if (avctx->width != 16 * priv->mb_width ||
>>>> - avctx->height != 16 * priv->mb_height) {
>>>> - sps->frame_cropping_flag = 1;
>>>> -
>>>> - sps->frame_crop_left_offset = 0;
>>>> - sps->frame_crop_right_offset =
>>>> - (16 * priv->mb_width - avctx->width) / 2;
>>>> - sps->frame_crop_top_offset = 0;
>>>> - sps->frame_crop_bottom_offset =
>>>> - (16 * priv->mb_height - avctx->height) / 2;
>>>> - } else {
>>>> - sps->frame_cropping_flag = 0;
>>>> - }
>>>> -
>>>> - sps->vui_parameters_present_flag = 1;
>>>> -
>>>> - if (avctx->sample_aspect_ratio.num != 0 &&
>>>> - avctx->sample_aspect_ratio.den != 0) {
>>>> - int num, den, i;
>>>> - av_reduce(&num, &den, avctx->sample_aspect_ratio.num,
>>>> - avctx->sample_aspect_ratio.den, 65535);
>>>> - for (i = 0; i < FF_ARRAY_ELEMS(ff_h2645_pixel_aspect); i++) {
>>>> - if (num == ff_h2645_pixel_aspect[i].num &&
>>>> - den == ff_h2645_pixel_aspect[i].den) {
>>>> - sps->vui.aspect_ratio_idc = i;
>>>> - break;
>>>> - }
>>>> - }
>>>> - if (i >= FF_ARRAY_ELEMS(ff_h2645_pixel_aspect)) {
>>>> - sps->vui.aspect_ratio_idc = 255;
>>>> - sps->vui.sar_width = num;
>>>> - sps->vui.sar_height = den;
>>>> - }
>>>> - sps->vui.aspect_ratio_info_present_flag = 1;
>>>> - }
>>>> -
>>>> - // Unspecified video format, from table E-2.
>>>> - sps->vui.video_format = 5;
>>>> - sps->vui.video_full_range_flag =
>>>> - avctx->color_range == AVCOL_RANGE_JPEG;
>>>> - sps->vui.colour_primaries = avctx->color_primaries;
>>>> - sps->vui.transfer_characteristics = avctx->color_trc;
>>>> - sps->vui.matrix_coefficients = avctx->colorspace;
>>>> - if (avctx->color_primaries != AVCOL_PRI_UNSPECIFIED ||
>>>> - avctx->color_trc != AVCOL_TRC_UNSPECIFIED ||
>>>> - avctx->colorspace != AVCOL_SPC_UNSPECIFIED)
>>>> - sps->vui.colour_description_present_flag = 1;
>>>> - if (avctx->color_range != AVCOL_RANGE_UNSPECIFIED ||
>>>> - sps->vui.colour_description_present_flag)
>>>> - sps->vui.video_signal_type_present_flag = 1;
>>>> -
>>>> - if (avctx->chroma_sample_location != AVCHROMA_LOC_UNSPECIFIED) {
>>>> - sps->vui.chroma_loc_info_present_flag = 1;
>>>> - sps->vui.chroma_sample_loc_type_top_field =
>>>> - sps->vui.chroma_sample_loc_type_bottom_field =
>>>> - avctx->chroma_sample_location - 1;
>>>> - }
>>>> -
>>>> - sps->vui.timing_info_present_flag = 1;
>>>> - if (avctx->framerate.num > 0 && avctx->framerate.den > 0) {
>>>> - sps->vui.num_units_in_tick = avctx->framerate.den;
>>>> - sps->vui.time_scale = 2 * avctx->framerate.num;
>>>> - sps->vui.fixed_frame_rate_flag = 1;
>>>> - } else {
>>>> - sps->vui.num_units_in_tick = avctx->time_base.num;
>>>> - sps->vui.time_scale = 2 * avctx->time_base.den;
>>>> - sps->vui.fixed_frame_rate_flag = 0;
>>>> - }
>>>> -
>>>> - if (priv->sei & SEI_TIMING) {
>>>> - H264RawHRD *hrd = &sps->vui.nal_hrd_parameters;
>>>> - H264RawSEIBufferingPeriod *bp = &priv->sei_buffering_period;
>>>> -
>>>> - sps->vui.nal_hrd_parameters_present_flag = 1;
>>>> -
>>>> - hrd->cpb_cnt_minus1 = 0;
>>>> -
>>>> - // Try to scale these to a sensible range so that the
>>>> - // golomb encode of the value is not overlong.
>>>> - hrd->bit_rate_scale =
>>>> - av_clip_uintp2(av_log2(ctx->va_bit_rate) - 15 - 6, 4);
>>>> - hrd->bit_rate_value_minus1[0] =
>>>> - (ctx->va_bit_rate >> hrd->bit_rate_scale + 6) - 1;
>>>> -
>>>> - hrd->cpb_size_scale =
>>>> - av_clip_uintp2(av_log2(ctx->hrd_params.buffer_size) - 15 - 4, 4);
>>>> - hrd->cpb_size_value_minus1[0] =
>>>> - (ctx->hrd_params.buffer_size >> hrd->cpb_size_scale + 4) - 1;
>>>> -
>>>> - // CBR mode as defined for the HRD cannot be achieved without filler
>>>> - // data, so this flag cannot be set even with VAAPI CBR modes.
>>>> - hrd->cbr_flag[0] = 0;
>>>>
>>>> - hrd->initial_cpb_removal_delay_length_minus1 = 23;
>>>> - hrd->cpb_removal_delay_length_minus1 = 23;
>>>> - hrd->dpb_output_delay_length_minus1 = 7;
>>>> - hrd->time_offset_length = 0;
>>>> -
>>>> - bp->seq_parameter_set_id = sps->seq_parameter_set_id;
>>>> -
>>>> - // This calculation can easily overflow 32 bits.
>>>> - bp->nal.initial_cpb_removal_delay[0] = 90000 *
>>>> - (uint64_t)ctx->hrd_params.initial_buffer_fullness /
>>>> - ctx->hrd_params.buffer_size;
>>>> - bp->nal.initial_cpb_removal_delay_offset[0] = 0;
>>>> - } else {
>>>> - sps->vui.nal_hrd_parameters_present_flag = 0;
>>>> - sps->vui.low_delay_hrd_flag = 1 - sps->vui.fixed_frame_rate_flag;
>>>> - }
>>>> -
>>>> - sps->vui.bitstream_restriction_flag = 1;
>>>> - sps->vui.motion_vectors_over_pic_boundaries_flag = 1;
>>>> - sps->vui.log2_max_mv_length_horizontal = 15;
>>>> - sps->vui.log2_max_mv_length_vertical = 15;
>>>> - sps->vui.max_num_reorder_frames = base_ctx->max_b_depth;
>>>> - sps->vui.max_dec_frame_buffering = base_ctx->max_b_depth + 1;
>>>> -
>>>> - pps->nal_unit_header.nal_ref_idc = 3;
>>>> - pps->nal_unit_header.nal_unit_type = H264_NAL_PPS;
>>>> -
>>>> - pps->pic_parameter_set_id = 0;
>>>> - pps->seq_parameter_set_id = 0;
>>>> -
>>>> - pps->entropy_coding_mode_flag =
>>>> - !(sps->profile_idc == AV_PROFILE_H264_BASELINE ||
>>>> - sps->profile_idc == AV_PROFILE_H264_EXTENDED ||
>>>> - sps->profile_idc == AV_PROFILE_H264_CAVLC_444);
>>>> - if (!priv->coder && pps->entropy_coding_mode_flag)
>>>> - pps->entropy_coding_mode_flag = 0;
>>>> -
>>>> - pps->num_ref_idx_l0_default_active_minus1 = 0;
>>>> - pps->num_ref_idx_l1_default_active_minus1 = 0;
>>>> -
>>>> - pps->pic_init_qp_minus26 = priv->fixed_qp_idr - 26;
>>>> -
>>>> - if (sps->profile_idc == AV_PROFILE_H264_BASELINE ||
>>>> - sps->profile_idc == AV_PROFILE_H264_EXTENDED ||
>>>> - sps->profile_idc == AV_PROFILE_H264_MAIN) {
>>>> - pps->more_rbsp_data = 0;
>>>> - } else {
>>>> - pps->more_rbsp_data = 1;
>>>> + FFHWBaseEncodeH264Opts unit_opts = {
>>>> + .flags = (priv->sei & SEI_TIMING) ? FF_HW_H264_SEI_TIMING : 0,
>>>> + .cabac = priv->coder,
>>>> + .hrd_buffer_size = ctx->hrd_params.buffer_size,
>>>> + .fixed_qp_idr = priv->fixed_qp_idr,
>>>> + .initial_buffer_fullness = ctx->hrd_params.initial_buffer_fullness,
>>>> + .bit_rate = ctx->va_bit_rate,
>>>> + };
>>>>
>>>> - pps->transform_8x8_mode_flag = 1;
>>>> - }
>>>> + int err = ff_hw_base_encode_init_params_h264(base_ctx, avctx,
>>>> + &priv->units, &unit_opts);
>>>> + if (err < 0)
>>>> + return err;
>>>>
>>>> *vseq = (VAEncSequenceParameterBufferH264) {
>>>> .seq_parameter_set_id = sps->seq_parameter_set_id,
>>>> @@ -660,7 +442,7 @@ static int vaapi_encode_h264_init_picture_params(AVCodecContext *avctx,
>>>> }
>>>> }
>>>> hpic->pic_order_cnt = pic->display_order - hpic->last_idr_frame;
>>>> - if (priv->raw_sps.pic_order_cnt_type == 2) {
>>>> + if (priv->units.raw_sps.pic_order_cnt_type == 2) {
>>>> hpic->pic_order_cnt *= 2;
>>>> }
>>>>
>>>> @@ -870,8 +652,8 @@ static int vaapi_encode_h264_init_slice_params(AVCodecContext *avctx,
>>>> VAAPIEncodePicture *vaapi_pic = pic->priv;
>>>> VAAPIEncodeH264Picture *hpic = pic->codec_priv;
>>>> FFHWBaseEncodePicture *prev = pic->prev;
>>>> - H264RawSPS *sps = &priv->raw_sps;
>>>> - H264RawPPS *pps = &priv->raw_pps;
>>>> + H264RawSPS *sps = &priv->units.raw_sps;
>>>> + H264RawPPS *pps = &priv->units.raw_pps;
>>>> H264RawSliceHeader *sh = &priv->raw_slice.header;
>>>> VAEncPictureParameterBufferH264 *vpic = vaapi_pic->codec_picture_params;
>>>> VAEncSliceParameterBufferH264 *vslice = slice->codec_slice_params;
>>>> @@ -923,7 +705,7 @@ static int vaapi_encode_h264_init_slice_params(AVCodecContext *avctx,
>>>> ++keep;
>>>> }
>>>> }
>>>> - av_assert0(keep <= priv->dpb_frames);
>>>> + av_assert0(keep <= priv->units.dpb_frames);
>>>>
>>>> if (discard == 0) {
>>>> sh->adaptive_ref_pic_marking_mode_flag = 0;
>>>> --
>>>
>>> Are you planning to move more codecs to a shared file, such as HEVC? Since some
>>> flags are based on the query result of a hardware-specific API, D3D12 may not align
>>> with the constructors in the shared file.
>>
>> Probably, for HEVC, but later on. Not likely to do it for AV1, as it's much shorter
>> and easier to construct yourself.
>>
>> It's the same story with Vulkan, where some unit features may be overridden. I
>> solve this issue by running *this* common function to construct optimal units,
>> converting them to Vulkan units, letting Vulkan write out its SPS/PPS unit
>> bitstream, *decoding* the output units back to CBS native units, copying the
>> overridden parts that matter (such as CABAC, loopfilter bits and so on) from the
>> Vulkan units, and then using CBS to encode the final units.
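To make that sequence a little more concrete, here is a rough C sketch of the
round trip. vulkan_write_session_params() and cbs_parse_pps() are hypothetical
stand-ins for the Vulkan and CBS plumbing; only
ff_hw_base_encode_init_params_h264() is the function this patch actually adds.

    static int build_final_h264_units(AVCodecContext *avctx,
                                      FFHWBaseEncodeContext *base_ctx,
                                      FFHWBaseEncodeH264 *units,
                                      FFHWBaseEncodeH264Opts *opts)
    {
        uint8_t   *drv_data = NULL;
        size_t     drv_size = 0;
        H264RawPPS drv_pps;
        int err;

        // 1. Construct the "optimal" SPS/PPS with the shared constructor.
        err = ff_hw_base_encode_init_params_h264(base_ctx, avctx, units, opts);
        if (err < 0)
            return err;

        // 2. Translate them into Vulkan session parameters and have the
        //    driver serialize its own SPS/PPS (hypothetical wrapper).
        err = vulkan_write_session_params(avctx, units, &drv_data, &drv_size);
        if (err < 0)
            return err;

        // 3. Decode the driver's output back into CBS structures
        //    (hypothetical wrapper around the CBS read API, cf. cbs.h).
        err = cbs_parse_pps(avctx, drv_data, drv_size, &drv_pps);
        av_free(drv_data);
        if (err < 0)
            return err;

        // 4. Copy back only the fields the driver is allowed to override,
        //    e.g. the CABAC and loop filter related bits.
        units->raw_pps.entropy_coding_mode_flag =
            drv_pps.entropy_coding_mode_flag;
        units->raw_pps.deblocking_filter_control_present_flag =
            drv_pps.deblocking_filter_control_present_flag;

        // 5. The final headers are then written from units->raw_sps and
        //    units->raw_pps with CBS, not taken verbatim from the driver.
        return 0;
    }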
>
> For D3D12 HEVC and H264 (AV1 not included), it is entirely the user's responsibility to generate and write out the SPS/PPS coded units, which do not even go through the driver. The driver only cares about the D3D12 structures, which are filled out by the user. That looks simpler than Vulkan.
>
>>
>> Thankfully, the CBS framework makes this much easier than it sounds, and since all
>> of this only occurs during initialization (you can append other units such as AUD
>> during runtime), it's essentially free.
>>
>> You should really consider this for D3D12. I discovered issues in every single Vulkan
>> driver when writing out units, so you shouldn't trust them either; write your
>> own.
>>
>> Also, Vulkan requires you to use filler units to pad the unit bitstream to a specific
>> alignment, which you cannot do when using the units written natively by the Vulkan
>> driver, as there's no syntax for fillers. This requires you to copy the written slices,
>> which is, IMO, a hundred times worse than just writing out your own units.
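For reference, an Annex-B H.264 filler NAL is simple enough to append by hand
when the driver offers no syntax for it. A minimal sketch, assuming the buffer
has at least alignment + 5 spare bytes and that the alignment is a reasonable
power of two (both assumptions, not something any particular driver mandates):

    #include <stdint.h>
    #include <string.h>

    // Append an H.264 filler-data NAL (nal_unit_type 12) so that the total
    // bitstream size becomes a multiple of 'alignment'. Returns the new size.
    static size_t pad_with_filler_nal(uint8_t *buf, size_t size, size_t alignment)
    {
        size_t pad = alignment - (size % alignment);

        if (pad == alignment)
            return size;         // already aligned
        while (pad < 5)          // start code (3) + NAL header (1) + trailing (1)
            pad += alignment;

        buf[size++] = 0x00;      // Annex-B start code
        buf[size++] = 0x00;
        buf[size++] = 0x01;
        buf[size++] = 0x0c;      // nal_ref_idc = 0, nal_unit_type = 12 (filler)
        memset(buf + size, 0xff, pad - 5);   // ff_byte run
        size += pad - 5;
        buf[size++] = 0x80;      // rbsp_trailing_bits()
        return size;
    }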
>>
>> We can discuss HEVC later on; for now, what do you think of this patch?
>
>
> What I meant was that in both D3D12 HEVC and VAAPI HEVC we enable some of the flags in the PPS according to the D3D12/VAAPI-specific structures queried from the driver, which might be unique and hard to share. Anyway, we can discuss that later. I don't see such an issue in H.264, so this patch is OK for me.
You can always override them individually, if you want to. This merely
fills out a structure.
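Roughly, a backend would do something like the sketch below and then patch up
whatever its driver query dictates. This is only illustrative: avctx/base_ctx
are assumed to be in scope as usual, the option values are placeholders, and
have_8x8_transform is a made-up stand-in for whatever capability flag
D3D12/VAAPI reports.

    FFHWBaseEncodeH264     units     = { 0 };
    FFHWBaseEncodeH264Opts unit_opts = {
        .flags                   = FF_HW_H264_SEI_TIMING,
        .cabac                   = 1,
        .fixed_qp_idr            = 26,
        .bit_rate                = avctx->bit_rate,
        .hrd_buffer_size         = avctx->rc_buffer_size,
        .initial_buffer_fullness = avctx->rc_initial_buffer_occupancy,
    };
    int err;

    // The caller supplies the macroblock dimensions used for level guessing
    // and for the pic_width/height_in_mbs fields.
    units.mb_width  = FFALIGN(avctx->width,  16) / 16;
    units.mb_height = FFALIGN(avctx->height, 16) / 16;

    err = ff_hw_base_encode_init_params_h264(base_ctx, avctx, &units, &unit_opts);
    if (err < 0)
        return err;

    // Then override individual fields, e.g. if the hardware query says the
    // 8x8 transform is unsupported:
    if (!have_8x8_transform)
        units.raw_pps.transform_8x8_mode_flag = 0;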
Regarding the patch getting lost: that's my fault. Patch 1 is indeed identical to the
previously sent patch.
Thanks for reviewing; I'll push the patchset tomorrow.