[FFmpeg-devel] [PATCH 31/92] Vulkan patchset part 1 - common code changes

James Almer jamrial at gmail.com
Tue Mar 14 13:51:00 EET 2023


On 3/14/2023 3:33 AM, Lynne wrote:
> The attached patchset is all the common code changes that my Vulkan patchset needs.
> 
> In total lines of code, this part has 425 additions and 131 deletions.
> Most of that is additions to HEVC parsing. Excluding them, the patchset is
> 200 lines of code added, which is manageable.
> 
> Apart from the parser changes, the following other changes have been
> made to the API:
> 
> AVHWAccel.free_frame_priv exists due to Vulkan's way of using VkImageView
> objects to wrap VkImage objects, which we need to free once they're no longer
> in use. Every other API uses the direct objects in decoding, but with Vulkan,
> they have to be represented by other objects.
> We also use it to free the slice offsets buffer.
> 
> AVHWAccel.flush exists due to Vulkan keeping decoder state, despite being
> stateless in theory. The decoder has to be notified of flushes in order to reset
> decoding slots and other data it needs, such as motion vectors and reference
> lists for AV1. Otherwise, inferring whether a flush has happened can be codec
> dependent, and hacky.
> 
> hwaccel_params_buf exists due to Vulkan's way of compiling SPS/PPS data
> into objects, making updating expensive. The change allows for hardware
> to only upload new parameters if they have been changed.
> It's insignificant for H264 and AV1, but HEVC's structures can reach 114
> megabytes of data that has to be uploaded, for a specially crafted input,
> which is enough to DoS an ingest.
> The data is set and managed by the hwaccel, but does need to be synchronized
> between different decoding threads, which this patch performs.
> 
> Finally, the HWACCEL_CAP_THREAD_SAFE flag is added because Vulkan is
> actually threadsafe and requires no serialization. It does work and it
> does make a difference: on average, it can increase performance
> by 20% for a typical HEVC stream that uses B-frames, depending on the
> number of threads and the number of decode queues.
> While hardware decoders are fast in general, certain vendors such as AMD
> can choke up while playing 8k video, and this patch can significantly help
> increase throughput.
> 
> In context, the changes can be viewed here:
> https://github.com/cyanreg/FFmpeg/tree/vulkan
> 
> The rest of the whole patchset is either rewrites, filter code, or
> the actual hardware accel code.
> 
> The patchset will not be pushed standalone, but as part of the greater
> Vulkan patchset.
> 
> 31 patches attached.

[...]

> diff --git a/libavcodec/av1dec.c b/libavcodec/av1dec.c
> index a80e37e33f..5a3c51e94a 100644
> --- a/libavcodec/av1dec.c
> +++ b/libavcodec/av1dec.c
> @@ -591,6 +591,8 @@ static void av1_frame_unref(AVCodecContext *avctx, AV1Frame *f)
>      f->spatial_id = f->temporal_id = 0;
>      memset(f->skip_mode_frame_idx, 0,
>             2 * sizeof(uint8_t));
> +    memset(f->ref_order_hint, 0,
> +           7 * sizeof(uint8_t));
>      memset(&f->film_grain, 0, sizeof(f->film_grain));
>      f->coded_lossless = 0;
>  }
> @@ -633,6 +635,9 @@ static int av1_frame_ref(AVCodecContext *avctx, AV1Frame *dst, const AV1Frame *s
>      memcpy(dst->skip_mode_frame_idx,
>             src->skip_mode_frame_idx,
>             2 * sizeof(uint8_t));
> +    memcpy(dst->ref_order_hint,
> +           src->ref_order_hint,
> +           7 * sizeof(uint8_t));
>      memcpy(&dst->film_grain,
>             &src->film_grain,
>             sizeof(dst->film_grain));
> @@ -1267,6 +1272,10 @@ static int av1_decode_frame(AVCodecContext *avctx, AVFrame *frame,
>              s->cur_frame.spatial_id  = header->spatial_id;
>              s->cur_frame.temporal_id = header->temporal_id;
>  
> +            for (int i = 0; i < 7; i++)
> +                s->cur_frame.ref_order_hint[i] =
> +                s->raw_frame_header->ref_order_hint[s->raw_frame_header->ref_frame_idx[i]];

Why do you need this in cur_frame? It's not a derived value, and the 
AV1RawFrameHeader struct is accessible in all AVHWAccel callbacks.

And I think you should be looking at 
s->ref[s->raw_frame_header->ref_frame_idx[i]].raw_frame_header->order_hint 
instead, too, which is the decoder state rather than the raw values in the 
current frame header (although they should match in theory).
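
To make the difference between the two lookups concrete, here is a minimal 
sketch. It uses simplified stand-in structs, not the real libavcodec 
definitions: RawFrameHeader, RefSlot, DecContext and both helper functions 
are hypothetical names made up for this example, and the fallback to 0 for 
an empty reference slot is an assumption, not something the decoder is 
known to do.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REF_FRAMES 8 /* reference slots kept by the decoder */
#define REFS_PER_FRAME 7 /* references a single frame can use */

/* Simplified stand-ins for illustration only (hypothetical types). */
typedef struct RawFrameHeader {
    uint8_t order_hint;                     /* order hint of this frame */
    uint8_t ref_frame_idx[REFS_PER_FRAME];  /* slot used by each reference */
    uint8_t ref_order_hint[NUM_REF_FRAMES]; /* raw per-slot hints in this header */
} RawFrameHeader;

typedef struct RefSlot {
    const RawFrameHeader *raw_frame_header; /* header of the frame held in this slot */
} RefSlot;

typedef struct DecContext {
    const RawFrameHeader *raw_frame_header; /* header of the current frame */
    RefSlot ref[NUM_REF_FRAMES];            /* decoder state */
} DecContext;

/* Variant used in the patch: raw values from the current frame header. */
static void hints_from_current_header(const DecContext *s,
                                      uint8_t out[REFS_PER_FRAME])
{
    const RawFrameHeader *h = s->raw_frame_header;

    for (int i = 0; i < REFS_PER_FRAME; i++)
        out[i] = h->ref_order_hint[h->ref_frame_idx[i]];
}

/* Suggested variant: order hint stored with each reference slot. */
static void hints_from_decoder_state(const DecContext *s,
                                     uint8_t out[REFS_PER_FRAME])
{
    const RawFrameHeader *h = s->raw_frame_header;

    for (int i = 0; i < REFS_PER_FRAME; i++) {
        const RawFrameHeader *ref = s->ref[h->ref_frame_idx[i]].raw_frame_header;

        /* Assumption for this sketch: an empty slot yields 0. */
        out[i] = ref ? ref->order_hint : 0;
    }
}
```

Either helper can be called from a hwaccel callback, since both only need 
the decoder context, so nothing has to be cached in cur_frame.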

> +
>              if (avctx->hwaccel && s->cur_frame.f->buf[0]) {
>                  ret = avctx->hwaccel->start_frame(avctx, unit->data,



