[FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add derive-device function which searches for existing devices in both directions

Mark Thompson sw at jkqxz.net
Mon May 2 01:00:59 EEST 2022



On 30/04/2022 23:42, Soft Works wrote:
> 
> 
>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Mark
>> Thompson
>> Sent: Saturday, April 30, 2022 11:39 PM
>> To: ffmpeg-devel at ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avutils/hwcontext: add derive-
>> device function which searches for existing devices in both directions
>>
>> On 30/04/2022 21:07, softworkz wrote:
>>> From: softworkz <softworkz at hotmail.com>
>>>
>>> The test /libavutil/tests/hwdevice checks that when deriving a
>> device
>>> from a source device and then deriving back to the type of the
>> source
>>> device, the result is matching the original source device, i.e. the
>>> derivation mechanism doesn't create a new device in this case.
>>>
>>> Previously, this test usually passed, but only due to two different
>>> kinds of flaws:
>>>
>>> 1. The test covers only a single level of derivation (and back)
>>>
>>> It derives device Y from device X and then Y back to the type of X
>> and
>>> checks whether the result matches X.
>>>
>>> What it doesn't check for, are longer chains of derivation like:
>>>
>>> CUDA1 > OpenCL2 > CUDA3 and then back to OpenCL4
>>>
>>> In that case, the second derivation returns the first device (CUDA3
>> ==
>>> CUDA1), but when deriving OpenCL4, hwcontext.c was creating a new
>>> OpenCL4 context instead of returning OpenCL2, because there was no
>> link
>>> from CUDA1 to OpenCL2 (only backwards from OpenCL2 to CUDA1)
>>>
>>> If the test would check for two levels of derivation, it would have
>>> failed.
>>>
>>> This patch fixes those (yet untested) cases by introducing forward
>>> references (derived_device) in addition to the existing back
>> references
>>> (source_device).
>>>
>>> 2. hwcontext_qsv didn't properly set the source_device
>>>
>>> In case of QSV, hwcontext_qsv creates a source context internally
>>> (vaapi, dxva2 or d3d11va) without calling
>> av_hwdevice_ctx_create_derived
>>> and without setting source_device.
>>>
>>> This way, the hwcontext test ran successfully, but what practically
>>> happened, was that - for example - deriving vaapi from qsv didn't
>> return
>>> the original underlying vaapi device and a new one was created
>> instead:
>>> Exactly what the test is intended to detect and prevent. It just
>>> couldn't do so, because the original device was hidden (= not set as
>> the
>>> source_device of the QSV device).
>>>
>>> This patch properly makes these settings and fixes all derivation
>>> scenarios.
>>>
>>> (at a later stage, /libavutil/tests/hwdevice should be extended to
>> check
>>> longer derivation chains as well)
>>>
>>> Signed-off-by: softworkz <softworkz at hotmail.com>
>>> ---
>>>    libavutil/hwcontext.c          | 72
>> +++++++++++++++++++++++++++++++---
>>>    libavutil/hwcontext.h          | 20 ++++++++++
>>>    libavutil/hwcontext_internal.h |  6 +++
>>>    libavutil/hwcontext_qsv.c      | 13 ++++--
>>>    4 files changed, 102 insertions(+), 9 deletions(-)
>>
>> Yeah, something like this seems fair.
> 
> :-)
> 
>> Some general comments:
>>
>> * Whenever you use derivation it creates a circular reference, so the
>> instances can never be freed in the current implementation.
> 
> It's been a while...I thought there wasn't, but looking at it now,
> it seems you are right.
> 
> How would you solve it?

Hmm.  With this design the source and the derived device each need to be able to keep the other alive, so a strict reference-counting structure isn't going to work.  Given that, I guess it has to do something else, but I've no idea what.
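
To make the cycle concrete, here is a minimal sketch using a toy ref-counted struct.  Everything in it (ToyDevice, toy_ref, toy_unref, toy_cycle_demo) is a hypothetical stand-in, not the real AVBufferRef machinery; it only assumes that both the back reference and the new forward reference are strong references.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a ref-counted device; NOT the real AVBufferRef. */
typedef struct ToyDevice {
    int refcount;
    struct ToyDevice *source;   /* strong back reference, like source_device */
    struct ToyDevice *derived;  /* strong forward reference, like derived_devices[] */
} ToyDevice;

int toy_live_devices = 0;       /* number of devices not yet freed */

ToyDevice *toy_alloc(void)
{
    ToyDevice *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->refcount = 1;
    toy_live_devices++;
    return d;
}

ToyDevice *toy_ref(ToyDevice *d)
{
    assert(d != NULL);
    d->refcount++;
    return d;
}

void toy_unref(ToyDevice *d)
{
    if (!d || --d->refcount > 0)
        return;
    /* Last reference gone: drop the references this device holds. */
    toy_unref(d->source);
    toy_unref(d->derived);
    free(d);
    toy_live_devices--;
}

/* Returns how many devices are still alive after the caller has
 * dropped every reference it owns, or -1 on allocation failure. */
int toy_cycle_demo(void)
{
    ToyDevice *src = toy_alloc();
    ToyDevice *dst = toy_alloc();

    if (!src || !dst) {
        toy_unref(src);
        toy_unref(dst);
        return -1;
    }

    dst->source  = toy_ref(src);  /* derived device keeps its source alive */
    src->derived = toy_ref(dst);  /* source now also keeps the derived one alive */

    toy_unref(dst);               /* caller drops its reference to dst */
    toy_unref(src);               /* caller drops its reference to src */

    return toy_live_devices;      /* both are still held by each other */
}
```

In this model neither count ever reaches zero once the caller lets go, because each device still holds one reference to the other.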

>> * The thread-safety properties of the hwcontext API have been lost -
>> you can no longer operate on devices independently across threads
>> (insofar as the underlying API allows that).
>>     Maybe there is an argument that derivation is something which
>> should happen early on and therefore documenting it as thread-unsafe
>> is ok, but when hwupload/hwmap can use it inside filtergraphs that
>> just isn't going to happen (and will be violated in the FFmpeg utility
>> if filters get threaded, as is being worked on).
> 
> From my understanding there will be a single separate thread which
> handles all filtergraph operations.
> I don't think it would even be possible (without massive changes)
> to arbitrate filter processing in parallel.
> But even if this would be implemented: the filtergraph setup (init,
> uninit, query_formats, etc.) would surely happen on a single thread.

The ffmpeg utility creates filtergraphs dynamically when the first frame is available from their inputs, so I don't see why you wouldn't allow multiple of them to be created in parallel in that case.

If you create all devices at the beginning and then give references to them to the various filters which need them (so never manipulate devices dynamically within the graph) then it would be ok, but I think you've already rejected that approach.

>> * I'm not sure that it is reasonable to ignore options.  If an
>> unrelated component derived a device before you with special options,
>> you might get that device even if you have incompatible different
>> options.
> 
> I understand what you mean, but this is outside the scope of
> this patchset, because when you would want to do this, it
> would need to be implemented for derivation in general, not
> in this patchset which only adds reverse-search to the
> existing derivation functionality.

I'm not sure what you mean by that?  The feature already exists; here is a concrete example of where you would get the wrong result:

Start with VAAPI device A.

Component P derives Vulkan device B with some extension options X.

Component Q wants the same device as P, so it derives again with extension options X and gets B.

Everything works fine for a while.

Later, unrelated component R is inserted before P and Q.  It wants a Vulkan device C with extension options Y, so it derives that.

Now component Q is broken because it gets C instead of B and has the wrong extensions enabled.
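
The sequence above can be sketched with a toy model.  The names here (ToyVaapi, ToyVulkan, toy_create_derived, toy_get_or_create_derived) are made up for illustration, and the model assumes the behaviour as I read the patch: every derivation registers itself in the source's per-type slot if that slot is still empty, and the lookup ignores options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; they only model "one slot per derived type". */
typedef struct ToyVulkan {
    const char *extensions;      /* options the device was created with */
} ToyVulkan;

typedef struct ToyVaapi {
    ToyVulkan *first_vulkan;     /* first Vulkan device derived from this one */
} ToyVaapi;

/* Always creates a new device; registers it only if the slot is empty. */
ToyVulkan *toy_create_derived(ToyVaapi *src, ToyVulkan *storage,
                              const char *extensions)
{
    assert(src != NULL && storage != NULL);
    storage->extensions = extensions;
    if (!src->first_vulkan)
        src->first_vulkan = storage;
    return storage;
}

/* Returns the registered device if there is one; the requested
 * extensions are not compared against it. */
ToyVulkan *toy_get_or_create_derived(ToyVaapi *src, ToyVulkan *storage,
                                     const char *extensions)
{
    assert(src != NULL && storage != NULL);
    if (src->first_vulkan)
        return src->first_vulkan;
    return toy_create_derived(src, storage, extensions);
}

/* Returns 1 if component Q ends up with a device that has extensions X. */
int toy_q_gets_extensions_x(int r_inserted_first)
{
    ToyVaapi a = { NULL };                       /* VAAPI device A */
    ToyVulkan slot_r, slot_p, slot_q;
    ToyVulkan *q;

    if (r_inserted_first)
        toy_create_derived(&a, &slot_r, "Y");    /* R derives C with Y */
    toy_create_derived(&a, &slot_p, "X");        /* P derives B with X */
    q = toy_get_or_create_derived(&a, &slot_q, "X");  /* Q wants B */

    return strcmp(q->extensions, "X") == 0;
}
```

Without R, Q gets B.  With R inserted first, the slot already holds C, so Q is handed a device created with Y even though it asked for X.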

>>> diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
>>> index ab9ad3703e..1aea7dd5c3 100644
>>> --- a/libavutil/hwcontext.c
>>> +++ b/libavutil/hwcontext.c
>>> @@ -123,6 +123,7 @@ static const AVClass hwdevice_ctx_class = {
>>>    static void hwdevice_ctx_free(void *opaque, uint8_t *data)
>>>    {
>>>        AVHWDeviceContext *ctx = (AVHWDeviceContext*)data;
>>> +    int i;
>>>
>>>        /* uninit might still want access the hw context and the user
>>>         * free() callback might destroy it, so uninit has to be
>> called first */
>>> @@ -133,6 +134,8 @@ static void hwdevice_ctx_free(void *opaque,
>> uint8_t *data)
>>>            ctx->free(ctx);
>>>
>>>        av_buffer_unref(&ctx->internal->source_device);
>>> +    for (i = 0; i < AV_HWDEVICE_TYPE_NB; i++)
>>> +        av_buffer_unref(&ctx->internal->derived_devices[i]);
>>>
>>>        av_freep(&ctx->hwctx);
>>>        av_freep(&ctx->internal->priv);
>>> @@ -644,10 +647,31 @@ fail:
>>>        return ret;
>>>    }
>>>
>>> -int av_hwdevice_ctx_create_derived_opts(AVBufferRef **dst_ref_ptr,
>>> -                                        enum AVHWDeviceType type,
>>> -                                        AVBufferRef *src_ref,
>>> -                                        AVDictionary *options, int
>> flags)
>>> +static AVBufferRef* find_derived_hwdevice_ctx(AVBufferRef *src_ref,
>> enum AVHWDeviceType type)
>>> +{
>>> +    AVBufferRef *tmp_ref;
>>> +    AVHWDeviceContext *src_ctx;
>>> +    int i;
>>> +
>>> +    src_ctx = (AVHWDeviceContext*)src_ref->data;
>>> +    if (src_ctx->type == type)
>>> +        return src_ref;
>>> +
>>> +    for (i = 0; i < AV_HWDEVICE_TYPE_NB; i++)
>>> +        if (src_ctx->internal->derived_devices[i]) {
>>> +            tmp_ref = find_derived_hwdevice_ctx(src_ctx->internal->derived_devices[i], type);
>>> +            if (tmp_ref)
>>> +                return tmp_ref;
>>> +        }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>> +static int hwdevice_ctx_create_derived(AVBufferRef **dst_ref_ptr,
>>> +                                       enum AVHWDeviceType type,
>>> +                                       AVBufferRef *src_ref,
>>> +                                       AVDictionary *options, int
>> flags,
>>> +                                       int get_existing)
>>>    {
>>>        AVBufferRef *dst_ref = NULL, *tmp_ref;
>>>        AVHWDeviceContext *dst_ctx, *tmp_ctx;
>>> @@ -667,6 +691,18 @@ int
>> av_hwdevice_ctx_create_derived_opts(AVBufferRef **dst_ref_ptr,
>>>            tmp_ref = tmp_ctx->internal->source_device;
>>>        }
>>>
>>> +    if (get_existing) {
>>> +        tmp_ref = find_derived_hwdevice_ctx(src_ref, type);
>>> +        if (tmp_ref) {
>>> +            dst_ref = av_buffer_ref(tmp_ref);
>>> +            if (!dst_ref) {
>>> +                ret = AVERROR(ENOMEM);
>>> +                goto fail;
>>> +            }
>>> +            goto done;
>>> +        }
>>> +    }
>>> +
>>>        dst_ref = av_hwdevice_ctx_alloc(type);
>>>        if (!dst_ref) {
>>>            ret = AVERROR(ENOMEM);
>>> @@ -688,6 +724,13 @@ int
>> av_hwdevice_ctx_create_derived_opts(AVBufferRef **dst_ref_ptr,
>>>                        ret = AVERROR(ENOMEM);
>>>                        goto fail;
>>>                    }
>>> +                if (!tmp_ctx->internal->derived_devices[type]) {
>>
>> I wonder whether you only want to do this when the user made the new
>> call, not the old one?
>>
>> The semantics would perhaps feel clearer as "get or create the shared
>> derived device" rather than "get the first device derived or create a
>> new one if not".
> 
> I've been there for a moment, and then I thought that when the API
> consumer would mix API calls, e.g. first without 'get' and second
> with 'get', then the second call would not produce the expected
> result.
> 
> Let me know what you think, I have no strong opinion about this.

Can you explain your example further?

Making the shared device always opt-in seems better to me to avoid unexpected interactions.  (Like in the above example where a non-sharing component is added before everything else - when sharing is implicit this ends up being the first device derived and gets shared with others.)
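
As a rough sketch of what opt-in could look like (all names here are hypothetical, and this is only one way to arrange it): the plain create call never publishes its result, and only the get-or-create call reads or fills the shared slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types for illustration only. */
typedef struct OptinDerived {
    const char *extensions;
} OptinDerived;

typedef struct OptinSource {
    OptinDerived *shared;        /* filled only by the sharing call */
} OptinSource;

/* Plain derivation: always a private device, never published. */
OptinDerived *optin_create(OptinSource *src, OptinDerived *storage,
                           const char *extensions)
{
    assert(src != NULL && storage != NULL);
    storage->extensions = extensions;
    return storage;
}

/* Sharing derivation: returns the shared device, creating and
 * publishing it on first use. */
OptinDerived *optin_get_or_create(OptinSource *src, OptinDerived *storage,
                                  const char *extensions)
{
    assert(src != NULL && storage != NULL);
    if (!src->shared) {
        storage->extensions = extensions;
        src->shared = storage;
    }
    return src->shared;
}

/* Returns 1 if Q shares P's device even though a non-sharing
 * component R derived its own device first. */
int optin_q_shares_with_p(void)
{
    OptinSource a = { NULL };
    OptinDerived slot_r, slot_p, slot_q;
    OptinDerived *c = optin_create(&a, &slot_r, "Y");         /* R, private */
    OptinDerived *b = optin_get_or_create(&a, &slot_p, "X");  /* P, shared */
    OptinDerived *q = optin_get_or_create(&a, &slot_q, "X");  /* Q, shared */

    return q == b && q != c;
}
```

This does not address mismatched options between two callers which both opt in; that is a separate question.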

>>> +                    tmp_ctx->internal->derived_devices[type] =
>> av_buffer_ref(dst_ref);
>>> +                    if (!tmp_ctx->internal->derived_devices[type])
>> {
>>> +                        ret = AVERROR(ENOMEM);
>>> +                        goto fail;
>>> +                    }
>>> +                }
>>>                    ret = av_hwdevice_ctx_init(dst_ref);
>>>                    if (ret < 0)
>>>                        goto fail;
>>> @@ -712,12 +755,29 @@ fail:
>>>        return ret;
>>>    }
>>>
>>> +int av_hwdevice_ctx_create_derived_opts(AVBufferRef **dst_ref_ptr,
>>> +                                        enum AVHWDeviceType type,
>>> +                                        AVBufferRef *src_ref,
>>> +                                        AVDictionary *options, int
>> flags)
>>> +{
>>> +    return hwdevice_ctx_create_derived(dst_ref_ptr, type, src_ref,
>>> +                                       options, flags, 0);
>>> +}
>>> +
>>> +int av_hwdevice_ctx_get_or_create_derived(AVBufferRef
>> **dst_ref_ptr,
>>> +                                          enum AVHWDeviceType type,
>>> +                                          AVBufferRef *src_ref, int
>> flags)
>>> +{
>>> +    return hwdevice_ctx_create_derived(dst_ref_ptr, type, src_ref,
>>> +                                       NULL, flags, 1);
>>> +}
>>> +
>>>    int av_hwdevice_ctx_create_derived(AVBufferRef **dst_ref_ptr,
>>>                                       enum AVHWDeviceType type,
>>>                                       AVBufferRef *src_ref, int
>> flags)
>>>    {
>>> -    return av_hwdevice_ctx_create_derived_opts(dst_ref_ptr, type,
>> src_ref,
>>> -                                               NULL, flags);
>>> +    return hwdevice_ctx_create_derived(dst_ref_ptr, type, src_ref,
>>> +                                       NULL, flags, 0);
>>>    }
>>>
>>>    static void ff_hwframe_unmap(void *opaque, uint8_t *data)
>>> diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h
>>> index c18b7e1e8b..3785811f98 100644
>>> --- a/libavutil/hwcontext.h
>>> +++ b/libavutil/hwcontext.h
>>> @@ -37,6 +37,7 @@ enum AVHWDeviceType {
>>>        AV_HWDEVICE_TYPE_OPENCL,
>>>        AV_HWDEVICE_TYPE_MEDIACODEC,
>>>        AV_HWDEVICE_TYPE_VULKAN,
>>> +    AV_HWDEVICE_TYPE_NB,          ///< number of hw device types
>>
>> Can we avoid adding a non-constant constant to the user API?
>>
>> av_hwdevice_iterate_types() exists for this purpose.
> 
> There was a reason why this can't be used. IIRC, it was that the
> device count needs to be known at some other place where
> av_hwdevice_iterate_types() is not available.
> 
> Please see the previous discussion with Hendrik about this.

Where is that?  The only place I see this used is the array of derived devices.

Two alternative implementations without the constant spring to mind:

* A shorter array indexed by position in the av_hwdevice_iterate_types() sequence, which would not have empty entries for devices not compatible with the current platform.

* An array of type/reference pairs.
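
A minimal sketch of the second alternative, assuming a growable list stored in the internal context.  The struct and function names are hypothetical, and plain int / void * stand in for enum AVHWDeviceType and AVBufferRef * so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* One derived device: its type plus a reference to it. */
typedef struct ToyDerivedEntry {
    int   type;                  /* stand-in for enum AVHWDeviceType */
    void *ref;                   /* stand-in for AVBufferRef * */
} ToyDerivedEntry;

typedef struct ToyDerivedList {
    ToyDerivedEntry *entries;
    size_t           nb_entries;
} ToyDerivedList;

/* Linear search; the list holds at most one entry per type. */
void *toy_derived_find(const ToyDerivedList *list, int type)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->nb_entries; i++)
        if (list->entries[i].type == type)
            return list->entries[i].ref;
    return NULL;
}

/* Returns 1 if added, 0 if an entry of that type already exists,
 * -1 on allocation failure. */
int toy_derived_add(ToyDerivedList *list, int type, void *ref)
{
    ToyDerivedEntry *tmp;

    assert(list != NULL);
    if (toy_derived_find(list, type))
        return 0;

    tmp = realloc(list->entries, (list->nb_entries + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;

    list->entries = tmp;
    list->entries[list->nb_entries].type = type;
    list->entries[list->nb_entries].ref  = ref;
    list->nb_entries++;
    return 1;
}

void toy_derived_free(ToyDerivedList *list)
{
    assert(list != NULL);
    free(list->entries);
    list->entries    = NULL;
    list->nb_entries = 0;
}
```

The list only grows with the types actually derived, so nothing needs to know the total number of device types.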

>>>    };
>>>
>>>    typedef struct AVHWDeviceInternal AVHWDeviceInternal;
>>> @@ -328,6 +329,25 @@ int av_hwdevice_ctx_create_derived(AVBufferRef
>> **dst_ctx,
>>>                                       enum AVHWDeviceType type,
>>>                                       AVBufferRef *src_ctx, int
>> flags);
>>>
>>> +/**
>>> + * Create a new device of the specified type from an existing
>> device.
>>> + *
>>> + * This function performs the same action as
>> av_hwdevice_ctx_create_derived,
>>> + * however, if a derived device of the specified type already
>> exists,
>>> + * it returns the existing instance.
>>> + *
>>> + * @param dst_ctx On success, a reference to the newly-created
>>> + *                AVHWDeviceContext.
>>> + * @param type    The type of the new device to create.
>>> + * @param src_ctx A reference to an existing AVHWDeviceContext
>> which will be
>>> + *                used to create the new device.
>>> + * @param flags   Currently unused; should be set to zero.
>>> + * @return        Zero on success, a negative AVERROR code on
>> failure.
>>> + */
>>> +int av_hwdevice_ctx_get_or_create_derived(AVBufferRef **dst_ctx,
>>> +                                          enum AVHWDeviceType type,
>>> +                                          AVBufferRef *src_ctx, int
>> flags);
>>
>> Include the options here?  Not having them in the original call was an
>> unfortunate omission, I think it would be better to include them here
>> as well even if you don't use them immediately.
> 
> I didn't see the options being used anywhere, that's why I thought
> it would be the other way round (options param overload exists
> for legacy/compatibility reasons).
> 
> But sure, I'll change it then to include options.

- Mark

