[FFmpeg-devel] [PATCH] Added support for MB_INFO

Stefano Sabatini stefasab at gmail.com
Sat Jun 18 02:41:24 EEST 2022


On date Friday 2022-06-17 17:10:25 +0200, Timo Rothenpieler wrote:
> On 17.06.2022 12:59, Carotti, Elias wrote:
[...]
> > Again, you don't have to pass the ownership, and in fact in my use case
> > I do not pass it since I actually recycle and update the same buffer
> > for subsequent frames. If you do intentionally pass the ownership you
> > need to be aware of what you are doing. As I said, I see a leak only in
> > case of a programming error.
> > Maybe we could explicitly state it in the comment section in mb_info.h:
> > if you set the deallocator you're relinquishing ownership of the
> > buffer.
> 

> I'm not sure if that's something you can sensibly do with side data?
> What if it gets buffered, copied, and so on?

This semantics is specific to libx264, but can be possibly extended to
other encoders.

AFAIU libx264 keeps the data in memory until it needs it, and it
finally calls the free when it is finished with the data. I think the
concerns are about:
1. How does the consumer (libx264 or another encoder) knows that the
   data is still valid?

This is a problem of the application injecting the data, it might use
reference counting, the deallocator in this case will unref the data
on the user application.

2. How does the application knows that the data is still in use by the
   consumer (the encoder)?

It doesn't have to, since this is handled by the consumer, when it is
finished it will call the deallocator.

> > Plus, there's one more flag (b_mb_info_update) in libx264 which allows
> > to pass back information to the caller about which macroblocks among
> > the flagged ones were actually encoded as P_SKIP. I did not add support
> > for that though but in the future somebody may want to.
> 
> Yes, it's very x264 specific.
> But the side data is generic. If for some reason x264 does not process a
> frame, or any other encoder ends up getting used, the data will leak if it
> relied on x264 to free it.

So, let's make up a use case for this.

The application injects the data, it might be for example a filter
computing the MB information and it might be unaware of what encoder
is used down in the processing pipeline.

For the simplest case the application knows that libx264 is the used
encoder, and assumes that all data will be consumed by libx264.
 
For the x264 API the problem does not exist, with generic side data
the problem can raise since side data must be agnostic of its final
destination.

So I agree, for the generic use case we can have a leak.

I see some possible ways to solve this:

1. We document the limitation and assume that when the MB metadata is
injected down in the pipeline there is a single consumer, the
application will leak otherwise. This works for the simple application
-> libx264 pipeline, but it can't work for a more generic case.

2. We provide a mechanism *at the library level* to cleanup the
unconsumed data. libavcodec sees that the encoder does not handle that
kind of user data (it might be a capability flag) and automatically
cleanups the data in that case. This would be still underkill though,
since the AVFrame is not necessarily injested into an encoder (for
example the MB info could be rendered or processed by an application
or by a filter). This means that the AVFrame unref needs to call this
side data deallocator (and that the consumer needs to "steal" the data
from the AVFrame before it is deallocated by the AVFrame unref). But
then what happens in case the AVFrame is shared by multiple endpoints
(e.g. multiple encoders)? Then you need a ref count for the side data
itself.

I don't know if this is something which is feasible with the current
side data API, and if it could be extended to support this kind of
semantics.

3. The simpler approach, we copy the data into MB info and deallocate
when we free it, but this might have a performance penalty because of
the unneeded memcpy.

> > In principle I could've put the buffer into the side information  and
> > not just pass a pointer to it but I thought that it would require
> > adding more semantics which would imply a stronger dependency on
> > libx264.
> > Right now, mb_info is a vector of uint8_t flags and the only possible
> > value - to date - is X264_MBINFO_CONSTANT. What if, say, one day
> > libx264 decides the buffer has to be a vector of uint16_t or still
> > uint8_t but the flags are packed? On the other hand, this, AFAIK, is
> > only supported by libx264. Other codecs may want to choose a different
> > semantic for a similar field and the only possibility to make it
> > generic is letting the caller handling the low level details.
> 
> I'm not aware of any other side data with a similar semantic. And I'm really
> not sure if it's sane or even valid to do it like that.
> Can't the data be entirely contained within the side data?

We can assume the same semantics of libx264 (at the moment the only
use case for this).


More information about the ffmpeg-devel mailing list