[FFmpeg-devel] [PATCH] Optimization: support for libx264's mb_info

Mon Jun 26 12:50:59 EEST 2023

On Sat, 2023-06-24 at 13:01 +0200, Anton Khirnov wrote:
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender
> and know the content is safe.
> 
> 
> 
> Quoting Carotti, Elias (2023-06-22 19:23:05)
> > On Thu, 2023-06-22 at 10:44 +0200, Anton Khirnov wrote:
> > > CAUTION: This email originated from outside of the organization.
> > > Do
> > > not click links or open attachments unless you can confirm the
> > > sender
> > > and know the content is safe.
> > > 
> > > 
> > > 
> > > Quoting Carotti, Elias (2023-06-21 17:53:09)
> > > > +
> > > > +    /**
> > > > +     * Provide macro block encoder-specific hinting
> > > > information
> > > > for the encoder
> > > > +     * processing.  It can be used to pass information about
> > > > which
> > > > macroblock
> > > > +     * can be skipped because it hasn't changed from the
> > > > corresponding one in
> > > > +     * the previous frame. This is useful for applications
> > > > which
> > > > know in
> > > > +     * advance this information to speed up real-time
> > > > encoding.
> > > > Currently only
> > > > +     * used by libx264.
> > > 
> > > I'd avoid any such claims here, because this comment will
> > > certainly
> > > not
> > > be kept up to date.
> > 
> > 
> > Agreed. It was more a statement than a claim, since I only
> > implemented
> > that :-)
> > 
> > > 
> > > > +/**
> > > > + * Allocate memory for a vector of AVVideoRect in the given
> > > > AVFrame
> > > > + * {@code frame} as AVFrameSideData of type
> > > > AV_FRAME_DATA_VIDEO_HINT_INFO.
> > > > + * The side data contains a list of rectangles for the
> > > > portions of
> > > > the frame
> > > > + * which changed from the last encoded one (and the remainder
> > > > are
> > > > assumed to be
> > > > + * changed), or, alternately (depending on the type parameter)
> > > > the
> > > > unchanged
> > > > + * ones (and the remanining ones are those which changed).
> > > > + * Macroblocks will thus be hinted either to be P_SKIP-ped or
> > > > go
> > > > through the
> > > > + * regular encoding procedure.
> > > > + */
> > > > +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> > > > +                                            AVVideoRect
> > > > *rects,
> > > > +                                            size_t num_rects,
> > > > +                                            AVVideoHintType
> > > > type);
> > > > +
> > > > +AVVideoHint *av_video_hint_alloc(AVVideoRect *rects,
> > > > +                                 size_t nb_rects,
> > > > +                                 AVVideoHintType type,
> > > > +                                 size_t *out_size);
> > > 
> > > If AVVideoHint is extended in the future, you will have a weird
> > > situation where some fields are set by the allocation function,
> > > while
> > > others have to be set manually by the caller.
> > > 
> > > You're also assuming the caller has an explicit array of
> > > AVVideoRect,
> > > while they may actually want to fill them in through some other
> > > means.
> > 
> > I agree on the first issue, and yes, it would be wiser to split the
> > allocation function from a the setting function.
> > Would a simple append_rectangles (the name may be different) API
> > work
> > for you?
> 
> I don't see why does there need to be a setting function. The caller
> can
> directly assign or memcpy the values, it seems just as easy to me.

We can do whatever you want. However I am not clear on how that<br>
would work.

We could have a side data creation api with the standard parameters and
another method to allocate memory so that ownership is kept by
libavutil returns a pointer to the rectangles (with bounds checking and
so on on the caller):

av_video_hint_create_side_data(AVFrame *frame, AVVideoHintType type);

AVVideoRect* av_video_hint_set_number_of_rectangles(
					AVVideoHint *video_hint,
					size_t n_rects,
					AVVideoHintType changed_flag);
(Names can change I just want to convey a possible api).

Would that work for you?

Or, do you prefer a creation api which already allocates memory and
sets the number of rectangles but doesn't copy them and that's
responsibility on the caller? 
What I'd like in this latter case is that (like now) memory would be
flat with no need for specific custom deallocators.
Something along the lines:

AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
					    size_t n_rects,
					    AVVideoHintType type);

AVVideoRect *av_video_hint_get_rects(AVVideoHint *video_hint);

Third option: side information creation api and the caller has to
alloc/realloc the rectangle buffer and hand out ownership to libavutil,
but I guess this is the worst one for various reasons.

I do not see any further option.

> 
> > I am not clear on the second issue you raise though: the thing is
> > that
> > this side information is only needed per frame and before encoding,
> > so
> > I do not see a use case where you keep adding rectangles to this
> > side
> > data. The use case I see is where you accumulate the rectangles and
> > then feed them to the encoding function and forget about them,
> > however,
> > again, if we add an append_rectangles we could easily extend it to
> > the
> > use case you're hinting at.
> 
> The case I have in mind is not modifying the data at some later
> point,
> but rather that the user has the data in some more complex structure.
> So
> in order to fill it here, they'd have to explicitly construct an
> array
> of AVVideoRect, which is an extra step.
> 
> > 
> > > 
> > > Finally, it still seems to me this is largely duplicating
> > > AVVideoEncParams and you could get your functionality by adding
> > > your
> > > AVVideoHintType there.
> > > 
> > 
> > I disagree on this last point. My first idea to avoid duplicating
> > or
> > adding unnecessary code was indeed to extend AVVideoEncParams.
> > However,
> > (please correct me if I am wrong,) to my undestanding the
> > AVVideoEncParams are currently only used at the *decoder* side to
> > convey the encoding parameters (extracted from the bitstream) used
> > by
> > the encoder to produce a stream while here we want to work the
> > other
> > way around: at the encoder's side to provide hints on how to encode
> > a
> > frame but won't affect the bitstream (aside from inducing faster
> > P_SKIPs generation.) and won't be known at the decoder's side.
> > 
> > So it seems to me it's a different semantics for which it's better
> > to
> > have an appropriate side information.
> 
> AVVideoEncParams describes the block-level parameters of an encoded
> bitstream. Yes, it is currently used to export coded bitstream
> information from decoders. But there is no restriction or assumption
> in
> the API itself that it will be used in this way and it can just as
> well
> describe the information you want a coded bitstream to have.
> 

Right,  but in this case it's not something which is going to end up
into the bitstream, since this is *not* and api to set some bitstream
properties but really just provide some information which the encoder
can exploit.

So it's definitely a different semantics and I do not think it fits
well into AVVideoEncParams (frankly I think it's wrong), however if we
are clear on this and that's what you really want and that's what we
need to do to get the patch in, well I have no issue changing the code.

Best
Elias

> --
> Anton Khirnov
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".

NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico