[FFmpeg-devel] [PATCH] Optimization: support for libx264's mb_info

Sat Jun 24 14:01:31 EEST 2023

Quoting Carotti, Elias (2023-06-22 19:23:05)
> On Thu, 2023-06-22 at 10:44 +0200, Anton Khirnov wrote:
> > CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender
> > and know the content is safe.
> > 
> > 
> > 
> > Quoting Carotti, Elias (2023-06-21 17:53:09)
> > > +
> > > +    /**
> > > +     * Provide macro block encoder-specific hinting information
> > > for the encoder
> > > +     * processing.  It can be used to pass information about which
> > > macroblock
> > > +     * can be skipped because it hasn't changed from the
> > > corresponding one in
> > > +     * the previous frame. This is useful for applications which
> > > know in
> > > +     * advance this information to speed up real-time encoding. 
> > > Currently only
> > > +     * used by libx264.
> > 
> > I'd avoid any such claims here, because this comment will certainly
> > not
> > be kept up to date.
> 
> 
> Agreed. It was more a statement than a claim, since I only implemented
> that :-)
> 
> > 
> > > +/**
> > > + * Allocate memory for a vector of AVVideoRect in the given
> > > AVFrame
> > > + * {@code frame} as AVFrameSideData of type
> > > AV_FRAME_DATA_VIDEO_HINT_INFO.
> > > + * The side data contains a list of rectangles for the portions of
> > > the frame
> > > + * which changed from the last encoded one (and the remainder are
> > > assumed to be
> > > + * changed), or, alternately (depending on the type parameter) the
> > > unchanged
> > > + * ones (and the remanining ones are those which changed).
> > > + * Macroblocks will thus be hinted either to be P_SKIP-ped or go
> > > through the
> > > + * regular encoding procedure.
> > > + */
> > > +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> > > +                                            AVVideoRect *rects,
> > > +                                            size_t num_rects,
> > > +                                            AVVideoHintType type);
> > > +
> > > +AVVideoHint *av_video_hint_alloc(AVVideoRect *rects,
> > > +                                 size_t nb_rects,
> > > +                                 AVVideoHintType type,
> > > +                                 size_t *out_size);
> > 
> > If AVVideoHint is extended in the future, you will have a weird
> > situation where some fields are set by the allocation function, while
> > others have to be set manually by the caller.
> > 
> > You're also assuming the caller has an explicit array of AVVideoRect,
> > while they may actually want to fill them in through some other
> > means.
> 
> I agree on the first issue, and yes, it would be wiser to split the
> allocation function from a the setting function. 
> Would a simple append_rectangles (the name may be different) API work
> for you?

I don't see why does there need to be a setting function. The caller can
directly assign or memcpy the values, it seems just as easy to me.

> I am not clear on the second issue you raise though: the thing is that
> this side information is only needed per frame and before encoding, so
> I do not see a use case where you keep adding rectangles to this side
> data. The use case I see is where you accumulate the rectangles and
> then feed them to the encoding function and forget about them, however,
> again, if we add an append_rectangles we could easily extend it to the
> use case you're hinting at.

The case I have in mind is not modifying the data at some later point,
but rather that the user has the data in some more complex structure. So
in order to fill it here, they'd have to explicitly construct an array
of AVVideoRect, which is an extra step.

> 
> > 
> > Finally, it still seems to me this is largely duplicating
> > AVVideoEncParams and you could get your functionality by adding your
> > AVVideoHintType there.
> > 
> 
> I disagree on this last point. My first idea to avoid duplicating or
> adding unnecessary code was indeed to extend AVVideoEncParams. However,
> (please correct me if I am wrong,) to my undestanding the
> AVVideoEncParams are currently only used at the *decoder* side to
> convey the encoding parameters (extracted from the bitstream) used by
> the encoder to produce a stream while here we want to work the other
> way around: at the encoder's side to provide hints on how to encode a
> frame but won't affect the bitstream (aside from inducing faster
> P_SKIPs generation.) and won't be known at the decoder's side.
> 
> So it seems to me it's a different semantics for which it's better to
> have an appropriate side information.

AVVideoEncParams describes the block-level parameters of an encoded
bitstream. Yes, it is currently used to export coded bitstream
information from decoders. But there is no restriction or assumption in
the API itself that it will be used in this way and it can just as well
describe the information you want a coded bitstream to have.

-- 
Anton Khirnov