[FFmpeg-devel] [PATCH] avcodec/bsf: Avoid allocation for AVBSFInternal

Tue Aug 11 12:48:59 EEST 2020

Andreas Rheinhardt (12020-08-11):
> Imagine the context to only contain elements that require a alignment of
> 4 and the internal structure to require an alignment of eight. Then it
> is perfectly possible for &ctx[1] to only have an alignment of four and
> not of eight as internal requires it.

With this, you answered to James' question to me "What is undefined in
that?".

> There are two reasons why I like my approach more than Mark's: It allows

I cannot find what you refer to "Mark's" approach.

> to hide the details of the context<->internal relationship to the place
> where allocations happen (and if one asserted as Nicolas suggested
> (av_assert2(ctx->internal == &((AVBSFCombined *)ctx)->internal), it
> would also need to be visible for the function doing the freeing, but
> even then it can still be confined to one file). After all, the internal
> of several important other structures are in private headers to be used
> by lots of code.

... and therefore, I cannot understand exactly the point you are making
here.

> The second point is another micro-optimization: If one puts the
> AVBSFContext at the beginning of AVBSFInternal, then the offsets of all
> the fields currently existing in AVBSFInternal will increase and the
> encoding of the "read address from pointer + offset" is typically
> shorter if offset is smaller. This is not a problem here as the
> structures here are small, but other structures are bigger. This could
> of course be overcome by putting the AVBSFContext at the end of
> AVBSFInternal; but somehow this feels unnatural, although I could get
> used to it.

In this, you are mistaken.

First, AFAIK, the increase in cost for accessing pointer+offset only
happens at a few specific offsets, probably related to powers of two,
which means the gain, minute as it is, only happens for the few fields
that are pushed over the edge, if there are any.

But second, and most of all, you are neglecting a much more important
optimization. If AVBSFContext is the first field of AVBSFInternal, then
they both have the same address (it is explicitly specified in the spec,
and it has been used for decades by Gtk+ for example), and therefore we
can dispense with the ctx->internal pointer and just use it directly.

We can still have two pointers in the code, one for each type, to make
it more compact and readable. But even if we do, the compiler will know
they are the same.

Note that getting rid of the .internal field reduces the size of the
structure, and is in itself another optimization.

And since we only have one pointer instead of two, it frees a register
for other uses.

So this is an even better solution than your proposal, but it requires
all uses of ->internal to be replaced. I I count correctly, that makes
only four lines to change, and for a trivial change.

Note that all this can be done while keeping the specifics and the
pointer magic hidden in a single file:

    static inline AVBSFInternal *bsf_internal(AVBSFContext *bsf) {
        return (AVBSFInternal *)bsf;
    }

Then the code only needs to write:

    AVBSFInternal *int = bsf_internal(bsf);

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20200811/b31bab70/attachment.sig>