[FFmpeg-devel] [PATCH v16 01/16] global: Prepare AVFrame for subtitle handling

Sat Nov 27 04:57:29 EET 2021

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Andreas
> Rheinhardt
> Sent: Friday, November 26, 2021 11:35 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v16 01/16] global: Prepare AVFrame for
> subtitle handling

Hi Andreas,

thanks for the detailed review. There was hardly anything to object or disagree.
Everything I don't respond to is implemented as suggested.

> As has already been said by others, this should be split: One patch for
> the actual new libavutil additions, one patch for the lavc decoding API,
> one patch for the encoding API, one patch for every user that is made to
> use the new APIs.

By keeping the AVSubtitle definition in its original location, it was 
possible to split this as suggested on IRC.
Thanks.

> This is not how you deprecate something: You have to add a @deprecated
> doxygen comment as well as the attribute_deprecated to make some noise.

I didn’t want the noise before things get serious ;-)

> 
> Actually, you should deprecate the whole AVSubtitle API.

Done. Do we deprecate structs as well?

Also, there's a function avcodec_get_subtitle_rect_class().

I have no idea whether it ever had any practical use. A similar function
above (avcodec_get_frame_class) is deprecated.

> > +/**
> > + * Return subtitle format from a codec descriptor
> > + *
> > + * @param codec_descriptor codec descriptor
> > + * @return                 the subtitle type (e.g. bitmap, text)
> > + */
> > +enum AVSubtitleType av_get_subtitle_format_from_codecdesc(const
> AVCodecDescriptor *codec_descriptor);
> 
> Do we need this function? It seems too trivial and as Anton has pointed
> out is flawed. And anyway, this would belong into codec_desc.h.

I had replied to Anton why it's not flawed. Moved it to codec_desc and
named it as Anton had suggested.

> >  int attribute_align_arg avcodec_send_packet(AVCodecContext *avctx, const
> AVPacket *avpkt)
> >  {
> >      AVCodecInternal *avci = avctx->internal;
> > @@ -590,6 +618,9 @@ int attribute_align_arg
> avcodec_send_packet(AVCodecContext *avctx, const AVPacke
> >      if (avpkt && !avpkt->size && avpkt->data)
> >          return AVERROR(EINVAL);
> >
> > +    if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE)
> > +        return decode_subtitle_shim(avctx, avci->buffer_frame, avpkt);
> 
> 1. This is missing a check whether buffer_frame is empty or not; in the
> latter case, you need to return AVERROR(EAGAIN); instead
> decode_subtitle_shim() destroys what is already there.
> (Said check is currently offloaded to the BSF API in the audio/video
> codepath. This is actually a violation of the BSF API.)

Added that check.

> 2. Flushing is not really implemented well:
> a) It is legal to call avcodec_send_packet() with avpkt == NULL to
> indicate EOF. If I am not mistaken, decode_subtitle2_priv() will just
> crash in this case.

Like all subtitle decoders would when supplying a NULL packet.
What should I return when a null packet is supplied?
EAGAIN or EOF?

> b) avcodec_receive_frame() only returns what is in buffer_frame; it
> never calls decode_subtitle2_priv() or the underlying decode function at
> all. This basically presumes that a subtitle decoder can only return one
> subtitle after flushing. I don't know whether this is true for our
> decoders with the delay cap.

The only one I could see with that flag is cc_dec, which is a special case
anyway as it doesn't work on regular streams.
(even this one would crash when providing a NULL packet)

> c) avcodec_receive_frame() seems to never return AVERROR_EOF after
> flushing, but always AVERROR(EAGAIN) (if the delay cap is set, the first
> call to avcodec_receive_frame() after flushing can also result in a
> frame being returned). Yet this makes no sense and is IMO an API
> violation on lavc's part.
> (While we have subtitle decoders with the delay cap, I don't know
> whether this is actually tested by fate and whether all code actually
> flushes subtitle decoders at all.)

I don't think so.

> > diff --git a/libavfilter/vf_subtitles.c b/libavfilter/vf_subtitles.c
> > index 377160c72b..6b54cf669b 100644
> > --- a/libavfilter/vf_subtitles.c
> > +++ b/libavfilter/vf_subtitles.c
> > @@ -35,14 +35,12 @@
> >  # include "libavformat/avformat.h"
> >  #endif
> >  #include "libavutil/avstring.h"
> > -#include "libavutil/imgutils.h"
> >  #include "libavutil/opt.h"
> >  #include "libavutil/parseutils.h"
> >  #include "drawutils.h"
> >  #include "avfilter.h"
> >  #include "internal.h"
> >  #include "formats.h"
> > -#include "video.h"
> >
> >  typedef struct AssContext {
> >      const AVClass *class;
> > @@ -292,6 +290,29 @@ static int attachment_is_font(AVStream * st)
> >      return 0;
> >  }
> >
> > +static int decode(AVCodecContext *avctx, AVFrame *frame, int *got_frame,
> AVPacket *pkt)
> > +{
> > +    int ret;
> > +
> > +    *got_frame = 0;
> 
> You don't really need this: You could just return 0 if no frame was
> returned, 1 if a frame was returned and < 0 on (actual, not EAGAIN) errors.
> 
> > +
> > +    if (pkt) {
> > +        ret = avcodec_send_packet(avctx, pkt);
> > +        // In particular, we don't expect AVERROR(EAGAIN), because we read
> all
> > +        // decoded frames with avcodec_receive_frame() until done.
> 
> Where do you do this? For this one would need to call
> avcodec_receive_frame() in a loop until it returns EAGAIN, but you don't
> do this.

This decode() function is copied from ffmpeg.c

> > +static int get_subtitle_buffer(AVFrame *frame)
> > +{
> > +    // Buffers in AVFrame->buf[] are not used in case of subtitle frames.
> > +    // To accomodate with existing code, checking ->buf[0] to determine
> > +    // whether a frame is ref-counted or has data, we're adding a 1-byte
> > +    // buffer here, which marks the subtitle frame to contain data.
> 
> This is a terrible hack that might be necessary for now. But I don't see
> anything

I'm not sure about the second half sentence, but what I can say is that I 
had taken the time to try and replace the checks for ->buf[0] to a frame
field, but it turned out that this change would be so big that it cannot
be part of the subtitle patchset.
(also talked about this with Hendrik a while ago)

> > +    for (i = 0; i < frame->num_subtitle_areas; i++) {
> > +        area = frame->subtitle_areas[i];
> > +        if (area) {
> 
> Who sets an incorrect num_subtitle_areas?

Maybe code that hasn't gotten reviewed by you ;-)

> > +    /**
> > +     * Display start time, relative to packet pts, in ms.
> > +     */
> > +    uint32_t subtitle_start_time;
> > +
> > +    /**
> > +     * Display end time, relative to packet pts, in ms.
> > +     */
> > +    uint32_t subtitle_end_time;
> 
> Hardcoding a millisecond timestamp precision into the API doesn't seem
> good. As does using 32bit.

Our internal text subtitle format is defined to be ASS and those
variables are reflecting just that.

> > +/**
> > + * Copies subtitle data from AVFrame to AVSubtitle.
> > + *
> > + * @deprecated This is a compatibility method for interoperability with
> > + * the legacy subtitle API.
> > + */
> > +void av_frame_get_subtitle(AVSubtitle *sub, AVFrame *frame);
> 
> 1. Missing const.
> 2. Wrong return type.
> 3. I don't see a need for this compatibility at all. As explained below,
> the only user that needs something like this is libavcodec and then it
> should live in libavcodec.
> The last two points apply to both av_frame_get_subtitle and
> av_frame_put_subtitle. Notice that the only use of any of these
> functions outside of libavcodec is remove later again (patch #8 adds an
> av_frame_get_subtitle() in ffmpeg.c, #16 removes it again) which shows
> that such a compatibility stuff is not needed outside of libavcodec.

Moved to avcodec as ff_ functions.

> > +    uint32_t pal[256]; ///< RGBA palette for the bitmap.
> > +
> > +    char *text;        ///< 0-terminated plain UTF-8 text
> > +    char *ass;         ///< 0-terminated ASS/SSA compatible event line.
> > +
> > +} AVSubtitleArea;
> 
> I presume that sizeof(AVSubtitleArea) is not supposed to be part of the
> public API/ABI (after all, you used an array of pointers to
> AVSubtitleArea in AVFrame). This needs to be documented.

Makes sense, but I'm not sure how and where?

> > +/**
> > + * Return a subtitle format corresponding to name, or AV_SUBTITLE_FMT_NONE
> > + * on error.
> > + *
> > + * @param name Subtitle format name.
> > + */
> > +enum AVSubtitleType av_get_subtitle_fmt(const char *name);
> 
> Do we need this? Given that there are many possible ways to name the
> subtitles, it doesn't seem useful to me. The only way to be sure to use
> the "authoritative" name for a subtitle format is by
> av_get_subtitle_fmt_name(). But then one doesn't need
> av_get_subtitle_fmt() at all.

I've tried to implement everything analog to the functionality we have
for audio and video. This corresponds to av_get_pix_fmt().

> > + */
> > +int av_subtitle_area2area(AVSubtitleArea *dst, const AVSubtitleArea *src,
> int copy_data);
> 
> This documentation does not mention that dst should be "blank". But this
> actually poses a problem: How does the user know that an AVSubtitleArea
> is blank given that ? For AVFrames the answer is this: An AVFrame is
> blank if it comes directly from av_frame_alloc() or av_frame_unref().
> But there is no equivalent for these functions here.
> Furthermore, is this API supposed to be a standalone API or is
> AVSubtitleArea something that is always supposed to be part of an
> AVFrame? The usage in this patchset suggests the latter:
> av_subtitle_area2area() is only used in frame.c.

AVSubtitleArea is always supposed to be part of an AVFrame.
I had thought the copy function might be useful outside, but I didn't 
need it in any of the filter.
Made it private in AVFrame now.

> > +/**
> > + * Copy data from @ref AVSubtitleArea to @ref AVSubtitleRect.
> > + *
> > + * @param dst        The target rect (@ref AVSubtitleRect).
> > + * @param src        The source area (@ref AVSubtitleArea).
> > + *
> > + * @return 0 on success.
> > + *
> > + * @deprecated This is a compatibility method for interoperability with
> > + * the legacy subtitle API.
> > + */
> > +int av_subtitle_area2rect(AVSubtitleRect *dst, const AVSubtitleArea *src);
> > +
> > +/**
> > + * Copy data from @ref AVSubtitleRect to @ref AVSubtitleArea.
> > + *
> > + * @param dst        The source area (@ref AVSubtitleArea).
> > + * @param src        The target rect (@ref AVSubtitleRect).
> > + *
> > + * @return 0 on success.
> > + *
> > + * @deprecated This is a compatibility method for interoperability with
> > + * the legacy subtitle API.
> > + */
> > +int av_subtitle_rect2area(AVSubtitleArea *dst, const AVSubtitleRect *src);
> 
> Do we need compatibility with the legacy API at all? I only see two
> usecases for this:
> 
> 1. One has a huge project using AVSubtitles that uses AVSubtitle so
> pervasively that one can't transition to the new API in one step. But I
> doubt that that's a thing.
> 2. One is forced to provide compatibility with the legacy API while also
> providing an API based upon the new API. This is of course the situation
> with libavcodec, which for this reason needs such compatibility
> functions (in libavcodec!). But I don't see a second user with the same
> needs, so I don't see why these functions (which would actually need to
> be declared as deprecated from the very beginning if public) should be
> public.

Made those private in avcodec.

Thanks again,
softworkz