[FFmpeg-devel] [PATCH v5 00/25] Subtitle Filtering 2022

Soft Works softworkz at hotmail.com
Wed Aug 31 07:02:11 EEST 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Anton Khirnov
> Sent: Wednesday, August 31, 2022 3:40 AM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v5 00/25] Subtitle Filtering 2022
> 
> Quoting Soft Works (2022-08-27 00:48:01)
> > 2. "There's no reason why this cannot be handled using the buffer
> >     and data fields"
> >
> > I had explained the reasons and in conversation on IRC, Lynne was
> > no longer insisting on this AFAIR.
> 
> I did not see this explanation, can you restate it here?

Sure. Let's look at the AVSubtitleArea struct first:


  typedef struct AVSubtitleArea {

    enum AVSubtitleType type;
    int flags;

    int x;         ///< top left corner  of area.
    int y;         ///< top left corner  of area.
    int w;         ///< width            of area.
    int h;         ///< height           of area.
    int nb_colors; ///< number of colors in bitmap palette (@ref pal).

    /**
     * Buffers and line sizes for the bitmap of this subtitle.
     *
     * @{
     */
    AVBufferRef *buf[AV_NUM_BUFFER_POINTERS];
    int linesize[AV_NUM_BUFFER_POINTERS];
    /**
     * @}
     */

    uint32_t pal[256]; ///< RGBA palette for the bitmap.

    char *text;        ///< 0-terminated plain UTF-8 text
    char *ass;         ///< 0-terminated ASS/SSA compatible event line.

  } AVSubtitleArea;



These structs are stored in AVFrame like this:


    /**
     * Number of items in the @ref subtitle_areas array.
     */
    unsigned num_subtitle_areas;

    /**
     * Array of subtitle areas, may be empty.
     */
    AVSubtitleArea **subtitle_areas;



The question was "why can't this be handled using the buffer
and data fields?" - there are multiple reasons:

1. Area Count

In ASS subtitles it's not uncommon for animations to be defined
through regular ASS subtitle events. This can go as far as dust
particles flying around on the screen, where each particle has its
own subtitle event/line defining where it flies from and to, how
fast, etc.
Not yet, but in a future update, the ass decoder should be improved
so that it no longer emits an individual AVFrame for each event line
but bundles subsequent events together when they share the same start
time and duration.
As a result, an AVFrame can carry dozens or, in extreme cases, even
a hundred AVSubtitleArea items.

The buffer and data fields are fixed-size arrays (currently 8
entries, AV_NUM_DATA_POINTERS). Increasing that to 16 or 32 would
not help: there would still be cases where it does not suffice.
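
To illustrate, here's a minimal sketch (not actual patchset code; it
assumes the patchset's subtitle_areas fields are present and omits
error cleanup for brevity) of attaching an arbitrary number of areas
to a frame - something a fixed array of 8 buffer pointers can never
represent:

    #include <libavutil/error.h>
    #include <libavutil/frame.h>
    #include <libavutil/mem.h>

    /* Sketch only: attach 'count' (unbounded) areas to a subtitle frame.
     * The real patchset has its own allocation helpers. */
    static int attach_areas(AVFrame *frame, unsigned count)
    {
        frame->subtitle_areas = av_calloc(count, sizeof(*frame->subtitle_areas));
        if (!frame->subtitle_areas)
            return AVERROR(ENOMEM);

        for (unsigned i = 0; i < count; i++) {
            frame->subtitle_areas[i] = av_mallocz(sizeof(AVSubtitleArea));
            if (!frame->subtitle_areas[i])
                return AVERROR(ENOMEM);
        }

        frame->num_subtitle_areas = count;
        return 0;
    }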

2. What exactly should be stored in frame->buf[]?

Should we store the AVSubtitleArea structs as AVBuffer there?

Or should we only store data in those buffers? If yes, which
data? The subtitle bitmap, probably. How about the subtitle
text - an area may even have both. And should we also store
the palette in some frame->buf[n]?
If yes, how would we keep track of which data is stored in
which buffer?

Further, there's a documented convention requiring that non-NULL
buffers be contiguous, i.e. there must not be an empty buffer
pointer between two non-empty ones. This would mean re-arranging
the buffers to close any gap whenever some subtitle area data is
removed, zeroed out or converted to another format.
Closing such a gap also requires updating any mapping information
("which buffer belongs to which area?").

That's a lot of tedious work, and every API user (or codec/filter
developer) would need to get it right.
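
Just as a rough sketch of the housekeeping this would force on every
user (hypothetical - the 'area_to_buf' mapping is invented here purely
for illustration and exists nowhere):

    #include <libavutil/buffer.h>
    #include <libavutil/frame.h>

    /* Hypothetical: area data lives in frame->buf[], and the caller keeps
     * a separate area_to_buf[] mapping ("area i uses buffer area_to_buf[i]",
     * -1 = none). Dropping one buffer means compacting buf[] to keep the
     * non-NULL entries contiguous and then fixing up the mapping. */
    static void drop_area_buffer(AVFrame *frame, int *area_to_buf,
                                 unsigned nb_areas, int gap)
    {
        av_buffer_unref(&frame->buf[gap]);

        /* close the gap so non-NULL buffer pointers stay contiguous */
        for (int i = gap; i + 1 < AV_NUM_DATA_POINTERS; i++) {
            frame->buf[i]      = frame->buf[i + 1];
            frame->linesize[i] = frame->linesize[i + 1];
        }
        frame->buf[AV_NUM_DATA_POINTERS - 1]      = NULL;
        frame->linesize[AV_NUM_DATA_POINTERS - 1] = 0;

        /* every area that referenced a later buffer must be re-mapped */
        for (unsigned a = 0; a < nb_areas; a++) {
            if (area_to_buf[a] == gap)
                area_to_buf[a] = -1;
            else if (area_to_buf[a] > gap)
                area_to_buf[a]--;
        }
    }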



3. AVFrame Code Logic Mismatch

A subtitle frame can have 0 to N areas, which means that a frame
without any areas is still valid. Opposed to that, existing code
all over the place in ffmpeg treats an AVFrame as invalid when the
first data buffer is empty, i.e. when frame->buf[0] == NULL.

To accommodate this, subtitle frames always carry a 1-byte dummy
buffer in buf[0].

If we wanted subtitle data to be stored in those buffers, every
developer would need to be aware of the requirement to have that
dummy buffer in the first array element. Before storing actual
data, the dummy buffer would need to be freed, and when the last
real buffer gets deleted, API users would need to know to create
a new dummy buffer.
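
A tiny sketch of that dance (illustration only - the patchset avoids
this altogether by not storing area data in frame->buf[]):

    #include <libavutil/buffer.h>
    #include <libavutil/error.h>
    #include <libavutil/frame.h>

    /* Keep generic AVFrame validity checks happy by making sure buf[0]
     * is never NULL - re-creating the 1-byte dummy whenever the last
     * real buffer has been removed. */
    static int ensure_dummy_buffer(AVFrame *frame)
    {
        if (!frame->buf[0]) {
            frame->buf[0] = av_buffer_alloc(1);
            if (!frame->buf[0])
                return AVERROR(ENOMEM);
        }
        return 0;
    }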


That fixed-size array might be a good fit for image data, but
it's not for subtitle data.


The approach in my patchset is simple, intuitive and everybody
can easily work with it without needing any special knowledge:


    unsigned num_subtitle_areas;
    AVSubtitleArea **subtitle_areas;

IMHO, this is the most suitable pattern for this situation.
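
For example, a consumer (filter or encoder) just iterates - a minimal
sketch, assuming the patchset's AVSubtitleArea layout as shown above:

    #include <libavutil/frame.h>

    static void handle_subtitle_frame(const AVFrame *frame)
    {
        for (unsigned i = 0; i < frame->num_subtitle_areas; i++) {
            const AVSubtitleArea *area = frame->subtitle_areas[i];

            if (area->buf[0]) {
                /* bitmap area: area->buf[0]->data, area->linesize[0],
                 * area->pal, area->x/y/w/h */
            }
            if (area->ass) {
                /* ASS/SSA compatible event line */
            }
            if (area->text) {
                /* plain UTF-8 text */
            }
        }
    }

No dummy buffers, no gap compaction, no mapping table - and a frame
with zero areas is naturally represented by num_subtitle_areas == 0.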



> If you claim the other points were addressed I will look at the
> patches
> again.

Oh cool, that would be great!

Thanks,
softworkz

