[FFmpeg-devel] [PATCH v2 3/3] avformat/movenc: add support for fragmented TTML muxing

Jan Ekström jeebjp at gmail.com
Fri Dec 8 19:17:36 EET 2023


On Fri, Dec 8, 2023 at 7:09 PM Jan Ekström <jeebjp at gmail.com> wrote:
>
> On Fri, Dec 8, 2023 at 7:05 PM Jan Ekström <jeebjp at gmail.com> wrote:
> >
> > On Fri, Dec 8, 2023 at 5:37 PM Dennis Mungai <dmngaie at gmail.com> wrote:
> > >
> > > On Fri, 8 Dec 2023 at 15:14, Andreas Rheinhardt <andreas.rheinhardt at outlook.com> wrote:
> > >
> > > > Jan Ekström:
> > > > > From: Jan Ekström <jan.ekstrom at 24i.com>
> > > > >
> > > > > Attempts to base the fragmentation timing on other streams
> > > > > as most receivers expect media fragments to be more or less
> > > > > aligned.
> > > > >
> > > > > Currently does not support fragmentation on subtitle track
> > > > > only, as the subtitle packet queue timings would have to be
> > > > > checked in addition to the current fragmentation timing logic.
> > > > >
> > > > > Signed-off-by: Jan Ekström <jan.ekstrom at 24i.com>
> > > > > ---
> > > > >  libavformat/movenc.c                        |    9 -
> > > > >  libavformat/movenc_ttml.c                   |  157 ++-
> > > > >  tests/fate/mov.mak                          |   21 +
> > > > >  tests/ref/fate/mov-mp4-fragmented-ttml-dfxp | 1197 +++++++++++++++++++
> > > > >  tests/ref/fate/mov-mp4-fragmented-ttml-stpp | 1197 +++++++++++++++++++
> > > >
> > > > Am I the only one who thinks that this is a bit excessive?
> > > >
> > > > >  5 files changed, 2568 insertions(+), 13 deletions(-)
> > > > >  create mode 100644 tests/ref/fate/mov-mp4-fragmented-ttml-dfxp
> > > > >  create mode 100644 tests/ref/fate/mov-mp4-fragmented-ttml-stpp
> > > > >
> > > > > diff --git a/tests/fate/mov.mak b/tests/fate/mov.mak
> > > > > index 6cb493ceab..5c44299196 100644
> > > > > --- a/tests/fate/mov.mak
> > > > > +++ b/tests/fate/mov.mak
> > > > > @@ -143,6 +143,27 @@ FATE_MOV_FFMPEG_FFPROBE-$(call TRANSCODE, TTML SUBRIP, MP4 MOV, SRT_DEMUXER TTML
> > > > >  fate-mov-mp4-ttml-stpp: CMD = transcode srt $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt mp4 "-map 0:s -c:s ttml -time_base:s 1:1000" "-map 0 -c copy" "-of json -show_entries packet:stream=index,codec_type,codec_tag_string,codec_tag,codec_name,time_base,start_time,duration_ts,duration,nb_frames,nb_read_packets:stream_tags"
> > > > >  fate-mov-mp4-ttml-dfxp: CMD = transcode srt $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt mp4 "-map 0:s -c:s ttml -time_base:s 1:1000 -tag:s dfxp -strict unofficial" "-map 0 -c copy" "-of json -show_entries packet:stream=index,codec_type,codec_tag_string,codec_tag,codec_name,time_base,start_time,duration_ts,duration,nb_frames,nb_read_packets:stream_tags"
> > > > >
> > > > > +FATE_MOV_FFMPEG_FFPROBE-$(call TRANSCODE, TTML SUBRIP, MP4 MOV, LAVFI_INDEV SMPTEHDBARS_FILTER SRT_DEMUXER MPEG2VIDEO_ENCODER TTML_MUXER RAWVIDEO_MUXER) += fate-mov-mp4-fragmented-ttml-stpp
> > > > > +fate-mov-mp4-fragmented-ttml-stpp: CMD = transcode srt $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt mp4 \
> > > > > +  "-map 1:v -map 0:s \
> > > > > +   -c:v mpeg2video -b:v 2M -g 48 -sc_threshold 1000000000 \
> > > > > +   -c:s ttml -time_base:s 1:1000 \
> > > > > +   -movflags +cmaf" \
> > > > > +  "-map 0:s -c copy" \
> > > > > +  "-select_streams s -of csv -show_packets -show_data_hash crc32" \
> > > > > +  "-f lavfi -i smptehdbars=duration=70:size=320x180:rate=24000/1001,format=yuv420p" \
> > > > > +  "" "" "rawvideo"
> > > >
> > > > Would it speed the test up if you used smaller dimensions or a smaller
> > > > bitrate?
> > > > Anyway, you probably want the "data" output format instead of rawvideo.
> > > >
> > > > > +
> > > > > +FATE_MOV_FFMPEG_FFPROBE-$(call TRANSCODE, TTML SUBRIP, ISMV MOV, LAVFI_INDEV SMPTEHDBARS_FILTER SRT_DEMUXER MPEG2VIDEO_ENCODER TTML_MUXER RAWVIDEO_MUXER) += fate-mov-mp4-fragmented-ttml-dfxp
> > > > > +fate-mov-mp4-fragmented-ttml-dfxp: CMD = transcode srt $(TARGET_SAMPLES)/sub/SubRip_capability_tester.srt ismv \
> > > > > +  "-map 1:v -map 0:s \
> > > > > +   -c:v mpeg2video -b:v 2M -g 48 -sc_threshold 1000000000 \
> > > > > +   -c:s ttml -tag:s dfxp -time_base:s 1:1000" \
> > > > > +  "-map 0:s -c copy" \
> > > > > +  "-select_streams s -of csv -show_packets -show_data_hash crc32" \
> > > > > +  "-f lavfi -i smptehdbars=duration=70:size=320x180:rate=24000/1001,format=yuv420p" \
> > > > > +  "" "" "rawvideo"
> > > > > +
> > > > >  # FIXME: Uncomment these two tests once the test files are uploaded to the fate
> > > > >  # server.
> > > > >  # avif demuxing - still image with 1 item.
> > > >
> > >
> > > Hello Jan,
> > >
> > > Taking this note into account, and I quote:
> > >
> > >  " Currently does not support fragmentation on subtitle track only, as the
> > > subtitle packet queue timings would have to be checked in addition to the
> > > current fragmentation timing logic."
> > >
> > > Wouldn't it be ideal to hold off on merging this until after support
> > > for fragmentation in subtitle-only tracks is complete, at the very
> > > least? That way, the fate tests for such a workflow (case in point,
> > > CMAF) would be feature complete.
> > > The typical workloads that depend on such functionality, such as
> > > ingesting CMFT, require that a subtitle-only stream be present in
> > > such a representation.
> > >
> > > See:
> > > 1.
> > > https://www.unified-streaming.com/blog/cmaf-conformance-is-this-really-cmaf
> > > 2. https://www.unified-streaming.com/blog/live-media-ingest-cmaf
> >
> > It would be ideal, but there are a few points to keep in mind:
> >
> > 1. For such streaming, you are generally required to keep your
> > fragmentation synchronized with other media (either video or audio).
> > If subtitle-only fragmentation were implemented and you had a
> > TTML-only mux, you could set something like time-based fragmentation
> > (e.g. your expected GOP duration), but nothing would ensure you are
> > fragmenting in step with those other tracks.
> > 2. Subtitle-only fragmentation is already possible for the API client
> > with this implementation, which for a one-mux = one-track output is
> > the only way to ensure you stay in sync with those other tracks, as
> > the muxer has no idea where they are going (they would be in other
> > AVFormatContexts).
> > 3. I have tested this code against this vendor, with the subtitles
> > muxed together with a non-sparse track in order to keep the
> > fragmentation in sync.
> >
> > In other words, given how CMAF is defined, I would say you are
> > supposed to control all muxes from a central point, since
> > synchronization is required, and that is already possible with these
> > changes. I can definitely implement time-based fragmentation for
> > TTML-only muxes, but I think there are reasons to consider that a
> > lower priority.
>
> Or I guess another way would be to make sure the "fragment on each
> packet" option's logic works with a TTML-only mux: instead of feeding
> the packet to the subtitle queue, you would just fragment and output
> with each fed TTML packet.

Argh, I keep remembering things as I respond. Sorry for this.

All this stuff applies specifically to *paragraph*-based TTML packets.
If you feed the MP4 writer packets that are already full TTML documents
(such as with codec copy), fragmentation should already work like with
any other track, even right now.

This is because actual TTML document packets do not utilize the
subtitle queue as they are already complete and timed.

Jan

