[FFmpeg-devel] J2K in HEIF was: [RFC]avformat: introduce AVStreamGroup

Pierre-Anthony Lemieux pal at sandflow.com
Wed Sep 13 17:33:04 EEST 2023


On Wed, Sep 13, 2023 at 2:35 AM Tomas Härdin <git at haerdin.se> wrote:
>
> ons 2023-09-06 klockan 16:16 -0300 skrev James Almer:
> > On 9/6/2023 2:53 PM, Tomas Härdin wrote:
> > > ons 2023-09-06 klockan 11:38 -0300 skrev James Almer:
> > > > Signed-off-by: James Almer <jamrial at gmail.com>
> > > > ---
> > > > This is an initial proof of concept for AVStream groups, something
> > > > that's needed for quite a few existing and upcoming formats that
> > > > lavf has no way to currently export. Said formats define a single
> > > > video or audio stream composed by merging several individually
> > > > multiplexed streams within a media file.
> > > > This is the case of HEIF, a format defining a tiled image where each
> > > > tile is a separate image (either hevc, av1, etc) all of which need
> > > > to be decoded individually and then stitched together for
> > > > presentation using container level information;
> > >
> > > I remember this blocking HEIF as a GSoC project. Honestly the way
> > > that format is designed is immensely horrible.
> > >
> > > > MPEG-TS programs, currently exported as
> > > > AVProgram, which this new general purpose API would replace.
> > >
> > > I can foresee this being a nuisance for users accustomed to AVProgram.
> > > Also this feature borders on NLE territory. Not necessarily a bad
> > > thing, but FFmpeg is overall poorly architected for NLE stuff. I
> > > believe I raised this issue back when lavfi was proposed, it being
> > > wholly unsuitable for NLE work.
> > >
> > >
> > > > +typedef struct AVStreamGroup {
> > > > +    /**
> > > > +     * A class for @ref avoptions. Set on stream creation.
> > > > +     */
> > > > +    const AVClass *av_class;
> > > > +
> > > > +    /**
> > > > +     * Group index in AVFormatContext.
> > > > +     */
> > > > +    int index;
> > > > +
> > > > +    /**
> > > > +     * Format-specific group ID.
> > > > +     * decoding: set by libavformat
> > > > +     * encoding: set by the user, replaced by libavformat if left unset
> > > > +     */
> > > > +    int id;
> > > > +
> > > > +    /**
> > > > +     * Codec parameters associated with this stream group. Allocated and freed
> > > > +     * by libavformat in avformat_new_stream_group() and avformat_free_context()
> > > > +     * respectively.
> > > > +     *
> > > > +     * - demuxing: filled by libavformat on stream group creation or in
> > > > +     *             avformat_find_stream_info()
> > > > +     * - muxing: filled by the caller before avformat_write_header()
> > > > +     */
> > > > +    AVCodecParameters *codecpar;
> > > > +
> > > > +    void *priv_data;
> > > > +
> > > > +    /**
> > > > +     * Number of elements in AVStreamGroup.stream_index.
> > > > +     *
> > > > +     * Set by av_stream_group_add_stream() and av_stream_group_new_stream(), must not
> > > > +     * be modified by any other code.
> > > > +     */
> > > > +    int nb_stream_indexes;
> > > > +
> > > > +    /**
> > > > +     * A list of indexes of streams in the group. New entries are created with
> > > > +     * av_stream_group_add_stream() and av_stream_group_new_stream().
> > > > +     *
> > > > +     * - demuxing: entries are created by libavformat in avformat_open_input().
> > > > +     *             If AVFMTCTX_NOHEADER is set in ctx_flags, then new entries may also
> > > > +     *             appear in av_read_frame().
> > > > +     * - muxing: entries are created by the user before avformat_write_header().
> > > > +     *
> > > > +     * Freed by libavformat in avformat_free_context().
> > > > +     */
> > > > +    int *stream_index;
> > > > +} AVStreamGroup;
> > >
> > > I see no provisions for attaching metadata, for example HEIF stitching.
> > > Putting it in codecpar seems wrong, since it is container-level
> > > metadata. We could just have a HEIF-specific struct as container
> > > metadata.
> >
> > The doxy for AVCodecParameters says "This struct describes the
> > properties of an encoded stream.", so it's not about container level
> > props.
>
> It *is* container level props. The underlying codecs have no concept of
> this kind of stitching. The closest you're going to get is tiles in
> JPEG2000, but I doubt HEIF supports JPEG2000.

Just an FYI.

HEIF supports JPEG 2000:

https://www.itu.int/rec/T-REC-T.815/en

One implementation:

https://github.com/strukturag/libheif/pull/874

>
> We might say "well the resulting stream group has resolution so it's
> like a codec" but see below.
>
> > Although codecpar will be used to export the merged/stitched stream
> > props like dimensions and channel layout, maybe you're right about the
> > metadata because there would be a clash between actual
> > HEVC/Opus/AAC/AV1 extradata and the HEIF/IAMF/etc specific info if both
> > use codecpar.extradata, even if one will be in AVStream and the other in
> > AVStreamGroup.
>
> Yes, pretty much. But it's more that codecpar is pressed into service
> where it probably doesn't belong. It might be more appropriate to call
> these "essence parameters". I'm going to stick my neck out further and
> say that picture and sound essence should be handled with different
> structs, not smushed together into one struct like AVCodecParameters.
>
> /Tomas
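
To make the "container-level" point quoted above concrete, here is a
minimal sketch of a stitching description kept apart from codecpar.
Everything in it is an assumption for illustration: TileGrid,
tile_grid_offset() and tile_grid_visible() are hypothetical names, not
part of the proposed AVStreamGroup API or of libavformat, and the
row-major tile order with cropping to the output size is an assumed
layout modelled loosely on HEIF grids.

```c
/* Hypothetical container-level tile grid; NOT an FFmpeg struct. */
typedef struct TileGrid {
    int rows, cols;                   /* number of tile rows and columns   */
    int tile_width, tile_height;      /* every tile is assumed same-sized  */
    int output_width, output_height;  /* canvas size after cropping        */
} TileGrid;

/*
 * Top-left position of tile `idx` on the canvas, assuming row-major order
 * (left to right, then top to bottom).
 * Returns 0 on success, -1 on invalid grid or out-of-range index.
 */
int tile_grid_offset(const TileGrid *g, int idx, int *x, int *y)
{
    if (!g || g->rows <= 0 || g->cols <= 0 ||
        idx < 0 || idx >= g->rows * g->cols)
        return -1;
    *x = (idx % g->cols) * g->tile_width;
    *y = (idx / g->cols) * g->tile_height;
    return 0;
}

/*
 * Visible size of tile `idx` once the canvas is cropped to the output
 * dimensions. Tiles on the right and bottom edges may be partly hidden.
 * Returns 0 on success, -1 on invalid grid or out-of-range index.
 */
int tile_grid_visible(const TileGrid *g, int idx, int *w, int *h)
{
    int x, y;

    if (tile_grid_offset(g, idx, &x, &y) < 0)
        return -1;
    *w = g->output_width - x;
    *h = g->output_height - y;
    if (*w > g->tile_width)
        *w = g->tile_width;
    if (*h > g->tile_height)
        *h = g->tile_height;
    if (*w < 0)
        *w = 0;
    if (*h < 0)
        *h = 0;
    return 0;
}
```

None of these fields are known to the tile codec itself, which is the
argument for carrying them next to the group rather than in extradata.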

