[FFmpeg-devel] [PATCH] avformat/mov: Add support for demuxing still HEIC images
Vignesh Venkat
vigneshv at google.com
Wed Oct 4 07:19:58 EEST 2023
On Tue, Oct 3, 2023 at 7:35 PM Leo Izen <leo.izen at gmail.com> wrote:
>
> On 9/26/23 13:37, Vignesh Venkatasubramanian via ffmpeg-devel wrote:
> > They are similar to AVIF images (both use the HEIF container).
> > The only additional work needed is to parse the hvcC box and put
> > it in the extradata.
> >
> > With this patch applied, ffmpeg (when built with an HEVC decoder)
> > is able to decode the files in
> > https://github.com/nokiatech/heif/tree/gh-pages/content/images
> >
> > Partially fixes trac ticket #6521.
> >
> > Signed-off-by: Vignesh Venkatasubramanian <vigneshv at google.com>
> > ---
> > libavformat/isom.h | 2 ++
> > libavformat/mov.c | 38 +++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 39 insertions(+), 1 deletion(-)
> >
> > diff --git a/libavformat/isom.h b/libavformat/isom.h
> > index 3d375d7a46..b30b9da65e 100644
> > --- a/libavformat/isom.h
> > +++ b/libavformat/isom.h
> > @@ -327,6 +327,8 @@ typedef struct MOVContext {
> > int64_t extent_offset;
> > } *avif_info;
> > int avif_info_size;
> > + int64_t hvcC_offset;
> > + int hvcC_size;
> > int interleaved_read;
> > } MOVContext;
> >
> > diff --git a/libavformat/mov.c b/libavformat/mov.c
> > index 1996e0028c..cec9cb5fe1 100644
> > --- a/libavformat/mov.c
> > +++ b/libavformat/mov.c
> > @@ -1218,7 +1218,8 @@ static int mov_read_ftyp(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> > c->isom = 1;
> > av_log(c->fc, AV_LOG_DEBUG, "ISO: File Type Major Brand: %.4s\n",(char *)&type);
> > av_dict_set(&c->fc->metadata, "major_brand", type, 0);
> > - c->is_still_picture_avif = !strncmp(type, "avif", 4);
> > + c->is_still_picture_avif = !strncmp(type, "avif", 4) ||
> > + !strncmp(type, "mif1", 4);
>
> This appears to be an unrelated change. Is it?
>
No, "mif1" is the major brand reported by HEIC files. So this is
needed to detect HEIC files.
> > minor_ver = avio_rb32(pb); /* minor version */
> > av_dict_set_int(&c->fc->metadata, "minor_version", minor_ver, 0);
> >
> > @@ -4911,6 +4912,16 @@ static int avif_add_stream(MOVContext *c, int item_id)
> > st->priv_data = sc;
> > st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
> > st->codecpar->codec_id = AV_CODEC_ID_AV1;
> > + if (c->hvcC_offset >= 0) {
> > + int ret;
> > + int64_t pos = avio_tell(c->fc->pb);
> > + st->codecpar->codec_id = AV_CODEC_ID_HEVC;
> > + avio_seek(c->fc->pb, c->hvcC_offset, SEEK_SET);
> > + ret = ff_get_extradata(c->fc, st->codecpar, c->fc->pb, c->hvcC_size);
> > + if (ret < 0)
> > + return ret;
> > + avio_seek(c->fc->pb, pos, SEEK_SET);
> > + }
>
> Will this fail on non-seekable input? If it does, do we care?
>
I tried this with a pipe as an input (cat file.heic | ffmpeg -i - ...)
and it worked. I also see plenty of unconditional avio_seek calls in
this file. So I am not sure how it works.
Anyways, I added a check to fail here if the seek does not take us to
the desired position.
> > sc->ffindex = st->index;
> > c->trak_index = st->index;
> > st->avg_frame_rate.num = st->avg_frame_rate.den = 1;
> > @@ -4953,6 +4964,8 @@ static int avif_add_stream(MOVContext *c, int item_id)
> >
> > static int mov_read_meta(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> > {
> > + c->hvcC_offset = -1;
> > + c->hvcC_size = 0;
> > while (atom.size > 8) {
> > uint32_t tag;
> > if (avio_feof(pb))
> > @@ -7826,6 +7839,28 @@ static int mov_read_iloc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> > return atom.size;
> > }
> >
> > +static int mov_read_iprp(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> > +{
> > + int size = avio_rb32(pb);
> > + if (avio_rl32(pb) != MKTAG('i','p','c','o'))
> > + return AVERROR_INVALIDDATA;
>
> Is there a reason you use ipco here instead of iprp? I'm not saying this
> is wrong, but I am confused while you're doing it this way.
>
The control flow will jump into this function only when an "iprp" box
has been seen (pb points to the start of the contents of the iprp
box). "ipco" is a sub-box of the "iprp" box that we care about. We
cannot register sub-boxes to mov_read_default without registering the
containing box, as the containing box will be skipped if it is
unknown. Some other sub-box reading is already implemented this way
(mov_read_meta for example).
> > + size -= 8;
> > + while (size > 0) {
> > + int sub_size, sub_type;
> > + sub_size = avio_rb32(pb);
> > + sub_type = avio_rl32(pb);
> > + sub_size -= 8;
> > + size -= sub_size + 8;
> > + if (sub_type == MKTAG('h','v','c','C')) {
> > + c->hvcC_offset = avio_tell(pb);
> > + c->hvcC_size = sub_size;
> > + break;
> > + }
>
> Are these permitted to use extended-size tags? i.e. size = 1, followed
> by a big-endian 64-bit size, then followed by the tag?
>
MP4 "type" fields are always 32-bits. So they cannot be of any other size.
Thank you for the review!
> > + avio_skip(pb, sub_size);
> > + }
> > + return atom.size;
> > +}
> > +
> > static const MOVParseTableEntry mov_default_parse_table[] = {
> > { MKTAG('A','C','L','R'), mov_read_aclr },
> > { MKTAG('A','P','R','G'), mov_read_avid },
> > @@ -7933,6 +7968,7 @@ static const MOVParseTableEntry mov_default_parse_table[] = {
> > { MKTAG('p','c','m','C'), mov_read_pcmc }, /* PCM configuration box */
> > { MKTAG('p','i','t','m'), mov_read_pitm },
> > { MKTAG('e','v','c','C'), mov_read_glbl },
> > +{ MKTAG('i','p','r','p'), mov_read_iprp },
> > { 0, NULL }
> > };
> >
>
> - Leo Izen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
--
Vignesh
More information about the ffmpeg-devel
mailing list