[FFmpeg-devel] [PATCH] avformat/mov: Fix decoding fragmented MP4 with multiple sample entries and empty stsc

Dimitry Andric dimitry at unified-streaming.com
Wed Jul 2 15:05:05 EEST 2025


On 31 May 2025, at 20:16, Dimitry Andric <dimitry at unified-streaming.com> wrote:
> 
> On 9 May 2025, at 00:15, James Almer <jamrial at gmail.com> wrote:
>> 
>> On 5/8/2025 7:14 PM, Dimitry Andric wrote:
>>> On 28 Apr 2025, at 13:00, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>> 
>>>> On 19 Apr 2025, at 16:27, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>> 
>>>>> On 10 Apr 2025, at 11:03, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>>> 
>>>>>> On 3 Apr 2025, at 22:02, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>>>> 
>>>>>>> When decoding fragmented MP4 files that have an empty stsc box, and
>>>>>>> instead contain sample description indexes in their tfhd boxes, the mov
>>>>>>> demuxer does not notify the decoder whenever the current sample
>>>>>>> description index changes. If the SPS or PPS changed sufficiently, this
>>>>>>> can lead to unexpected decoding errors.
>>>>>>> 
>>>>>>> To fix this, in mov_finalize_packet(), when stsc_data is not available,
>>>>>>> use get_frag_stream_info_from_pkt() to get at the current fragment
>>>>>>> stream info, and retrieve the current sample description index from
>>>>>>> there. Then use that index in a similar manner as the stsc case.
>>>>>>> 
>>>>>>> Signed-off-by: Dimitry Andric <dimitry at unified-streaming.com>
>>>>>>> ---
>>>>>>> libavformat/mov.c | 50 ++++++++++++++++++++++++++++-------------------
>>>>>>> 1 file changed, 30 insertions(+), 20 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>>>>> index 452690090c..ead89192f4 100644
>>>>>>> --- a/libavformat/mov.c
>>>>>>> +++ b/libavformat/mov.c
>>>>>>> @@ -10756,25 +10756,29 @@ static int mov_switch_root(AVFormatContext *s, int64_t target, int index)
>>>>>>> return 1;
>>>>>>> }
>>>>>>> 
>>>>>>> -static int mov_change_extradata(AVStream *st, AVPacket *pkt)
>>>>>>> +static int mov_change_extradata(AVStream *st, AVPacket *pkt, int stsd_id)
>>>>>>> {
>>>>>>> MOVStreamContext *sc = st->priv_data;
>>>>>>> uint8_t *side, *extradata;
>>>>>>> int extradata_size;
>>>>>>> 
>>>>>>> -    /* Save the current index. */
>>>>>>> -    sc->last_stsd_index = sc->stsc_data[sc->stsc_index].id - 1;
>>>>>>> +    if (stsd_id > 0 &&
>>>>>>> +        stsd_id - 1 < sc->stsd_count &&
>>>>>>> +        stsd_id - 1 != sc->last_stsd_index) {
>>>>>>> +        /* Save the current index. */
>>>>>>> +        sc->last_stsd_index = stsd_id - 1;
>>>>>>> 
>>>>>>> -    /* Notify the decoder that extradata changed. */
>>>>>>> -    extradata_size = sc->extradata_size[sc->last_stsd_index];
>>>>>>> -    extradata = sc->extradata[sc->last_stsd_index];
>>>>>>> -    if (st->discard != AVDISCARD_ALL && extradata_size > 0 && extradata) {
>>>>>>> -        side = av_packet_new_side_data(pkt,
>>>>>>> -                                       AV_PKT_DATA_NEW_EXTRADATA,
>>>>>>> -                                       extradata_size);
>>>>>>> -        if (!side)
>>>>>>> -            return AVERROR(ENOMEM);
>>>>>>> -        memcpy(side, extradata, extradata_size);
>>>>>>> +        /* Notify the decoder that extradata changed. */
>>>>>>> +        extradata_size = sc->extradata_size[sc->last_stsd_index];
>>>>>>> +        extradata = sc->extradata[sc->last_stsd_index];
>>>>>>> +        if (st->discard != AVDISCARD_ALL && extradata_size > 0 && extradata) {
>>>>>>> +            side = av_packet_new_side_data(pkt,
>>>>>>> +                                           AV_PKT_DATA_NEW_EXTRADATA,
>>>>>>> +                                           extradata_size);
>>>>>>> +            if (!side)
>>>>>>> +                return AVERROR(ENOMEM);
>>>>>>> +            memcpy(side, extradata, extradata_size);
>>>>>>> +        }
>>>>>>> }
>>>>>>> 
>>>>>>> return 0;
>>>>>>> @@ -10893,13 +10897,10 @@ static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *s
>>>>>>> 
>>>>>>> /* Multiple stsd handling. */
>>>>>>> if (sc->stsc_data) {
>>>>>>> -        if (sc->stsc_data[sc->stsc_index].id > 0 &&
>>>>>>> -            sc->stsc_data[sc->stsc_index].id - 1 < sc->stsd_count &&
>>>>>>> -            sc->stsc_data[sc->stsc_index].id - 1 != sc->last_stsd_index) {
>>>>>>> -            int ret = mov_change_extradata(st, pkt);
>>>>>>> -            if (ret < 0)
>>>>>>> -                return ret;
>>>>>>> -        }
>>>>>>> +        int stsd_id = sc->stsc_data[sc->stsc_index].id;
>>>>>>> +        int ret = mov_change_extradata(st, pkt, stsd_id);
>>>>>>> +        if (ret < 0)
>>>>>>> +            return ret;
>>>>>>> 
>>>>>>>     /* Update the stsc index for the next sample */
>>>>>>>     sc->stsc_sample++;
>>>>>>> @@ -10908,6 +10909,15 @@ static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *s
>>>>>>>         sc->stsc_index++;
>>>>>>>         sc->stsc_sample = 0;
>>>>>>>     }
>>>>>>> +    } else {
>>>>>>> +        MOVContext *mov = s->priv_data;
>>>>>>> +        MOVFragmentStreamInfo *frag_stream_info = get_frag_stream_info_from_pkt(&mov->frag_index, pkt, sc->id);
>>>>>>> +        if (frag_stream_info) {
>>>>>>> +            int stsd_id = frag_stream_info->stsd_id;
>>>>>>> +            int ret = mov_change_extradata(st, pkt, stsd_id);
>>>>>>> +            if (ret < 0)
>>>>>>> +                return ret;
>>>>>>> +        }
>>>>>>> }
>>>>>>> 
>>>>>>> return 0;
>>>>>>> -- 
>>>>>>> 2.43.0
>>>>>>> 
>>>>>> 
>>>>>> Any comments on this patch?
>>>>> 
>>>>> Ping :)
>>>> 
>>>> Is there any particular group of persons that "own" the mov muxer?
>>> Another ping.
>> 
>> I'll have a look seeing no one else will.
> 
> To provide some backstory here, I will attempt to explain further what
> this patch is supposed to fix. It is specifically about AVC (or possibly
> HEVC) video that has more than one referenced PPS in the elementary
> stream. (One encoder that sometimes produces this kind of video is x264,
> unless you use the --stitchable option).
> 
> In a MP4 file this can be represented by multiple sample description
> entries in the 'stsd' box, and in a progressive file there is a 'stsc'
> box which defines which samples have which sample description indexes.
> FFmpeg handles these just fine.
> 
> However, in a fragmented MP4 file the 'stsc' box is usually empty, and
> the fragments have a 'tfhd' box with a sample description index field
> instead. Such files can sometimes not be decoded properly by FFmpeg,
> since it does not call mov_change_extradata() whenever the sample
> description index changes, somewhere in the middle of the video. In that
> case, it will either complain about a bad PPS ID, or if the ID matches
> but the PPS contents does not, lots of decoding errors will occur.
> 
> This proposed patch makes it so mov_change_extradata() is called even if
> MovStreamContext's sc_data field is empty, but
> get_frag_stream_info_from_pkt() returns a valid stsd_id in its
> MOVFragmentStreamInfo. For fragmented files, mov_read_tfhd() already
> takes care of reading the stsd_id from the tfhd boxes.

Another ping.

-Dimitry



More information about the ffmpeg-devel mailing list