[FFmpeg-devel] [PATCH] avformat/mov: Fix decoding fragmented MP4 with multiple sample entries and empty stsc
Dimitry Andric
dimitry at unified-streaming.com
Wed Jul 2 15:05:05 EEST 2025
On 31 May 2025, at 20:16, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>
> On 9 May 2025, at 00:15, James Almer <jamrial at gmail.com> wrote:
>>
>> On 5/8/2025 7:14 PM, Dimitry Andric wrote:
>>> On 28 Apr 2025, at 13:00, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>
>>>> On 19 Apr 2025, at 16:27, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>>
>>>>> On 10 Apr 2025, at 11:03, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>>>
>>>>>> On 3 Apr 2025, at 22:02, Dimitry Andric <dimitry at unified-streaming.com> wrote:
>>>>>>>
>>>>>>> When decoding fragmented MP4 files that have an empty stsc box, and
>>>>>>> instead contain sample description indexes in their tfhd boxes, the mov
>>>>>>> demuxer does not notify the decoder whenever the current sample
>>>>>>> description index changes. If the SPS or PPS changed sufficiently, this
>>>>>>> can lead to unexpected decoding errors.
>>>>>>>
>>>>>>> To fix this, in mov_finalize_packet(), when stsc_data is not available,
>>>>>>> use get_frag_stream_info_from_pkt() to get at the current fragment
>>>>>>> stream info, and retrieve the current sample description index from
>>>>>>> there. Then use that index in a similar manner as the stsc case.
>>>>>>>
>>>>>>> Signed-off-by: Dimitry Andric <dimitry at unified-streaming.com>
>>>>>>> ---
>>>>>>> libavformat/mov.c | 50 ++++++++++++++++++++++++++++-------------------
>>>>>>> 1 file changed, 30 insertions(+), 20 deletions(-)
>>>>>>>
>>>>>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>>>>>> index 452690090c..ead89192f4 100644
>>>>>>> --- a/libavformat/mov.c
>>>>>>> +++ b/libavformat/mov.c
>>>>>>> @@ -10756,25 +10756,29 @@ static int mov_switch_root(AVFormatContext *s, int64_t target, int index)
>>>>>>>      return 1;
>>>>>>>  }
>>>>>>>
>>>>>>> -static int mov_change_extradata(AVStream *st, AVPacket *pkt)
>>>>>>> +static int mov_change_extradata(AVStream *st, AVPacket *pkt, int stsd_id)
>>>>>>>  {
>>>>>>>      MOVStreamContext *sc = st->priv_data;
>>>>>>>      uint8_t *side, *extradata;
>>>>>>>      int extradata_size;
>>>>>>>
>>>>>>> -    /* Save the current index. */
>>>>>>> -    sc->last_stsd_index = sc->stsc_data[sc->stsc_index].id - 1;
>>>>>>> +    if (stsd_id > 0 &&
>>>>>>> +        stsd_id - 1 < sc->stsd_count &&
>>>>>>> +        stsd_id - 1 != sc->last_stsd_index) {
>>>>>>> +        /* Save the current index. */
>>>>>>> +        sc->last_stsd_index = stsd_id - 1;
>>>>>>>
>>>>>>> -    /* Notify the decoder that extradata changed. */
>>>>>>> -    extradata_size = sc->extradata_size[sc->last_stsd_index];
>>>>>>> -    extradata = sc->extradata[sc->last_stsd_index];
>>>>>>> -    if (st->discard != AVDISCARD_ALL && extradata_size > 0 && extradata) {
>>>>>>> -        side = av_packet_new_side_data(pkt,
>>>>>>> -                                       AV_PKT_DATA_NEW_EXTRADATA,
>>>>>>> -                                       extradata_size);
>>>>>>> -        if (!side)
>>>>>>> -            return AVERROR(ENOMEM);
>>>>>>> -        memcpy(side, extradata, extradata_size);
>>>>>>> +        /* Notify the decoder that extradata changed. */
>>>>>>> +        extradata_size = sc->extradata_size[sc->last_stsd_index];
>>>>>>> +        extradata = sc->extradata[sc->last_stsd_index];
>>>>>>> +        if (st->discard != AVDISCARD_ALL && extradata_size > 0 && extradata) {
>>>>>>> +            side = av_packet_new_side_data(pkt,
>>>>>>> +                                           AV_PKT_DATA_NEW_EXTRADATA,
>>>>>>> +                                           extradata_size);
>>>>>>> +            if (!side)
>>>>>>> +                return AVERROR(ENOMEM);
>>>>>>> +            memcpy(side, extradata, extradata_size);
>>>>>>> +        }
>>>>>>>      }
>>>>>>>
>>>>>>>      return 0;
>>>>>>> @@ -10893,13 +10897,10 @@ static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *s
>>>>>>>
>>>>>>>      /* Multiple stsd handling. */
>>>>>>>      if (sc->stsc_data) {
>>>>>>> -        if (sc->stsc_data[sc->stsc_index].id > 0 &&
>>>>>>> -            sc->stsc_data[sc->stsc_index].id - 1 < sc->stsd_count &&
>>>>>>> -            sc->stsc_data[sc->stsc_index].id - 1 != sc->last_stsd_index) {
>>>>>>> -            int ret = mov_change_extradata(st, pkt);
>>>>>>> -            if (ret < 0)
>>>>>>> -                return ret;
>>>>>>> -        }
>>>>>>> +        int stsd_id = sc->stsc_data[sc->stsc_index].id;
>>>>>>> +        int ret = mov_change_extradata(st, pkt, stsd_id);
>>>>>>> +        if (ret < 0)
>>>>>>> +            return ret;
>>>>>>>
>>>>>>>          /* Update the stsc index for the next sample */
>>>>>>>          sc->stsc_sample++;
>>>>>>> @@ -10908,6 +10909,15 @@ static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *s
>>>>>>>              sc->stsc_index++;
>>>>>>>              sc->stsc_sample = 0;
>>>>>>>          }
>>>>>>> +    } else {
>>>>>>> +        MOVContext *mov = s->priv_data;
>>>>>>> +        MOVFragmentStreamInfo *frag_stream_info = get_frag_stream_info_from_pkt(&mov->frag_index, pkt, sc->id);
>>>>>>> +        if (frag_stream_info) {
>>>>>>> +            int stsd_id = frag_stream_info->stsd_id;
>>>>>>> +            int ret = mov_change_extradata(st, pkt, stsd_id);
>>>>>>> +            if (ret < 0)
>>>>>>> +                return ret;
>>>>>>> +        }
>>>>>>>      }
>>>>>>>
>>>>>>>      return 0;
>>>>>>> --
>>>>>>> 2.43.0
>>>>>>>
>>>>>>
>>>>>> Any comments on this patch?
>>>>>
>>>>> Ping :)
>>>>
>>>> Is there any particular group of people who "own" the mov demuxer?
>>> Another ping.
>>
>> I'll have a look, seeing as no one else will.
>
> To provide some backstory here, I will attempt to explain further what
> this patch is supposed to fix. It is specifically about AVC (or possibly
> HEVC) video that has more than one referenced PPS in the elementary
> stream. (One encoder that sometimes produces this kind of video is x264,
> unless you use the --stitchable option.)
>
> In an MP4 file this can be represented by multiple sample description
> entries in the 'stsd' box, and in a progressive file there is a 'stsc'
> box which defines which samples have which sample description indexes.
> FFmpeg handles these just fine.
>
> However, in a fragmented MP4 file the 'stsc' box is usually empty, and
> the fragments have a 'tfhd' box with a sample description index field
> instead. FFmpeg sometimes cannot decode such files properly, since it
> does not call mov_change_extradata() when the sample description index
> changes somewhere in the middle of the video. In that case, it will
> either complain about a bad PPS ID or, if the ID matches but the PPS
> contents do not, produce lots of decoding errors.
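
To make the 'tfhd' part concrete, here is a minimal standalone sketch of
where the sample description index sits in the box payload, following the
field layout in ISO/IEC 14496-12. This is not the code from mov.c: the
helper names rb32() and tfhd_sample_description_index() are hypothetical,
and the payload is assumed to start right after the 8-byte box header.

```c
#include <stddef.h>
#include <stdint.h>

/* tfhd flag bits as defined in ISO/IEC 14496-12. */
#define TFHD_BASE_DATA_OFFSET   0x000001
#define TFHD_SAMPLE_DESC_INDEX  0x000002

/* Read a 32-bit big-endian value. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper, for illustration only.
 * 'payload' points just past the box header (size + 'tfhd').
 * Returns the 1-based sample description index, or 0 if the field is
 * absent or the payload is too short to contain it.
 */
static uint32_t tfhd_sample_description_index(const uint8_t *payload,
                                              size_t size)
{
    size_t pos = 8; /* version (1) + flags (3) + track_ID (4) */
    uint32_t flags;

    if (size < pos)
        return 0;
    flags = rb32(payload) & 0x00ffffff; /* drop the version byte */

    /* The optional 64-bit base_data_offset precedes the index. */
    if (flags & TFHD_BASE_DATA_OFFSET)
        pos += 8;

    if (!(flags & TFHD_SAMPLE_DESC_INDEX) || size < pos + 4)
        return 0;
    return rb32(payload + pos);
}
```

A return value of 0 here only means "no index present in this tfhd"; what
a demuxer does in that case (for example falling back to a default from
the 'trex' box) is outside the scope of this sketch.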
>
> This proposed patch makes it so mov_change_extradata() is called even if
> MOVStreamContext's stsc_data field is empty, provided that
> get_frag_stream_info_from_pkt() returns a valid stsd_id in its
> MOVFragmentStreamInfo. For fragmented files, mov_read_tfhd() already
> takes care of reading the stsd_id from the tfhd boxes.
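
As a rough model of the check that the patch moves into
mov_change_extradata(), the sketch below isolates just the "did the sample
description index change?" decision. StreamState and stsd_changed() are
hypothetical stand-ins for the relevant MOVStreamContext fields and logic;
the real function additionally attaches AV_PKT_DATA_NEW_EXTRADATA side
data to the packet.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few MOVStreamContext fields involved. */
typedef struct StreamState {
    int stsd_count;      /* number of sample entries in the 'stsd' box */
    int last_stsd_index; /* 0-based index of the entry last signalled */
} StreamState;

/*
 * Returns true when a packet with the given 1-based stsd_id refers to a
 * valid sample entry that differs from the one last signalled, and
 * records it as the current one. Returns false for an unchanged,
 * zero, or out-of-range id.
 */
static bool stsd_changed(StreamState *sc, int stsd_id)
{
    if (stsd_id > 0 &&
        stsd_id - 1 < sc->stsd_count &&
        stsd_id - 1 != sc->last_stsd_index) {
        sc->last_stsd_index = stsd_id - 1;
        return true;
    }
    return false;
}
```

The point of passing stsd_id in as a parameter is that the same check
works whether the id came from the stsc table or from the fragment's
tfhd, which is all the second hunk of the patch relies on.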
Another ping.
-Dimitry