[FFmpeg-devel] [PATCH 3/3] avcodec/h264_parser: use the keyframe heuristics flag

Andreas Rheinhardt andreas.rheinhardt at outlook.com
Sun Jul 18 05:19:58 EEST 2021


James Almer:
> On 7/17/2021 9:30 PM, Andreas Rheinhardt wrote:
>> James Almer:
>>> Tag only packets containing with IDR pictures as keyframes by
>>> default, same as
>>> the decoder.
>>> Fixes MP4 and Matroska muxers marking incorrect samples as Sync
>>> Samples in
>>> scenarios where this AVParser is used.
>>>
>>
>> 1. Could you please explain where you got the info that Matroska
>> keyframes need to be ISOBMFF Sync Samples from? Because it's actually
>> undefined what it exactly is; in case of AV1 (the only codec with a
>> detailed codec mapping) said mapping allows delayed random access points
>> to be marked as keyframes (without providing anything to actually signal
>> that these are delayed random access points), so a key frame is simply a
>> random access point. And that is how it is de-facto handled with all
>> other codecs as well. IMO it is ISOBMFF which is the outlier here.
> 
> It was an assumption considering the Matroska h264 encapsulation is
> taken verbatim from ISOBMFF. But you're right, MKVToolnix does mark
> those as key.
> 

It's not taken verbatim -- Matroska doesn't have an analog to the sync
samples table; actually, it is not even guaranteed that cues always
reference keyframes (MKVToolNix even allows to reference all blocks!),
although (lacking an alternative) demuxers operate under this assumption.

>>
>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>> ---
>>>   libavcodec/h264_parser.c           | 16 ++++++++++------
>>>   tests/fate-run.sh                  |  4 ++--
>>>   tests/fate/ffmpeg.mak              |  2 +-
>>>   tests/fate/lavf-container.mak      | 12 ++++++------
>>>   tests/fate/matroska.mak            |  2 +-
>>>   tests/ref/fate/copy-trac2211-avi   |  2 +-
>>>   tests/ref/fate/matroska-h264-remux |  4 ++--
>>>   tests/ref/fate/segment-mp4-to-ts   | 10 +++++-----
>>>   tests/ref/lavf-fate/h264.mp4       |  4 ++--
>>>   9 files changed, 30 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/libavcodec/h264_parser.c b/libavcodec/h264_parser.c
>>> index d3c56cc188..532dc462b0 100644
>>> --- a/libavcodec/h264_parser.c
>>> +++ b/libavcodec/h264_parser.c
>>> @@ -344,9 +344,11 @@ static inline int
>>> parse_nal_units(AVCodecParserContext *s,
>>>               get_ue_golomb_long(&nal.gb);  // skip first_mb_in_slice
>>>               slice_type   = get_ue_golomb_31(&nal.gb);
>>>               s->pict_type = ff_h264_golomb_to_pict_type[slice_type %
>>> 5];
>>> -            if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
>>> -                /* key frame, since recovery_frame_cnt is set */
>>> -                s->key_frame = 1;
>>> +            if (s->flags & PARSER_FLAG_USE_KEYFRAME_HEURISTICS) {
>>> +                if (p->sei.recovery_point.recovery_frame_cnt >= 0) {
>>> +                    /* key frame, since recovery_frame_cnt is set */
>>> +                    s->key_frame = 1;
>>> +                }
>>
>> 2. This change means that files that don't use IDR pictures, but only
>> recovery point SEIs won't have packets marked as keyframes at all any
>> more; this affects every open-gop video. This is way worse than an
>> incorrect sync sample list created by the mp4 muxer.
> 
> I worked around this for the mpegts muxer as Kieran requested, but if
> Matroska is the same then the situation changes.
> 
> I can add a user settable demuxer option for the autoinserted parser
> instead of hardcoding the behavior, and leave the current behavior as
> the default.
> 

So the mp4 muxer would create invalid files by default and in case this
flag is set, it instead creates crippled files (where open gop random
access points are not marked as such)? Does not sound good to me.
An actual fix is to make the muxer aware of the difference between IDR
and ordinary keyframes (there is already a MOV_PARTIAL_SYNC_SAMPLE for
the latter; and there is already some parsing in case of MPEG-2).

(As much as I like to reuse the parsers for this, I don't really know
what exactly the parser should additionally return. Should this be
solved, we might need to add another field to AVPacket (given that the
new parsing API is supposed to be packet-based, said information should
be stored there; alternatively, one could also use more of AV_PKT_FLAG_*).
We do not even have to wait for the new parser API to get this
information out of the parser: We can just add new fields to
AVCodecParserContext; but we need to agree on whether this information
should be part of AVPacket and if so, how.)

- Andreas


More information about the ffmpeg-devel mailing list