[FFmpeg-devel] [PATCH] avformat/matroskaenc: Allow changing the time stamp precision via option

Mon May 31 23:50:36 EEST 2021

On 2021-05-24 22:15, Andreas Rheinhardt wrote:
> michael.dirks at xaymar.com:
>> From: Michael Fabian 'Xaymar' Dirks <michael.dirks at xaymar.com>
>>
>> Adds "timestamp_precision" to the available options for Matroska muxing.
>> The option enables users and developers to change the precision of the
>> time stamps in the Matroska container up to 1 nanosecond, which can aid
>> with the proper detection of constant and variable rate content.
>>
>> Work-around fix for: 259, 6406, 7927, 8909 and 9124.
>>
>> Signed-off-by: Michael Fabian 'Xaymar' Dirks <michael.dirks at xaymar.com>
>> ---
>>   doc/muxers.texi           |  8 ++++++++
>>   libavformat/matroskaenc.c | 33 ++++++++++++++++++++++++++-------
>>   2 files changed, 34 insertions(+), 7 deletions(-)
>>
>> diff --git a/doc/muxers.texi b/doc/muxers.texi
>> index e1c6ad0829..8655be94ff 100644
>> --- a/doc/muxers.texi
>> +++ b/doc/muxers.texi
>> @@ -1583,6 +1583,14 @@ bitmap is stored bottom-up. Note that this option does not flip the bitmap
>>   which has to be done manually beforehand, e.g. by using the vflip filter.
>>   Default is @var{false} and indicates bitmap is stored top down.
>>   
>> + at item timestamp_precision
>> +Sets the timestamp precision up to 1 nanosecond for Matroska/WebM, which can
>> +improve detection of constant rate content in demuxers. Note that some poorly
>> +implemented demuxers may require a timestamp precision of 1 millisecond, so
>> +increasing it past that point may result in playback issues. Higher precision
>> +also reduces the maximum possible timestamp significantly.
> This should mention that a too high precision will increase container
> overhead.
Good point.
>
>> +Default is @var{1/1000} (1 millisecond).
>> +
>>   @end table
>>   
>>   @anchor{md5}
>> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
>> index 186a25d920..1b911a648c 100644
>> --- a/libavformat/matroskaenc.c
>> +++ b/libavformat/matroskaenc.c
>> @@ -158,6 +158,8 @@ typedef struct MatroskaMuxContext {
>>       int                 default_mode;
>>   
>>       uint32_t            segment_uid[4];
>> +
>> +    AVRational          time_base;
>>   } MatroskaMuxContext;
>>   
>>   /** 2 bytes * 7 for EBML IDs, 7 1-byte EBML lengths, 6 1-byte uint,
>> @@ -1814,6 +1816,7 @@ static int mkv_write_header(AVFormatContext *s)
>>       const AVDictionaryEntry *tag;
>>       int ret, i, version = 2;
>>       int64_t creation_time;
>> +    int64_t time_base = 1;
>>   
>>       if (mkv->mode != MODE_WEBM ||
>>           av_dict_get(s->metadata, "stereo_mode", NULL, 0) ||
>> @@ -1850,7 +1853,10 @@ static int mkv_write_header(AVFormatContext *s)
>>           return ret;
>>       pb = mkv->info.bc;
>>   
>> -    put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, 1000000);
>> +    time_base = av_rescale_q(time_base, mkv->time_base, (AVRational){1, 1000000000});
>> +    av_log(s, AV_LOG_DEBUG, "TimestampScale is: %" PRId64 " ns\n", time_base);
>> +    put_ebml_uint(pb, MATROSKA_ID_TIMECODESCALE, time_base);
> There is a serious problem here: mkv->time_base is the time base that
> the muxer uses; yet if mkv->time_base is not a multiple of 1/1000000000,
> then av_rescale_q will have to round a bit and then the demuxer will
> read a different time base, leading to a drift of all timestamps. This
> is not acceptable.
This issue is already present in the current version of FFmpeg. Unfortunately even if you tell me this, this is not something I can change: Matroska uses nanosecond precision, and nobody has agreed on what the way forward for timestamps is. You will have to bring up such nanosecond-precision problems with the Matroska specification maintainers itself, which are currently seeking IETF standardization: https://github.com/ietf-wg-cellar/matroska-specification/issues/422
>
>> +
>>       if ((tag = av_dict_get(s->metadata, "title", NULL, 0)))
>>           put_ebml_string(pb, MATROSKA_ID_TITLE, tag->value);
>>       if (!(s->flags & AVFMT_FLAG_BITEXACT)) {
>> @@ -1883,11 +1889,11 @@ static int mkv_write_header(AVFormatContext *s)
>>           int64_t metadata_duration = get_metadata_duration(s);
>>   
>>           if (s->duration > 0) {
>> -            int64_t scaledDuration = av_rescale(s->duration, 1000, AV_TIME_BASE);
>> +            int64_t scaledDuration = av_rescale_q(s->duration, AV_TIME_BASE_Q, mkv->time_base);
>>               put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration);
>>               av_log(s, AV_LOG_DEBUG, "Write early duration from recording time = %" PRIu64 "\n", scaledDuration);
>>           } else if (metadata_duration > 0) {
>> -            int64_t scaledDuration = av_rescale(metadata_duration, 1000, AV_TIME_BASE);
>> +            int64_t scaledDuration = av_rescale_q(metadata_duration, AV_TIME_BASE_Q, mkv->time_base);
>>               put_ebml_float(pb, MATROSKA_ID_DURATION, scaledDuration);
>>               av_log(s, AV_LOG_DEBUG, "Write early duration from metadata = %" PRIu64 "\n", scaledDuration);
>>           } else if (s->pb->seekable & AVIO_SEEKABLE_NORMAL) {
>> @@ -1948,12 +1954,12 @@ static int mkv_write_header(AVFormatContext *s)
>>       // after 4k and on a keyframe
>>       if (IS_SEEKABLE(pb, mkv)) {
>>           if (mkv->cluster_time_limit < 0)
>> -            mkv->cluster_time_limit = 5000;
>> +            mkv->cluster_time_limit = av_rescale_q(5000, (AVRational){1, 1000}, mkv->time_base);
>>           if (mkv->cluster_size_limit < 0)
>>               mkv->cluster_size_limit = 5 * 1024 * 1024;
>>       } else {
>>           if (mkv->cluster_time_limit < 0)
>> -            mkv->cluster_time_limit = 1000;
>> +            mkv->cluster_time_limit = av_rescale_q(1000, (AVRational){1, 1000}, mkv->time_base);
> There are three places where you rescale this; it would be better if you
> did it unconditionally in one place (namely after this block here).
>
I did not want to cut into existing code too much, merely adjust it to the change without changing the main parts.
>>           if (mkv->cluster_size_limit < 0)
>>               mkv->cluster_size_limit = 32 * 1024;
>>       }
>> @@ -2713,6 +2719,14 @@ static int mkv_init(struct AVFormatContext *s)
>>       } else
>>           mkv->mode = MODE_MATROSKAv2;
>>   
>> +    // WebM requires a timestamp precision of 1ms.
>> +    if (mkv->mode == MODE_WEBM) {
>> +        if (av_cmp_q(mkv->time_base, (AVRational){1, 1000}) != 0) {
>> +            av_log(s, AV_LOG_ERROR, "WebM requires 1ms timestamp precision\n");
>> +            return AVERROR(EINVAL);
>> +        }
> (This could be put into the same block as the one just above that sets
> the mode.)
Fair point.
>> +    }
>> +
>>       mkv->cur_audio_pkt = av_packet_alloc();
>>       if (!mkv->cur_audio_pkt)
>>           return AVERROR(ENOMEM);
>> @@ -2738,8 +2752,8 @@ static int mkv_init(struct AVFormatContext *s)
>>               track->uid = mkv_get_uid(mkv->tracks, i, &c);
>>           }
>>   
>> -        // ms precision is the de-facto standard timescale for mkv files
>> -        avpriv_set_pts_info(st, 64, 1, 1000);
>> +        // Use user-defined timescale.
>> +        avpriv_set_pts_info(st, 64, mkv->time_base.num, mkv->time_base.den);
>>   
>>           if (st->codecpar->codec_type == AVMEDIA_TYPE_ATTACHMENT) {
>>               if (mkv->mode == MODE_WEBM) {
>> @@ -2759,6 +2773,10 @@ static int mkv_init(struct AVFormatContext *s)
>>           track->track_num_size = ebml_num_size(track->track_num);
>>       }
>>   
>> +    // Scale the configured cluster_time_limit.
>> +    if (mkv->cluster_time_limit >= 0)
>> +        mkv->cluster_time_limit = av_rescale_q(mkv->cluster_time_limit, (AVRational){1, 1000}, mkv->time_base);
>> +
>>       if (mkv->is_dash && nb_tracks != 1)
>>           return AVERROR(EINVAL);
>>   
>> @@ -2826,6 +2844,7 @@ static const AVOption options[] = {
>>       { "infer", "For each track type, mark the first track of disposition default as default; if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER }, 0, 0, FLAGS, "default_mode" },
>>       { "infer_no_subs", "For each track type, mark the first track of disposition default as default; for audio and video: if none exists, mark the first track as default.", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_INFER_NO_SUBS }, 0, 0, FLAGS, "default_mode" },
>>       { "passthrough", "Use the disposition flag as-is", 0, AV_OPT_TYPE_CONST, { .i64 = DEFAULT_MODE_PASSTHROUGH }, 0, 0, FLAGS, "default_mode" },
>> +    { "timestamp_precision", "Timestamp precision to use for the entire container", OFFSET(time_base), AV_OPT_TYPE_RATIONAL, { .dbl = 0.001 }, 0.000000001, INT_MAX, FLAGS},
> Using a default value that is a valid value makes it impossible to
> distinguish whether the user set a value at all; but I'd like to
> preserve that possibility.
Sorry, could you explain why you want to do this further? I see no code path that would even need to check if the user/caller has set precision, if it's left at default we get the default of 1/1000 - just as previous FFmpeg versions.
>
>>       { NULL },
>>   };
>>   
>>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".


-- 
Sincerely | Mit freundlichen Grüßen

Michael Fabian 'Xaymar' Dirks
Software Designer & Engineer