[FFmpeg-devel] [PATCH] avutil/timestamp: keep microsecond precision in av_ts_make_time_string

Fri Mar 22 23:58:09 EET 2024

Marton Balint:
> 
> 
> On Wed, 20 Mar 2024, Andreas Rheinhardt wrote:
> 
>> Andreas Rheinhardt:
>>> Marton Balint:
>>>> av_ts_make_time_string() used "%.6g" format in the past, but this
>>>> format was
>>>> losing precision even when the timestamp to be printed was not that
>>>> large. For
>>>> example for 3 hours (10800) seconds, only 1 decimal digit was
>>>> printed, which
>>>> made this format inaccurate when it was used in e.g. the
>>>> silencedetect filter.
>>>> Other detection filters printing timestamps had similar issues.
>>>>
>>>> So let's change the used format to "%.6f" instead, we have plenty of
>>>> space in
>>>> the string buffer, we can somewhat imitate existing behaviour of %g
>>>> by trimming
>>>> ending zeroes and the potential decimal point, which can be any
>>>> non-numeric
>>>> character. In order not to trim "inf" as well, we assume that the
>>>> decimal
>>>> point does not contain the letter "f".
>>>>
>>>> We also no longer use scientific representation to make parsing and
>>>> printing
>>>> easier, because the theoretical maximum of INT64_MAX*INT32_MAX still
>>>> fits into
>>>> the string buffer in normal form.
>>>>
>>>> Since the additional processing yields more code, inlineing this
>>>> function no
>>>> longer make sense, so this commit also changes the API to actually
>>>> export the
>>>> function instead of having it inlinable in the header.
>>>>
>>>> Thanks for Allan Cady for bringing up this issue.
>>>>
>>>> Signed-off-by: Marton Balint <cus at passwd.hu>
>>>> ---
>>>>  doc/APIchanges                               |  3 ++
>>>>  libavutil/Makefile                           |  1 +
>>>>  libavutil/timestamp.c                        | 33 ++++++++++++++++++++
>>>>  libavutil/timestamp.h                        |  8 +----
>>>>  libavutil/version.h                          |  2 +-
>>>>  tests/ref/fate/filter-metadata-scdet         | 12 +++----
>>>>  tests/ref/fate/filter-metadata-silencedetect |  2 +-
>>>>  7 files changed, 46 insertions(+), 15 deletions(-)
>>>>  create mode 100644 libavutil/timestamp.c
>>>>
>>>> diff --git a/doc/APIchanges b/doc/APIchanges
>>>> index a44c8e4f10..1afde062a5 100644
>>>> --- a/doc/APIchanges
>>>> +++ b/doc/APIchanges
>>>> @@ -2,6 +2,9 @@ The last version increases of all libraries were on
>>>> 2024-03-07
>>>>
>>>>  API changes, most recent first:
>>>>
>>>> +2024-03-xx - xxxxxxxxxx - lavu 59.3.100 - timestamp.h
>>>> +  av_ts_make_time_string() is no longer an inline function. It is
>>>> now exported.
>>>> +
>>>>  2024-03-xx - xxxxxxxxxx - lavu 59.2.100 - channel_layout.h
>>>>    Add AV_CHANNEL_LAYOUT_RETYPE_FLAG_CANONICAL.
>>>>
>>>> diff --git a/libavutil/Makefile b/libavutil/Makefile
>>>> index e7709b97d0..5e75aa1855 100644
>>>> --- a/libavutil/Makefile
>>>> +++ b/libavutil/Makefile
>>>> @@ -174,6 +174,7 @@ OBJS =
>>>> adler32.o                                                        \
>>>>        
>>>> threadmessage.o                                                  \
>>>>        
>>>> time.o                                                           \
>>>>        
>>>> timecode.o                                                       \
>>>> +      
>>>> timestamp.o                                                      \
>>>>        
>>>> tree.o                                                           \
>>>>        
>>>> twofish.o                                                        \
>>>>        
>>>> utils.o                                                          \
>>>> diff --git a/libavutil/timestamp.c b/libavutil/timestamp.c
>>>> new file mode 100644
>>>> index 0000000000..06fb47e94c
>>>> --- /dev/null
>>>> +++ b/libavutil/timestamp.c
>>>> @@ -0,0 +1,33 @@
>>>> +/*
>>>> + * This file is part of FFmpeg.
>>>> + *
>>>> + * FFmpeg is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation; either
>>>> + * version 2.1 of the License, or (at your option) any later version.
>>>> + *
>>>> + * FFmpeg is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with FFmpeg; if not, write to the Free Software
>>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
>>>> 02110-1301 USA
>>>> + */
>>>> +
>>>> +#include "timestamp.h"
>>>> +
>>>> +char *av_ts_make_time_string(char *buf, int64_t ts, const
>>>> AVRational *tb)
>>>> +{
>>>> +    if (ts == AV_NOPTS_VALUE) {
>>>> +        snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
>>>> +    } else {
>>>> +        int last = snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6f",
>>>> av_q2d(*tb) * ts);
>>>> +        last = FFMIN(last, AV_TS_MAX_STRING_SIZE - 1) - 1;
>>>> +        for (; last && buf[last] == '0'; last--);
>>>> +        for (; last && buf[last] != 'f' && (buf[last] < '0' ||
>>>> buf[0] > '9'); last--);
>>>> +        buf[last + 1] = '\0';
>>>> +    }
>>>> +    return buf;
>>>> +}
>>>
>>> As has already been said before: Simply using %.6f will discard
>>> significant digits for small timestamps. E.g. 10^-7 will simply print
>>> 0.000000, which will then be converted to "0" by your code. The old code
>>> printed 1.
> 
> Admittedly we will lose precision < 10^-6 by using %.6f, but I don't
> think this will cause any real problems, so I'd rather just leave this
> as is. Or you prefer some special handling of small values? E.g. using
> %.9f or something?
> 

Yeah, special casing small values seems good. Or just keep the original
%.6g for small values.

>>>
>>> Furthermore, there are more problems:
>>> "A double argument representing an infinity is converted in one of the
>>> styles [-]inf or [-]infinity — which style is implementation-defined."
>>> Your code would trim infinity to "inf". It would be even worse with nan:
>>> "A double argument representing a NaN is converted in one of the styles
>>> [-]nan or [-]nan(n-char-sequence) — which style, and the meaning of
>>> any n-char-sequence, is implementation-defined."
>>> Furthermore, there is no guarantee that the "decimal-point character" is
>>> actually a single character (it's a char* in struct lcov after all).
>>>
>>> If I am not mistaken, then NANs and infinities can't happen here unless
>>> the timebase is malformed (denominator zero); and in this case one could
>>> use the NOPTS codepath.
> 
> So the NAN case can't happen. The multiple character decimal separator
> is not a problem because we are trimming multiple chars. The inf instead
> of infinity case should not be a problem either, inf is parseable the
> same way as infinity, strtod() supports both. But if you prefer, I can
> make the code look for 'y' as well to stop trimming, I don't think it
> really matters. I am not sure about returning NOPTS for infinity.
> 
>> While just at it: As I already mentioned, this function has a design
>> defect: It passes the timebase by pointer and not just by value. So if
>> we add a non-inlined version of this, it should be a new function that
>> receives the timebase by value and av_ts_make_time_string() should
>> instead call this new function like
>> static inline char *av_ts_make_time_string(char *buf, int64_t ts, const
>> AVRational *tb)
>> {
>>    return new_func(buf, ts, *tb);
>> }
> 
> I am uneasy about adding another public function signature for this
> purpose, because that will still have the flaw of using a locale
> dependant decimal point.
> 

I do not really get this point here. Just because it has a flaw (I'm not
certain everyone sees the locale dependcy as a flaw) means that continue
another flaw.

- Andreas