[FFmpeg-devel] [PATCH] avutil/timestamp: keep microsecond precision in av_ts_make_time_string

Wed Mar 20 21:51:05 EET 2024

On Wed, 20 Mar 2024, Andreas Rheinhardt wrote:

> Andreas Rheinhardt:
>> Marton Balint:
>>> av_ts_make_time_string() used "%.6g" format in the past, but this format was
>>> losing precision even when the timestamp to be printed was not that large. For
>>> example for 3 hours (10800) seconds, only 1 decimal digit was printed, which
>>> made this format inaccurate when it was used in e.g. the silencedetect filter.
>>> Other detection filters printing timestamps had similar issues.
>>>
>>> So let's change the used format to "%.6f" instead, we have plenty of space in
>>> the string buffer, we can somewhat imitate existing behaviour of %g by trimming
>>> ending zeroes and the potential decimal point, which can be any non-numeric
>>> character. In order not to trim "inf" as well, we assume that the decimal
>>> point does not contain the letter "f".
>>>
>>> We also no longer use scientific representation to make parsing and printing
>>> easier, because the theoretical maximum of INT64_MAX*INT32_MAX still fits into
>>> the string buffer in normal form.
>>>
>>> Since the additional processing yields more code, inlineing this function no
>>> longer make sense, so this commit also changes the API to actually export the
>>> function instead of having it inlinable in the header.
>>>
>>> Thanks for Allan Cady for bringing up this issue.
>>>
>>> Signed-off-by: Marton Balint <cus at passwd.hu>
>>> ---
>>>  doc/APIchanges                               |  3 ++
>>>  libavutil/Makefile                           |  1 +
>>>  libavutil/timestamp.c                        | 33 ++++++++++++++++++++
>>>  libavutil/timestamp.h                        |  8 +----
>>>  libavutil/version.h                          |  2 +-
>>>  tests/ref/fate/filter-metadata-scdet         | 12 +++----
>>>  tests/ref/fate/filter-metadata-silencedetect |  2 +-
>>>  7 files changed, 46 insertions(+), 15 deletions(-)
>>>  create mode 100644 libavutil/timestamp.c
>>>
>>> diff --git a/doc/APIchanges b/doc/APIchanges
>>> index a44c8e4f10..1afde062a5 100644
>>> --- a/doc/APIchanges
>>> +++ b/doc/APIchanges
>>> @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2024-03-07
>>>
>>>  API changes, most recent first:
>>> 
>>> +2024-03-xx - xxxxxxxxxx - lavu 59.3.100 - timestamp.h
>>> +  av_ts_make_time_string() is no longer an inline function. It is now exported.
>>> +
>>>  2024-03-xx - xxxxxxxxxx - lavu 59.2.100 - channel_layout.h
>>>    Add AV_CHANNEL_LAYOUT_RETYPE_FLAG_CANONICAL.
>>> 
>>> diff --git a/libavutil/Makefile b/libavutil/Makefile
>>> index e7709b97d0..5e75aa1855 100644
>>> --- a/libavutil/Makefile
>>> +++ b/libavutil/Makefile
>>> @@ -174,6 +174,7 @@ OBJS = adler32.o                                                        \
>>>         threadmessage.o                                                  \
>>>         time.o                                                           \
>>>         timecode.o                                                       \
>>> +       timestamp.o                                                      \
>>>         tree.o                                                           \
>>>         twofish.o                                                        \
>>>         utils.o                                                          \
>>> diff --git a/libavutil/timestamp.c b/libavutil/timestamp.c
>>> new file mode 100644
>>> index 0000000000..06fb47e94c
>>> --- /dev/null
>>> +++ b/libavutil/timestamp.c
>>> @@ -0,0 +1,33 @@
>>> +/*
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +#include "timestamp.h"
>>> +
>>> +char *av_ts_make_time_string(char *buf, int64_t ts, const AVRational *tb)
>>> +{
>>> +    if (ts == AV_NOPTS_VALUE) {
>>> +        snprintf(buf, AV_TS_MAX_STRING_SIZE, "NOPTS");
>>> +    } else {
>>> +        int last = snprintf(buf, AV_TS_MAX_STRING_SIZE, "%.6f", av_q2d(*tb) * ts);
>>> +        last = FFMIN(last, AV_TS_MAX_STRING_SIZE - 1) - 1;
>>> +        for (; last && buf[last] == '0'; last--);
>>> +        for (; last && buf[last] != 'f' && (buf[last] < '0' || buf[0] > '9'); last--);
>>> +        buf[last + 1] = '\0';
>>> +    }
>>> +    return buf;
>>> +}
>> 
>> As has already been said before: Simply using %.6f will discard
>> significant digits for small timestamps. E.g. 10^-7 will simply print
>> 0.000000, which will then be converted to "0" by your code. The old code
>> printed 1.

Admittedly we will lose precision < 10^-6 by using %.6f, but I don't 
think this will cause any real problems, so I'd rather just leave this as 
is. Or you prefer some special handling of small values? E.g. using %.9f 
or something?

>> 
>> Furthermore, there are more problems:
>> "A double argument representing an infinity is converted in one of the
>> styles [-]inf or [-]infinity — which style is implementation-defined."
>> Your code would trim infinity to "inf". It would be even worse with nan:
>> "A double argument representing a NaN is converted in one of the styles
>> [-]nan or [-]nan(n-char-sequence) — which style, and the meaning of
>> any n-char-sequence, is implementation-defined."
>> Furthermore, there is no guarantee that the "decimal-point character" is
>> actually a single character (it's a char* in struct lcov after all).
>> 
>> If I am not mistaken, then NANs and infinities can't happen here unless
>> the timebase is malformed (denominator zero); and in this case one could
>> use the NOPTS codepath.

So the NAN case can't happen. The multiple character decimal separator is 
not a problem because we are trimming multiple chars. The inf instead of 
infinity case should not be a problem either, inf is parseable the 
same way as infinity, strtod() supports both. But if you prefer, I 
can make the code look for 'y' as well to stop trimming, I don't think it 
really matters. I am not sure about returning NOPTS for infinity.

> While just at it: As I already mentioned, this function has a design
> defect: It passes the timebase by pointer and not just by value. So if
> we add a non-inlined version of this, it should be a new function that
> receives the timebase by value and av_ts_make_time_string() should
> instead call this new function like
> static inline char *av_ts_make_time_string(char *buf, int64_t ts, const
> AVRational *tb)
> {
>    return new_func(buf, ts, *tb);
> }

I am uneasy about adding another public function signature for this 
purpose, because that will still have the flaw of using a locale 
dependant decimal point.

Regards,
Marton