[FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

Sat Nov 14 22:03:29 CET 2015

On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.leppkes at gmail.com> wrote:
> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>> Hi,
>>>
>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
>>> wrote:
>>>
>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbultje at gmail.com>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
>>>> > wrote:
>>>> >
>>>> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbultje at gmail.com>
>>>> >> wrote:
>>>> >> > Hi,
>>>> >> >
>>>> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde <
>>>> gajjanag at mit.edu>
>>>> >> > wrote:
>>>> >> >
>>>> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje <
>>>> rsbultje at gmail.com>
>>>> >> >> wrote:
>>>> >> >> > Hi Ganesh,
>>>> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" <
>>>> >> gajjanagadde at gmail.com>
>>>> >> >> > wrote:
>>>> >> >> >>
>>>> >> >> >> The rationale for this function is reflected in the documentation
>>>> for
>>>> >> >> >> it, and is copied here:
>>>> >> >> >>
>>>> >> >> >> Clip a double value into the long long amin-amax range.
>>>> >> >> >> This function is needed because conversion of floating point to
>>>> >> integers
>>>> >> >> > when
>>>> >> >> >> it does not fit in the integer's representation does not
>>>> necessarily
>>>> >> >> > saturate
>>>> >> >> >> correctly (usually converted to a cvttsd2si on x86) which
>>>> saturates
>>>> >> >> > numbers
>>>> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as
>>>> >> >> undefined
>>>> >> >> >> behavior, allowing this sort of mathematically bogus conversions.
>>>> >> This
>>>> >> >> > provides
>>>> >> >> >> a safe alternative that is slower obviously but assures safety and
>>>> >> >> better
>>>> >> >> >> mathematical behavior.
>>>> >> >> >> API:
>>>> >> >> >> @param a value to clip
>>>> >> >> >> @param amin minimum value of the clip range
>>>> >> >> >> @param amax maximum value of the clip range
>>>> >> >> >> @return clipped value
>>>> >> >> >>
>>>> >> >> >> Note that a priori if one can guarantee from the calling side that
>>>> >> the
>>>> >> >> >> double is in range, it is safe to simply do an explicit/implicit
>>>> >> cast,
>>>> >> >> >> and that will be far faster. However, otherwise this function
>>>> should
>>>> >> be
>>>> >> >> >> used.
>>>> >> >> >>
>>>> >> >> >> avutil minor version is bumped.
>>>> >> >> >>
>>>> >> >> >> Reviewed-by: Ronald S. Bultje <rsbultje at gmail.com>
>>>> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>>>> >> >> >> ---
>>>> >> >> >>  libavutil/common.h  | 30 ++++++++++++++++++++++++++++++
>>>> >> >> >>  libavutil/version.h |  2 +-
>>>> >> >> >>  2 files changed, 31 insertions(+), 1 deletion(-)
>>>> >> >> >>
>>>> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h
>>>> >> >> >> index 6f0f582..f4687ab 100644
>>>> >> >> >> --- a/libavutil/common.h
>>>> >> >> >> +++ b/libavutil/common.h
>>>> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double
>>>> >> >> > av_clipd_c(double a, double amin, double
>>>> >> >> >>      else               return a;
>>>> >> >> >>  }
>>>> >> >> >>
>>>> >> >> >> +/**
>>>> >> >> >> + * Clip and convert a double value into the long long amin-amax
>>>> >> range.
>>>> >> >> >> + * This function is needed because conversion of floating point
>>>> to
>>>> >> >> > integers when
>>>> >> >> >> + * it does not fit in the integer's representation does not
>>>> >> necessarily
>>>> >> >> > saturate
>>>> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which
>>>> >> saturates
>>>> >> >> > numbers
>>>> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions
>>>> as
>>>> >> >> > undefined
>>>> >> >> >> + * behavior, allowing this sort of mathematically bogus
>>>> conversions.
>>>> >> >> > This provides
>>>> >> >> >> + * a safe alternative that is slower obviously but assures safety
>>>> >> and
>>>> >> >> > better
>>>> >> >> >> + * mathematical behavior.
>>>> >> >> >> + * @param a value to clip
>>>> >> >> >> + * @param amin minimum value of the clip range
>>>> >> >> >> + * @param amax maximum value of the clip range
>>>> >> >> >> + * @return clipped value
>>>> >> >> >> + */
>>>> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double
>>>> a,
>>>> >> >> > int64_t amin, int64_t amax)
>>>> >> >> >> +{
>>>> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) &&
>>>> >> ASSERT_LEVEL
>>>> >> >> >>= 2
>>>> >> >> >> +    if (amin > amax) abort();
>>>> >> >> >> +#endif
>>>> >> >> >> +    // INT64_MAX+1,INT64_MIN are exactly representable as IEEE
>>>> >> doubles
>>>> >> >> >> +    if (a >=  9223372036854775808.0 || llrint(a) >= amax)
>>>> >> >> >> +        return amax;
>>>> >> >> >> +    if (a <= -9223372036854775808.0 || llrint(a) <= amin)
>>>> >> >> >> +        return amin;
>>>> >> >> >
>>>> >> >> > Doesn't this allow negative overflows in the max check? I think you
>>>> >> need
>>>> >> >> > both overflow checks before the min/max checks. Try the next float
>>>> val
>>>> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it
>>>> >> >> returns 0
>>>> >> >> > instead of amin.
>>>> >> >>
>>>> >> >> They are needed. As you and others can clearly see, I am quite bad
>>>> >> >> with this stuff.
>>>> >> >
>>>> >> >
>>>> >> > Hm, so, getting back to my computer, I wanted to test this, and I have
>>>> >> this
>>>> >> > problem: llrint() works correctly for me for the "undefined" cases,
>>>> i.e.,
>>>> >> > it already does what you're trying to fix in av_rint64_clip_c.
>>>> >> >
>>>> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808
>>>> >> > llrint(10223372056756029440.000000) returns 9223372036854775807
>>>> >> >
>>>> >> > So, how do I reproduce that llrint() overflows?
>>>> >>
>>>> >> The link I gave originally
>>>> >>
>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer
>>>> >> gives an illustration. Maybe the weird behavior happens only on
>>>> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not
>>>> >> representable in long long, and hence signed overflow occurs yielding
>>>> >> INT64_MIN (of course undefined). Here is a minimal test program:
>>>> >> #include <stdio.h>
>>>> >> #include <math.h>
>>>> >>
>>>> >> int main(void) {
>>>> >>     printf("%lld\n", llrint(9223372036854775808.0));
>>>> >>     return 0;
>>>> >> }
>>>> >>
>>>> >
>>>> > 9223372036854775807
>>>> >
>>>> > Seems apple's libc got one thing right :)
>>>>
>>>> I personally am not that charitable, looking more carefully at your
>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference:
>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c.
>>>> As usual, Apple dumps many implementations of llrint and it is unclear
>>>> which is actually being used on OS X at the moment (see e.g other
>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c),
>>>> but I digress.
>>>>
>>>> They essentially all put special case code like the patch above. Thus
>>>> their function is inherently slower than the conformant GNU libm
>>>> implementation. A client may very well want a branch free llrint for
>>>> speed. Apple offers no performance choice here, forcing a fast llrint
>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected
>>>> by this slowdown.
>>>
>>>
>>> I think FFmpeg should consider using Apple's version as a x86
>>> implementation for av_rint64_clip :)
>>
>> I don't agree with this: it is a far less readable implementation with
>> many more lines of code, and worse yet only handles the llrint aspect
>> and not the clipping. Regardless, belongs to a separate patch/thread.
>> Pushed. Thanks all for reviews.
>>
>
> This change broke building on VS2012, llrint is apprently not available there.
> Note that this is a public header, so our compat headers ala
> avutil/libm.h cannot be included there.

Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g
for GNU_C, use llrint, else use a slower implementation on the lines
of Apple's.
Any cleaner solutions?

>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel