[FFmpeg-devel] [PATCH] lavu/common: Fix AV_CEIL_RSHIFT for unsigned LHS

Sun Nov 10 14:17:06 EET 2024

On Mon, Oct 21, 2024 at 2:30 AM Michael Niedermayer <michael at niedermayer.cc>
wrote:

> On Fri, Oct 18, 2024 at 06:48:40PM +0100, Frank Plowman wrote:
> > On 15/10/2024 21:23, Frank Plowman wrote:
> > > On 14/10/2024 23:26, Michael Niedermayer wrote:
> > >> On Sat, Oct 05, 2024 at 03:38:05PM -0700, Frank Plowman wrote:
> > >>> The first branch of this ternary expression was intended to avoid
> > >>> having two shift operations in the case the RHS is not known at
> > >>> compile time.  It only works if the LHS has a signed type however,
> > >>> otherwise the result is invalid.
> > >>
> > >> If the expression is faster, why not check if its signed ?
>
uint16_t is okfor the RHS because -uint16_t will be promoted to int, but
uint32_t and uint64_t will not.

see "6.3.1.1 Boolean, characters, and integers" in the "ISO/IEC 9899"
"If an int can represent all values of the original type (as restricted by
the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the integer
promotions. 59) All other types are unchanged by the integer promotions."

Without the help of typeof() in C23, we cannot provide an effective way to
detect uint32_t and uint64_t.
Let us fix dvb clip issue with
https://patchwork.ffmpeg.org/project/ffmpeg/patch/20241110121314.12540-1-nuomi2021@gmail.com/
first.

Thank you.

> >>
> > >> if its not faster, then the argument could be that its not
> > >> faster and removes complexity
> > >>
> > >> thx
> > >>
> > >
> > > In a quick microbenchmark (clang 16, AArch64), the bithack is 10%
> faster
> > > with -O0 but there is no significant difference with -O1 and up.
> >
> > It has been drawn to my attention that this is causing a failure for
> > VVC_HDR_UHDTV1_OpenGOP_3840x2160_50fps_HLG10_mosaic from the DVB VVC
> > test suite, due to the macro's use at
> > libavcodec/cbs_h266_syntax_template.c:1206
> >
>
> > In light of the lack of performance difference between the two
> > implementations
>
> Thats a bit of a jump from
> "In a quick microbenchmark (clang 16, AArch64), the bithack is 10% faster
>  with -O0 but there is no significant difference with -O1 and up."
> to a general "lack of performance difference"
>
> Also if there is a slowdown such slowdowns accumulate and multiple reports
> suggest that FFmpeg has become slower over the years. We should not be
> sloppy
> with changes that could negatively affect speed. Noone ever said (s)he
> liked
> the slowdown FFmpeg had over the years
>
> Your test covers clang and aarch, your suggested changes covers all
> compilers and all arhcitectures.
> Its alot of work to test the main affected cases, which i agree is annoying
>
> But even if one ignores the question about speed loss. The change is
> actually still introducing other issues
>
> -#define AV_CEIL_RSHIFT(a,b) (!av_builtin_constant_p(b) ? -((-(a)) >> (b))
> \
> -                                                       : ((a) + (1<<(b))
> - 1) >> (b))
> +#define AV_CEIL_RSHIFT(a,b) (((a) + (1<<(b)) - 1) >> (b))
>
> Here the argument was
> -((-(a)) >> (b))
> only works with signed numbers, ok true, even though iam not sure why
> we would need to use this with unsigned numbers.
>
> ((a) + (1<<(b)) - 1) >> (b)
> only works with ints
> it fails with int64_t as (1<<(b)) can overflow
> also it fails with many more cases
> for example, something like this
> AV_CEIL_RSHIFT(0x7FFFFFFF, 30)
> will overflow (a) + (1<<(b)), this will be fine with -((-(a)) >> (b))
>
Overflow may be a separate issue; the caller must ensure there is no
overflow.

>
> and if you just change that to (int64_t)1 then you will have a speed
> loss on 32bit systems.
>
> -((-(a)) >> (b))
> works with a broader range of values
>
> thx
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Awnsering whenever a program halts or runs forever is
> On a turing machine, in general impossible (turings halting problem).
> On any real computer, always possible as a real computer has a finite
> number
> of states N, and will either halt in less than N cycles or never halt.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>