[FFmpeg-devel] [PATCH] lavu/common: Fix AV_CEIL_RSHIFT for unsigned LHS
Michael Niedermayer
michael at niedermayer.cc
Sun Oct 20 21:30:27 EEST 2024
On Fri, Oct 18, 2024 at 06:48:40PM +0100, Frank Plowman wrote:
> On 15/10/2024 21:23, Frank Plowman wrote:
> > On 14/10/2024 23:26, Michael Niedermayer wrote:
> >> On Sat, Oct 05, 2024 at 03:38:05PM -0700, Frank Plowman wrote:
> >>> The first branch of this ternary expression was intended to avoid
> >>> having two shift operations in the case the RHS is not known at
> >>> compile time. It only works if the LHS has a signed type however,
> >>> otherwise the result is invalid.
> >>
> >> If the expression is faster, why not check if its signed ?
> >>
> >> if its not faster, then the argument could be that its not
> >> faster and removes complexity
> >>
> >> thx
> >>
> >
> > In a quick microbenchmark (clang 16, AArch64), the bithack is 10% faster
> > with -O0 but there is no significant difference with -O1 and up.
>
> It has been drawn to my attention that this is causing a failure for
> VVC_HDR_UHDTV1_OpenGOP_3840x2160_50fps_HLG10_mosaic from the DVB VVC
> test suite, due to the macro's use at
> libavcodec/cbs_h266_syntax_template.c:1206
>
> In light of the lack of performance difference between the two
> implementations
Thats a bit of a jump from
"In a quick microbenchmark (clang 16, AArch64), the bithack is 10% faster
with -O0 but there is no significant difference with -O1 and up."
to a general "lack of performance difference"
Also if there is a slowdown such slowdowns accumulate and multiple reports
suggest that FFmpeg has become slower over the years. We should not be sloppy
with changes that could negatively affect speed. Noone ever said (s)he liked
the slowdown FFmpeg had over the years
Your test covers clang and aarch, your suggested changes covers all
compilers and all arhcitectures.
Its alot of work to test the main affected cases, which i agree is annoying
But even if one ignores the question about speed loss. The change is
actually still introducing other issues
-#define AV_CEIL_RSHIFT(a,b) (!av_builtin_constant_p(b) ? -((-(a)) >> (b)) \
- : ((a) + (1<<(b)) - 1) >> (b))
+#define AV_CEIL_RSHIFT(a,b) (((a) + (1<<(b)) - 1) >> (b))
Here the argument was
-((-(a)) >> (b))
only works with signed numbers, ok true, even though iam not sure why
we would need to use this with unsigned numbers.
((a) + (1<<(b)) - 1) >> (b)
only works with ints
it fails with int64_t as (1<<(b)) can overflow
also it fails with many more cases
for example, something like this
AV_CEIL_RSHIFT(0x7FFFFFFF, 30)
will overflow (a) + (1<<(b)), this will be fine with -((-(a)) >> (b))
and if you just change that to (int64_t)1 then you will have a speed
loss on 32bit systems.
-((-(a)) >> (b))
works with a broader range of values
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20241020/47cad38d/attachment.sig>
More information about the ffmpeg-devel
mailing list