[FFmpeg-devel] [PATCH 4/7] checkasm: use pointers for start/stop functions

Lynne dev at lynne.ee
Sat Jul 15 20:43:26 EEST 2023


Jul 15, 2023, 10:26 by remi at remlab.net:

> On Saturday, 15 July 2023, 11.05.51 EEST Lynne wrote:
>
>> Jul 14, 2023, 20:29 by remi at remlab.net:
>> > This routes all calls to the bench start and stop functions through
>> > function pointers. While the primary goal is to support run-time
>> > selection of the performance measurement back-end in later commits,
>> > this has the side benefit of containing platform dependencies within
>> > checkasm.c and keeping them out of checkasm.h.
>> > ---
>> > 
>> >  tests/checkasm/checkasm.c | 33 ++++++++++++++++++++++++++++-----
>> >  tests/checkasm/checkasm.h | 31 ++++---------------------------
>> >  2 files changed, 32 insertions(+), 32 deletions(-)
>>
>> Not sure I agree with this commit: the overhead can be detectable,
>> and we have a lot of small functions whose runtime is only a few
>> times that of a null function call.
>>
>
> I don't think the function call is ever null. The pointers are left NULL only
> if none of the backends initialise. But then, checkasm will bail out and exit
> before we try to benchmark anything anyway.
>
> As for the real functions, they always do *something*. None of them "just 
> return 0".
>

I meant a no-op function call to measure the overhead of function
calls themselves, complete with all the ABI stuff.
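
To illustrate what measuring a no-op call means here, the following is a
minimal sketch, not code from checkasm. Every name in it (sketch_now_ns,
sketch_nop, sketch_nop_ptr, sketch_nop_call_ns) is hypothetical, and
clock_gettime() merely stands in for whichever timer the real back-end
would use. It times an empty function called through a pointer, so the
result covers the call, the return and the loop around them.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock in nanoseconds (stand-in timer). */
static uint64_t sketch_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* The no-op itself: no work, but the full call/return sequence is paid. */
static void sketch_nop(void)
{
}

/* volatile pointer, so the compiler must emit a real indirect call
 * on every iteration instead of inlining or dropping it. */
static void (*volatile sketch_nop_ptr)(void) = sketch_nop;

/* Average cost in nanoseconds of one indirect no-op call.
 * The figure includes the loop overhead; returns 0 for 0 iterations. */
static double sketch_nop_call_ns(unsigned iterations)
{
    uint64_t start, stop;
    unsigned i;

    if (!iterations)
        return 0.0;

    start = sketch_now_ns();
    for (i = 0; i < iterations; i++)
        sketch_nop_ptr();
    stop = sketch_now_ns();

    return (double)(stop - start) / iterations;
}
```

Calling sketch_nop_call_ns() with a large iteration count gives a rough
per-call figure to compare against the runtime of a small function. The
number depends on the machine and the compiler, so none is quoted here.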



>> Can you store the function pointers out of the loop to reduce
>> the derefs needed?
>>
>
> Taking just the two loads out of the loop should be feasible, but it seems
> rather futile. You will still have the overhead of the indirect function call,
> the function, and most importantly in the case of Linux perf and macOS kperf,
> the system calls.
>
> The only ways to avoid the indirect function calls are to use IFUNC (tricky and
> not portable), or to make horrible macros to spawn one bench loop for each
> backend.
>
> In the end, I think we should rather aim for as constant time as possible, 
> rather than as fast as possible, so that the nop loop can estimate the 
> benchmarking overhead as well as possible. In this respect, I think it is 
> actually marginally better *not* to cache the function pointers in local 
> variables, which could end up spilled on the stack, or not, depending on local 
> compiler optimisations for any given test case.
>

I disagree: uninlining the timer fetches adds another source of
inconsistency. It may be messy, but I think accuracy here is more
important than cleanliness, especially as it's a development tool.
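
For reference, hoisting the two pointer loads out of the bench loop, as
asked earlier in the thread, could look roughly like the sketch below.
This is not the patch's code: sketch_bench_start, sketch_bench_stop and
sketch_bench are hypothetical names, and the fake back-end simply returns
an incrementing counter so that the example is deterministic. The
indirect calls themselves stay; only the loads of the global pointers
move out of the loop.

```c
#include <stdint.h>

/* Hypothetical fake back-end: each timer read returns the next tick. */
static uint64_t sketch_counter;

static uint64_t sketch_fake_start(void)
{
    return sketch_counter++;
}

static uint64_t sketch_fake_stop(void)
{
    return sketch_counter++;
}

/* Global hooks, selected once at start-up in a real tool (assumption). */
static uint64_t (*sketch_bench_start)(void) = sketch_fake_start;
static uint64_t (*sketch_bench_stop)(void)  = sketch_fake_stop;

/* Counts invocations of the function under test. */
static unsigned sketch_calls;

static void sketch_work(void)
{
    sketch_calls++;
}

/* Run func 'runs' times and return the summed stop - start deltas. */
static uint64_t sketch_bench(void (*func)(void), unsigned runs)
{
    /* Load both global pointers once, before the loop. */
    uint64_t (*const start)(void) = sketch_bench_start;
    uint64_t (*const stop)(void)  = sketch_bench_stop;
    uint64_t total = 0;
    unsigned i;

    for (i = 0; i < runs; i++) {
        uint64_t t0 = start(); /* still an indirect call */

        func();
        total += stop() - t0;  /* still an indirect call */
    }
    return total;
}
```

With the fake counter each iteration contributes exactly one tick, so the
total equals the number of runs. Whether the two locals stay in registers
or are spilled to the stack is up to the compiler, which is the
variability raised in the quoted reply.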

