[FFmpeg-devel] [PATCH v2 2/2] swscale/output: Don't call av_pix_fmt_desc_get() in a loop

Andreas Rheinhardt andreas.rheinhardt at outlook.com
Mon Sep 19 17:36:34 EEST 2022


Michael Niedermayer:
> On Fri, Sep 16, 2022 at 04:55:39PM +0200, Andreas Rheinhardt wrote:
>> Up until now, libswscale/output.c used a macro to write
>> an output pixel which involved a call to av_pix_fmt_desc_get()
>> to find out whether the input pixel format is BE or LE
>> despite this being known at compile-time (there are templates
>> per pixfmt). Even worse, these calls are made in a loop,
>> so that e.g. there are eight calls to av_pix_fmt_desc_get()
>> for every pixel processed in yuv2rgba64_X_c_template()
>> for 64bit RGB formats.
>>
>> This commit modifies these macros to ensure that isBE()
>> is evaluated at compile-time. This saved 41184B of .text
>> for me (GCC 11.2, -O3). Of course, it also improved performance.
>> E.g. for ffmpeg_g -f lavfi -i testsrc2,format=yuva420p -pix_fmt rgba64le \
>> -threads 1  -t 1:00  -f null - (which uses yuv2rgba64le_X_c,
>> which is an invocation of yuv2rgba64_X_c_template() mentioned above),
>> performance improved from 95589 to 41387 decicycles for one call
>> to yuv2packedX; for the be variant the numbers went down from
>> 76087 to 43024 decicycles.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at outlook.com>
>> ---
>>  libswscale/output.c | 100 +++++++++++++++++++++++++-------------------
>>  1 file changed, 58 insertions(+), 42 deletions(-)
> 
> This looks a lot better than before
> 
> thx
> 
> PS: I still think that broader support for compile-time evaluation of
> "pure" functions would be useful. Ideally with minimal mess on the source
> side and more on the build tool side.
> 

I agree with that. Hopefully we find a solution.
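
For readers of the archive, the idea behind the patch can be sketched roughly
as follows. This is only an illustration, not the actual code from
libswscale/output.c: the names store_row_template(), store_row_be(),
store_row_le() and the write_u16_*() helpers are made up for this example,
and plain "static inline" stands in for whatever inlining attribute the real
templates use. The point is that the endianness flag reaches the template as
a literal constant from each per-format wrapper, so after inlining the
compiler can fold the branch instead of looking the format up at runtime for
every pixel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for this sketch; not the libswscale output macros. */
static inline void write_u16_be(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);   /* most significant byte first */
    p[1] = (uint8_t)(v & 0xff);
}

static inline void write_u16_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff); /* least significant byte first */
    p[1] = (uint8_t)(v >> 8);
}

/* "Template": is_be is a literal constant at every call site below, so once
 * this function is inlined the compiler can drop the branch it never takes.
 * No pixel format descriptor is looked up inside the loop. */
static inline void store_row_template(uint8_t *dst, const uint16_t *src,
                                      size_t n, int is_be)
{
    for (size_t i = 0; i < n; i++) {
        if (is_be)
            write_u16_be(dst + 2 * i, src[i]);
        else
            write_u16_le(dst + 2 * i, src[i]);
    }
}

/* One thin wrapper per endianness, mirroring the "one template
 * instantiation per pixfmt" arrangement described in the commit message. */
static void store_row_be(uint8_t *dst, const uint16_t *src, size_t n)
{
    store_row_template(dst, src, n, 1);
}

static void store_row_le(uint8_t *dst, const uint16_t *src, size_t n)
{
    store_row_template(dst, src, n, 0);
}
```

Whether the branch is really eliminated depends on the compiler and the
optimization level, so the generated code should be checked before relying
on it.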

- Andreas

