[FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance
Marton Balint
cus at passwd.hu
Fri May 12 01:20:02 EEST 2023
On Wed, 10 May 2023, Lance Wang wrote:
> On Sat, May 6, 2023 at 8:41 PM Devin Heitmueller <
> devin.heitmueller at ltnglobal.com> wrote:
>
>> On Sat, May 6, 2023 at 8:16 AM James Almer <jamrial at gmail.com> wrote:
>> > Can you bench with the START_TIMER and STOP_TIMER macros in timer.h?
>> > Also, define CACHED_BITSTREAM_READER in bitpacked_dec.c before including
>> > get_bits.h and test the actual implementation again, to see if it makes
>> > any difference.
>>
>> Original code:
>> 671661910 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 669736380 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 669370700 decicycles in bitpacked_dec, 1 runs, 0 skips
>>
>> Original code with CACHED_BITSTREAM_READER defined
>> 352599030 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 336163810 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 344628350 decicycles in bitpacked_dec, 1 runs, 0 skips
>>
>> My proposed version:
>> 257353330 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 271527000 decicycles in bitpacked_dec, 1 runs, 0 skips
>> 252701500 decicycles in bitpacked_dec, 1 runs, 0 skips
>>
>>
> Yes, it shows better performance, so LGTM if nobody has plans to optimize
> the bitstream function.
Actually the cached bitstream reader was faster here than the manual
approach:
./ffmpeg -stream_loop 128 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error
Old code:
821050920 decicycles in bitpacked, 1 runs, 0 skips
815402160 decicycles in bitpacked, 2 runs, 0 skips
814108410 decicycles in bitpacked, 4 runs, 0 skips
814213800 decicycles in bitpacked, 8 runs, 0 skips
815048325 decicycles in bitpacked, 16 runs, 0 skips
812866713 decicycles in bitpacked, 32 runs, 0 skips
809186523 decicycles in bitpacked, 64 runs, 0 skips
808317601 decicycles in bitpacked, 128 runs, 0 skips
With the patch:
379879920 decicycles in bitpacked, 1 runs, 0 skips
387491580 decicycles in bitpacked, 2 runs, 0 skips
397720260 decicycles in bitpacked, 4 runs, 0 skips
389581560 decicycles in bitpacked, 8 runs, 0 skips
381820635 decicycles in bitpacked, 16 runs, 0 skips
379791675 decicycles in bitpacked, 32 runs, 0 skips
379246303 decicycles in bitpacked, 64 runs, 0 skips
379221671 decicycles in bitpacked, 128 runs, 0 skips
Old code with #define CACHED_BITSTREAM_READER 1:
345122280 decicycles in bitpacked, 1 runs, 0 skips
343663020 decicycles in bitpacked, 2 runs, 0 skips
343372680 decicycles in bitpacked, 4 runs, 0 skips
342554535 decicycles in bitpacked, 8 runs, 0 skips
340816522 decicycles in bitpacked, 16 runs, 0 skips
340225672 decicycles in bitpacked, 32 runs, 0 skips
340283520 decicycles in bitpacked, 64 runs, 0 skips
339643105 decicycles in bitpacked, 128 runs, 0 skips
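For a rough sense of scale, the 128-run averages above can be turned into speedup ratios. This is only a sketch of the arithmetic: the numbers are copied from the runs above, while the variable and function names are illustrative and not part of FFmpeg.

```python
# 128-run averages copied from the measurements above (decicycles per call).
old_code = 808317601         # "Old code"
with_patch = 379221671       # "With the patch"
old_code_cached = 339643105  # old code with CACHED_BITSTREAM_READER defined


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


# Illustrative names; each ratio compares two of the averages above.
patch_vs_old = speedup(old_code, with_patch)
cached_vs_old = speedup(old_code, old_code_cached)
cached_vs_patch = speedup(with_patch, old_code_cached)

print(f"patch vs old code:         {patch_vs_old:.2f}x")     # 2.13x
print(f"cached reader vs old code: {cached_vs_old:.2f}x")    # 2.38x
print(f"cached reader vs patch:    {cached_vs_patch:.2f}x")  # 1.12x
```

So on this machine both variants roughly halve the cycle count, and the cached reader comes out about 12% ahead of the patch.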
Regards,
Marton