[FFmpeg-devel] [PATCH] Optimization of original IFF codec
Sebastian Vater
cdgs.basty
Mon Apr 26 20:08:59 CEST 2010
Hi Mans!
Måns Rullgård wrote:
> Sebastian Vater <cdgs.basty at googlemail.com> writes:
>
>
>> Hi Mans!
>>
>> Måns Rullgård wrote:
>>
>>> This is inefficient. You are building the table afresh on each call
>>> to the function. Make the table static const, dropping the shift, and
>>> instead shift the table value inside the loop.
>>>
>>>
>> I just benchmarked both; my solution is considerably faster:
>>
>
> I don't believe that, simply because it has more work to do. How did
> you benchmark it?
>
Why? The table init is done only once per call, whereas moving the
bit shift into the inner loop means shifting on every iteration.
I benchmarked by putting START_TIMER _before_ the LUT init and
STOP_TIMER at the very end of decodeplane, so the init cost is included:
GetBitContext gb;
unsigned i;
+ START_TIMER;
const unsigned b = (buf_size * 8) + bps - 1;
const unsigned b32 = b & ~3;
const uint32_t lut[] = {0x0000000,
[...]
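For readers following the argument, the two variants under discussion can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the patch: the table contents, the nibble-at-a-time layout, and the names base_lut, decode_local_lut, decode_static_lut and variants_agree are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical base table: bit k of the 4-bit index n is placed in
 * byte (3 - k) of the 32-bit word, one output byte per input bit. */
static const uint32_t base_lut[16] = {
    0x00000000, 0x01000000, 0x00010000, 0x01010000,
    0x00000100, 0x01000100, 0x00010100, 0x01010100,
    0x00000001, 0x01000001, 0x00010001, 0x01010001,
    0x00000101, 0x01000101, 0x00010101, 0x01010101,
};

/* Variant A: build a plane-shifted copy of the table once per call,
 * then do plain lookups in the inner loop. */
static void decode_local_lut(uint32_t *dst, const uint8_t *buf,
                             size_t n, unsigned plane)
{
    uint32_t lut[16];
    assert(plane < 8);
    for (unsigned i = 0; i < 16; i++)
        lut[i] = base_lut[i] << plane;
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     |= lut[buf[i] >> 4];
        dst[2 * i + 1] |= lut[buf[i] & 15];
    }
}

/* Variant B: read the static const table directly and shift the
 * looked-up value inside the inner loop. */
static void decode_static_lut(uint32_t *dst, const uint8_t *buf,
                              size_t n, unsigned plane)
{
    assert(plane < 8);
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     |= base_lut[buf[i] >> 4] << plane;
        dst[2 * i + 1] |= base_lut[buf[i] & 15] << plane;
    }
}

/* Returns 1 if both variants produce identical output for planes 0..7. */
static int variants_agree(void)
{
    const uint8_t buf[4] = { 0x00, 0x5a, 0xc3, 0xff };
    for (unsigned plane = 0; plane < 8; plane++) {
        uint32_t a[8] = { 0 }, b[8] = { 0 };
        decode_local_lut(a, buf, sizeof(buf), plane);
        decode_static_lut(b, buf, sizeof(buf), plane);
        if (memcmp(a, b, sizeof(a)) != 0)
            return 0;
    }
    return 1;
}
```

Both variants compute the same result; the disagreement is only about where the shift cost lands. Variant A pays for 16 shifts and stores per call, Variant B pays for one shift per lookup. Which one wins depends on the row length and on the compiler, so it has to be measured.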
BTW, the same for decodeplane32, with my patch:
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-fb63232, Copyright (c) 2003-2010 the FFmpeg developers
built on Apr 26 2010 19:54:23 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
configuration:
libavutil 50.14. 0 / 50.14. 0
libavcodec 52.66. 0 / 52.66. 0
libavformat 52.61. 0 / 52.61. 0
libavdevice 52. 2. 0 / 52. 2. 0
libswscale 0.10. 0 / 0.10. 0
[IFF @ 0x8b32790]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
Duration: N/A, bitrate: N/A
Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR
333:268, 90k tbr, 90k tbn, 90k tbc
37660 dezicycles in decodeplane32, 1 runs, 0 skips
29235 dezicycles in decodeplane32, 2 runs, 0 skips
24687 dezicycles in decodeplane32, 4 runs, 0 skips
22337 dezicycles in decodeplane32, 8 runs, 0 skips
21055 dezicycles in decodeplane32, 16 runs, 0 skips
20382 dezicycles in decodeplane32, 32 runs, 0 skips
20107 dezicycles in decodeplane32, 64 runs, 0 skips
19890 dezicycles in decodeplane32, 128 runs, 0 skips
20120 dezicycles in decodeplane32, 256 runs, 0 skips
20015 dezicycles in decodeplane32, 512 runs, 0 skips
19850 dezicycles in decodeplane32, 1024 runs, 0 skips
19774 dezicycles in decodeplane32, 2048 runs, 0 skips
19780 dezicycles in decodeplane32, 4094 runs, 2 skips
19751 dezicycles in decodeplane32, 8187 runs, 5 skips
1.93 A-V: 0.000 s:0.0 aq= 0KB vq= 0KB sq= 0B f=0/0 0/0
decodeplane32 with all four LUTs made static and the shift done in the
inner loop:
basty at cdgs-basty:~/src/ffmpeg/build$ ./ffplay ../patches/Ooze.iff
FFplay version git-fb63232, Copyright (c) 2003-2010 the FFmpeg developers
built on Apr 26 2010 19:54:23 with gcc 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
configuration:
libavutil 50.14. 0 / 50.14. 0
libavcodec 52.66. 0 / 52.66. 0
libavformat 52.61. 0 / 52.61. 0
libavdevice 52. 2. 0 / 52. 2. 0
libswscale 0.10. 0 / 0.10. 0
[IFF @ 0x8b32790]Estimating duration from bitrate, this may be inaccurate
Input #0, IFF, from '../patches/Ooze.iff':
Duration: N/A, bitrate: N/A
Stream #0.0: Video: iff_byterun1, rgba, 666x536, PAR 1:1 DAR
333:268, 90k tbr, 90k tbn, 90k tbc
42790 dezicycles in decodeplane32, 1 runs, 0 skips
35080 dezicycles in decodeplane32, 2 runs, 0 skips
30992 dezicycles in decodeplane32, 4 runs, 0 skips
28612 dezicycles in decodeplane32, 8 runs, 0 skips
27280 dezicycles in decodeplane32, 16 runs, 0 skips
26460 dezicycles in decodeplane32, 32 runs, 0 skips
26191 dezicycles in decodeplane32, 64 runs, 0 skips
25971 dezicycles in decodeplane32, 128 runs, 0 skips
25890 dezicycles in decodeplane32, 256 runs, 0 skips
25989 dezicycles in decodeplane32, 512 runs, 0 skips
25888 dezicycles in decodeplane32, 1024 runs, 0 skips
25839 dezicycles in decodeplane32, 2048 runs, 0 skips
25813 dezicycles in decodeplane32, 4093 runs, 3 skips
25816 dezicycles in decodeplane32, 8182 runs, 10 skips
1.64 A-V: 0.000 s:0.0 aq= 0KB vq= 0KB sq= 0B f=0/0 0/0
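To put the two runs side by side, the last reading of each log (19751 versus 25816 dezicycles) can be compared with a small helper. This is only a sketch; percent_slower is a hypothetical name, and the inputs are simply the final figures copied from the two logs above.

```c
#include <assert.h>

/* Extra cost of `slow` relative to `fast`, in percent.
 * Both arguments are average timings in the same unit. */
static double percent_slower(double fast, double slow)
{
    assert(fast > 0.0);
    return (slow - fast) / fast * 100.0;
}
```

Calling percent_slower(19751.0, 25816.0) gives about 30.7, i.e. the static-table variant with the shift in the inner loop takes roughly 30.7% longer per call in this measurement.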
--
Best regards,
:-) Basty/CDGS (-:
Why am I spiritual? Quite simply, because I would rather search for
the formula for world peace than for the theory of everything.