[FFmpeg-devel] [PATCH] VP8 coeff decoding optimizations
Jason Garrett-Glaser
darkshikari at gmail.com
Mon Aug 2 20:34:26 CEST 2010
On Mon, Aug 2, 2010 at 5:38 AM, Pascal Massimino
<pascal.massimino at gmail.com> wrote:
> Jason,
>
> On Mon, Aug 2, 2010 at 1:32 AM, Jason Garrett-Glaser
> <darkshikari at gmail.com> wrote:
>
>> Attached are two mutually exclusive VP8 optimization patches.
>>
>> Approach in #1 (test.diff): simplify addressing by eliminating
>> vp8_coeff_band.
>> Advantage: one less dereference, seems to be slightly faster, but
>> might depend on the mood of gcc.
>>
>
> +1 here. Seems to be a tad faster than test3.diff (gcc 4.2.4 x86-64):
>
> current (timing decode_mb_coeffs()):
> 47533 dezicycles in dec, 131005 runs, 67 skips
> 47594 dezicycles in dec, 130977 runs, 95 skips
> 47681 dezicycles in dec, 131003 runs, 69 skips
> 47503 dezicycles in dec, 130997 runs, 75 skips
>
> test.diff
> 46065 dezicycles in dec, 131004 runs, 68 skips
> 46009 dezicycles in dec, 130996 runs, 76 skips
> 46119 dezicycles in dec, 131035 runs, 37 skips
> 46226 dezicycles in dec, 131000 runs, 72 skips
>
> test3.diff:
> 46255 dezicycles in dec, 131003 runs, 69 skips
> 46156 dezicycles in dec, 131009 runs, 63 skips
> 46263 dezicycles in dec, 131017 runs, 55 skips
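To make the quoted runs easier to compare, here is a rough sketch that averages the dezicycle counts and expresses each patch as a percentage reduction against the current code. The array and function names (current_runs, mean, reduction_pct, ...) are made up for illustration; only the numbers come from the results above.

```c
#include <assert.h>
#include <stddef.h>

/* Dezicycle counts copied from the quoted benchmark runs. */
static const int current_runs[]    = { 47533, 47594, 47681, 47503 };
static const int test_diff_runs[]  = { 46065, 46009, 46119, 46226 };
static const int test3_diff_runs[] = { 46255, 46156, 46263 };

/* Arithmetic mean of n samples (hypothetical helper, not FFmpeg code). */
static double mean(const int *v, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Percentage by which `patched` is lower than `base`. */
static double reduction_pct(double base, double patched)
{
    return 100.0 * (base - patched) / base;
}
```

With these inputs, mean(current_runs, 4) is 47577.75, mean(test_diff_runs, 4) is 46104.75 and mean(test3_diff_runs, 3) is about 46224.67, so test.diff comes out roughly 3.1% below current and test3.diff roughly 2.8%. The gap between the two patches is small compared with the run-to-run spread, so it should be read with some caution.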
Anyone want to bench on another arch (ARM)?
Dark Shikari
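
For readers without the attachment: the following is NOT the code from test.diff, just a minimal sketch of one way a band lookup can be removed from the inner loop. It assumes the token probabilities are stored per band and expands them once into a table indexed directly by coefficient position. All names here (coeff_band, expand_probs, prob_via_band, prob_direct) and the simplified table shape are assumptions for illustration; the position-to-band mapping is the one from the VP8 bitstream spec.

```c
#include <stdint.h>
#include <string.h>

enum { NUM_BANDS = 8, NUM_POS = 16, NUM_CTX = 3, NUM_PROBS = 11 };

/* Coefficient position -> band, as defined by the VP8 bitstream spec. */
static const uint8_t coeff_band[NUM_POS] = {
    0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7
};

/* Band-indexed lookup: two dependent loads per coefficient
 * (coeff_band[pos] first, then the probability row). */
static const uint8_t *prob_via_band(uint8_t band_probs[NUM_BANDS][NUM_CTX][NUM_PROBS],
                                    int pos, int ctx)
{
    return band_probs[coeff_band[pos]][ctx];
}

/* Done once whenever the probabilities change (assumed to be rare
 * compared with coefficient decoding): copy each band's rows to
 * every position that maps to that band. */
static void expand_probs(uint8_t pos_probs[NUM_POS][NUM_CTX][NUM_PROBS],
                         uint8_t band_probs[NUM_BANDS][NUM_CTX][NUM_PROBS])
{
    for (int pos = 0; pos < NUM_POS; pos++)
        memcpy(pos_probs[pos], band_probs[coeff_band[pos]],
               sizeof(pos_probs[pos]));
}

/* Position-indexed lookup: the band table is no longer touched
 * in the decoding loop. */
static const uint8_t *prob_direct(uint8_t pos_probs[NUM_POS][NUM_CTX][NUM_PROBS],
                                  int pos, int ctx)
{
    return pos_probs[pos][ctx];
}
```

The trade-off in this sketch is a table twice as large (16 positions instead of 8 bands) in exchange for one less dependent load per coefficient, which may be part of why the result depends on compiler and architecture.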