[FFmpeg-devel] [PATCH] swresample/resample: speed up build_filter for Blackman-Nuttall filter

Ganesh Ajjanagadde gajjanag at mit.edu
Fri Nov 6 03:53:48 CET 2015


On Thu, Nov 5, 2015 at 9:29 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Wed, Nov 04, 2015 at 10:08:27PM -0500, Ganesh Ajjanagadde wrote:
>> This uses the trigonometric double and triple angle formulae to avoid
>> repeated (expensive) evaluation of libc's cos().
>>
>> Sample benchmark (x86-64, Haswell, GNU/Linux)
>> old:
>> 1104466600 decicycles in build_filter(loop 1000),     256 runs,      0 skips
>> 1096765286 decicycles in build_filter(loop 1000),     512 runs,      0 skips
>> 1070479590 decicycles in build_filter(loop 1000),    1024 runs,      0 skips
>>
>> new:
>> 588861423 decicycles in build_filter(loop 1000),     256 runs,      0 skips
>> 591262754 decicycles in build_filter(loop 1000),     512 runs,      0 skips
>> 577355145 decicycles in build_filter(loop 1000),    1024 runs,      0 skips
>>
>> This results in small differences with the old expression:
>> difference (worst case on [0, 2*M_PI]), argmax 0.008:
>> max diff (relative): 0.000000000000157289807188
>> blackman_old(0.008): 0.000363951585488813192382
>> blackman_new(0.008): 0.000363951585488755946507
>>
>> These are judged to be insignificant for the performance gain. PSNR to
>> reference file is unchanged up to second decimal point for instance.
>>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> ---
>>  libswresample/resample.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libswresample/resample.c b/libswresample/resample.c
>> index a2cbb48..5d32cc2 100644
>> --- a/libswresample/resample.c
>> +++ b/libswresample/resample.c
>> @@ -171,7 +171,7 @@ static int build_filter(ResampleContext *c, void *filter, double factor, int tap
>>                  break;}
>>              case SWR_FILTER_TYPE_BLACKMAN_NUTTALL:
>>                  w = 2.0*x / (factor*tap_count) + M_PI;
>> -                y *= 0.3635819 - 0.4891775 * cos(w) + 0.1365995 * cos(2*w) - 0.0106411 * cos(3*w);
>> +                y *= 0.3635819 - 0.4891775 * cos(w) + 0.1365995 * (2*cos(w)*cos(w)-1) - 0.0106411 * (4*cos(w)*cos(w)*cos(w) - 3*cos(w));
>
> i would use a temporary variable for cos(w)
> either way LGTM
>
> thx

Ok, changed to temporary and pushed. Thanks for the review.

>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Avoid a single point of failure, be that a person or equipment.
>

