[FFmpeg-devel] [PATCH] swresample/resample: speed up build_filter
Ganesh Ajjanagadde
gajjanag at mit.edu
Wed Nov 4 23:09:10 CET 2015
On Wed, Nov 4, 2015 at 10:07 AM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
> On Wed, Nov 4, 2015 at 9:44 AM, Michael Niedermayer
> <michael at niedermayer.cc> wrote:
>> On Wed, Nov 04, 2015 at 01:11:45AM -0500, Ganesh Ajjanagadde wrote:
>>> This speeds up build_filter by ~ 50%. This gain should be pretty
>>> consistent across all architectures and platforms.
>>>
>>> Essentially, this relies on a observation that the filters have some
>>> even/odd symmetry that may be exploited during the construction of the
>>> polyphase filter bank. In particular, phases (scaled to [0, 1]) in [0.5, 1] are
>>> easily derived from [0, 0.5] and expensive reevaluation of function
>>> points are unnecessary. This requires some rather annoying even/odd
>>> bookkeeping as can be seen from the patch.
>>>
>>> I vaguely recall from signal processing theory more general symmetries allowing even greater
>>> optimization of the construction. At a high level, "even functions"
>>> correspond to 2, and one can imagine variations. Nevertheless, for the sake
>>> of some generality and because of existing filters, this is all that is
>>> being exploited.
>>>
>>> Currently, this patch relies on phase_count being even or (trivially) 1,
>>> though this is not an inherent limitation to the approach. This
>>> assumption is safe as phase_count is 1 << phase_bits, and is hence a
>>> power of two. There is no way for user API to set it to a nontrivial odd
>>> number. This assumption has been placed as an assert in the code.
>>>
>>> To repeat, this assumes even symmetry of the filters, which is the most common
>>> way to get generalized linear phase anyway and is true of all currently
>>> supported filters.
>>>
>>> As a side note, accuracy should be identical or perhaps slightly better
>>> due to this "forcing" filter symmetries leading to a better phase
>>> characteristic. As before, I can't test this claim easily, though it may
>>> be of interest.
>>>
>>> Patch tested with FATE.
>>>
>>> Sample benchmark (x86-64, Haswell, GNU/Linux):
>>>
>>> test: swr-resample-dblp-44100-2626
>>>
>>> new:
>>> 527376779 decicycles in build_filter(loop 1000), 256 runs, 0 skips
>>> 524361765 decicycles in build_filter(loop 1000), 512 runs, 0 skips
>>> 516552574 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
>>>
>>> old:
>>> 974178658 decicycles in build_filter(loop 1000), 256 runs, 0 skips
>>> 972794408 decicycles in build_filter(loop 1000), 512 runs, 0 skips
>>> 954350046 decicycles in build_filter(loop 1000), 1024 runs, 0 skips
>>>
>>> Note that lower level optimizations are entirely possible, I focussed on
>>> getting the high level semantics correct. In any case, this should
>>> provide a good foundation.
>>>
>>> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>>> ---
>>> libswresample/resample.c | 56 +++++++++++++++++++++++++++++++++++++++++-------
>>> 1 file changed, 48 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/libswresample/resample.c b/libswresample/resample.c
>>> index 235c9a9..d5fc5e7 100644
>>> --- a/libswresample/resample.c
>>> +++ b/libswresample/resample.c
>>> @@ -144,7 +144,7 @@ static int build_filter(ResampleContext *c, void *filter, double factor, int tap
>>> int filter_type, int kaiser_beta){
>>> int ph, i;
>>> double x, y, w;
>>> - double *tab = av_malloc_array(tap_count, sizeof(*tab));
>>> + double *tab = av_malloc_array(tap_count+1, sizeof(*tab));
>>> const int center= (tap_count-1)/2;
>>>
>>> if (!tab)
>>> @@ -154,9 +154,10 @@ static int build_filter(ResampleContext *c, void *filter, double factor, int tap
>>> if (factor > 1.0)
>>> factor = 1.0;
>>>
>>> - for(ph=0;ph<phase_count;ph++) {
>>> + av_assert0(phase_count == 1 || phase_count % 2 == 0);
>>> + for(ph = 0; ph <= phase_count / 2; ph++) {
>>> double norm = 0;
>>> - for(i=0;i<tap_count;i++) {
>>> + for(i=0;i<=tap_count;i++) {
>>> x = M_PI * ((double)(i - center) - (double)ph / phase_count) * factor;
>>> if (x == 0) y = 1.0;
>>> else y = sin(x) / x;
>>> @@ -180,26 +181,65 @@ static int build_filter(ResampleContext *c, void *filter, double factor, int tap
>>> }
>>>
>>> tab[i] = y;
>>> - norm += y;
>>> + if (i < tap_count)
>>> + norm += y;
>>> }
>>>
>>> /* normalize so that an uniform color remains the same */
>>> switch(c->format){
>>> case AV_SAMPLE_FMT_S16P:
>>> - for(i=0;i<tap_count;i++)
>>> + for(i=0;i<tap_count;i++) {
>>> ((int16_t*)filter)[ph * alloc + i] = av_clip(lrintf(tab[i] * scale / norm), INT16_MIN, INT16_MAX);
>>> + }
>>> + if (tap_count % 2 == 0) {
>>> + for (i = 0; i < tap_count; i++)
>>> + ((int16_t*)filter)[(phase_count-ph) * alloc + tap_count-1-i] = ((int16_t*)filter)[ph * alloc + i];
>>> + }
>>> + else {
>>> + for (i = 1; i <= tap_count; i++)
>>> + ((int16_t*)filter)[(phase_count-ph) * alloc + tap_count-i] =
>>> + av_clip(lrintf(tab[i] * scale / (norm - tab[0] + tab[tap_count])), INT16_MIN, INT16_MAX);
>>> + }
>>> break;
>>> case AV_SAMPLE_FMT_S32P:
>>
>>> - for(i=0;i<tap_count;i++)
>>> + for(i=0;i<tap_count;i++) {
>>> ((int32_t*)filter)[ph * alloc + i] = av_clipl_int32(llrint(tab[i] * scale / norm));
>>> + }
>>
>> this and similar changes look uneeded
>>
>> otherwise LGTM
>
> Artifacts left over from testing, sorry. Will push later to make sure
> there are no further comments on this. I am particularly interested in
> comments from API users: does this speed matter to them? Clement
> seemed to care about it, but it seems to me quite rare to call this
> multiple times on a stream.
Put the "by 50%" into the top line of the commit message as well for
easy reference. Fixed extraneous braces, and pushed. Thanks.
>
>>
>> [...]
>>
>> --
>> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>>
>> Freedom in capitalist society always remains about the same as it was in
>> ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
More information about the ffmpeg-devel
mailing list