[FFmpeg-devel] [PATCH/RFC] Add some dsputil functions useful for AAC decoder
Måns Rullgård
mans
Sun Sep 20 14:39:40 CEST 2009
Michael Niedermayer <michaelni at gmx.at> writes:
> On Fri, Sep 18, 2009 at 11:11:55PM +0100, Mans Rullgard wrote:
>> This patch adds a few dsputil functions that can be used in the AAC
>> decoder.
>>
>> With trivial NEON versions of these functions, the AAC decoder gets
>> ~1.6x faster on Cortex-A8, and better NEON code will push that even
>> further.
>>
>> I will readily admit that some of the names in this patch are rubbish,
>> so please suggest something better. Other enhancements are obviously
>> welcome too.
> [...]
>
>> diff --git a/libavcodec/dsputil.h b/libavcodec/dsputil.h
>> index d9d7d16..61252f5 100644
>> --- a/libavcodec/dsputil.h
>> +++ b/libavcodec/dsputil.h
>> @@ -397,6 +397,14 @@ typedef struct DSPContext {
>> /* assume len is a multiple of 8, and arrays are 16-byte aligned */
>> void (*int32_to_float_fmul_scalar)(float *dst, const int *src, float mul, int len);
>> void (*vector_clipf)(float *dst /* align 16 */, const float *src /* align 16 */, float min, float max, int len /* align 16 */);
>> + void (*vector_fmul_scalar)(float *dst, const float *src, float mul,
>> + int len);
>> + void (*vector_fmul_scalar_vp[2])(float *dst, const float *src,
>> + const float **vp, float mul, int len);
>> + void (*vp_fmul_scalar[2])(float *dst, const float **vp,
>> + float mul, int len);
>> + float (*scalarproduct_float)(const float *v1, const float *v2, int len);
>> + void (*butterflies_float)(float *v1, float *v2, int len);
>
> missing doxy
I don't want to waste time on documentation before the general idea is
approved.
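
For reference, plain C fallbacks for three of the new pointers could look
roughly like the sketch below. The semantics are assumptions inferred from the
names and signatures in the quoted hunk (element-wise scale, dot product,
in-place sum/difference). The `_c` suffixed names are hypothetical here, and the
`_vp` variants are left out because their behaviour cannot be inferred from the
prototypes alone.

```c
#include <stddef.h>

/* Assumed semantics: dst[i] = src[i] * mul for i in [0, len). */
static void vector_fmul_scalar_c(float *dst, const float *src, float mul,
                                 int len)
{
    int i;
    for (i = 0; i < len; i++)
        dst[i] = src[i] * mul;
}

/* Assumed semantics: returns the dot product of v1 and v2 over len elements. */
static float scalarproduct_float_c(const float *v1, const float *v2, int len)
{
    float p = 0.0f;
    int i;
    for (i = 0; i < len; i++)
        p += v1[i] * v2[i];
    return p;
}

/* Assumed semantics: in place, v1[i] becomes v1[i] + v2[i] and
 * v2[i] becomes the old v1[i] minus v2[i]. */
static void butterflies_float_c(float *v1, float *v2, int len)
{
    int i;
    for (i = 0; i < len; i++) {
        float t = v1[i] - v2[i];
        v1[i] += v2[i];
        v2[i]  = t;
    }
}
```

Each loop is a natural candidate for a SIMD replacement, since every iteration
is independent apart from the accumulator in the dot product.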
> also, without seeing how these all are used i do have the feeling that
> they maybe are too small primitives and that bigger chunks of aac code
> should be optimized to increase flexibility and reduce call overhead ...
See attached patch.
> and i would suggest to only optimize code when it matters speedwise and
> not when the code just makes up <1% of the cpu time, alex reply made
> me think that this may apply to some code in there ...
A 1.6x speedup matters to me.
--
Måns Rullgård
mans at mansr.com