[FFmpeg-devel] [PATCH] aacenc_utils: unroll loops to allow compiler to use SIMD.
James Almer
jamrial at gmail.com
Sun Mar 6 19:49:00 CET 2016
On 3/6/2016 3:35 PM, Reimar Döffinger wrote:
> Approximately 10% faster transcode from mp3 to aac
> with default settings.
>
> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> ---
> libavcodec/aacenc_utils.h | 47 ++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 38 insertions(+), 9 deletions(-)
>
> diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
> index b9bd6bf..1639021 100644
> --- a/libavcodec/aacenc_utils.h
> +++ b/libavcodec/aacenc_utils.h
> @@ -36,15 +36,29 @@
> #define ROUND_TO_ZERO 0.1054f
> #define C_QUANT 0.4054f
>
> +#define ABSPOW(inv, outv) \
> +do { \
> + float a = (inv); \
> + a = fabsf(a); \
> + (outv) = sqrtf(a * sqrtf(a)); \
> +} while(0)
> +
> static inline void abs_pow34_v(float *out, const float *in, const int size)
> {
> int i;
> - for (i = 0; i < size; i++) {
> - float a = fabsf(in[i]);
> - out[i] = sqrtf(a * sqrtf(a));
> + for (i = 0; i < size - 3; i += 4) {
> + ABSPOW(in[i], out[i]);
> + ABSPOW(in[i+1], out[i+1]);
> + ABSPOW(in[i+2], out[i+2]);
> + ABSPOW(in[i+3], out[i+3]);
> + }
Are you sure this wasn't vectorized already? I remember I checked and it mostly
was, at least on gcc 5.3 mingw-w64 with default settings.