[FFmpeg-devel] [PATCH] lavc/aacenc_utils: unroll abs_pow34_v loop
Ganesh Ajjanagadde
gajjanag at gmail.com
Sat Mar 19 06:12:14 CET 2016
In all current callers, size appears to be a multiple of 4. This is
documented with an assert (av_assert2).
Yields a speedup of roughly 20% in this function (about 1385 -> 1107
decicycles), and a small speedup (about 1%) for AAC encoding overall.
Sample benchmark (Haswell, -march=native + GCC):
old:
[...]
1390 decicycles in abs_pow34_v, 127138 runs, 3934 skips63.1x
1385 decicycles in abs_pow34_v, 254191 runs, 7953 skips64.4x
1383 decicycles in abs_pow34_v, 508305 runs, 15983 skips65.3x
new:
[...]
1109 decicycles in abs_pow34_v, 127122 runs, 3950 skips61.2x
1107 decicycles in abs_pow34_v, 254177 runs, 7967 skips63.5x
1106 decicycles in abs_pow34_v, 508292 runs, 15996 skips65.3x
old:
ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac 4.55s user 0.03s system 99% cpu 4.581 total
new:
ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac 4.50s user 0.04s system 99% cpu 4.537 total
Signed-off-by: Ganesh Ajjanagadde <gajjanag at gmail.com>
---
libavcodec/aacenc_utils.h | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
index 0203b6e..800b78f 100644
--- a/libavcodec/aacenc_utils.h
+++ b/libavcodec/aacenc_utils.h
@@ -37,20 +37,26 @@
 #define ROUND_TO_ZERO 0.1054f
 #define C_QUANT 0.4054f
 
-static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
-{
-    int i;
-    for (i = 0; i < size; i++) {
-        float a = fabsf(in[i]);
-        out[i] = sqrtf(a * sqrtf(a));
-    }
-}
-
 static inline float pos_pow34(float a)
 {
     return sqrtf(a * sqrtf(a));
 }
 
+static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
+{
+    av_assert2(!(size % 4));
+    for (int i = 0; i < size; i+=4) {
+        float a0 = fabsf(in[i]);
+        float a1 = fabsf(in[i+1]);
+        float a2 = fabsf(in[i+2]);
+        float a3 = fabsf(in[i+3]);
+        out[i  ] = pos_pow34(a0);
+        out[i+1] = pos_pow34(a1);
+        out[i+2] = pos_pow34(a2);
+        out[i+3] = pos_pow34(a3);
+    }
+}
+
 /**
  * Quantize one coefficient.
  * @return absolute value of the quantized coefficient
--
2.7.3