[FFmpeg-devel] [PATCH] lavc/aacenc_utils: unroll quantize_bands loop

Tue Mar 22 20:09:10 CET 2016

On 22 March 2016 at 17:33, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:

> On Sat, Mar 19, 2016 at 2:36 AM, Hendrik Leppkes <h.leppkes at gmail.com>
> wrote:
> > On Sat, Mar 19, 2016 at 3:27 AM, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
> >> Yields speedup in quantize_bands, and non-negligible speedup in aac encoding overall.
> >>
> >> Sample benchmark (Haswell, -march=native + GCC):
> >> new:
> >>     [...]
> >>     553 decicycles in quantize_bands, 2097136 runs,     16 skips
> >>     554 decicycles in quantize_bands, 4194266 runs,     38 skips
> >>     559 decicycles in quantize_bands, 8388534 runs,     74 skips
> >>
> >> old:
> >>     [...]
> >>     711 decicycles in quantize_bands, 2097140 runs,     12 skips
> >>     713 decicycles in quantize_bands, 4194277 runs,     27 skips
> >>     715 decicycles in quantize_bands, 8388538 runs,     70 skips
> >>
> >> old:
> >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac  4.58s user 0.01s system 99% cpu 4.590 total
> >>
> >> new:
> >> ffmpeg -f lavfi -i anoisesrc -t 300 -y sin_new.aac  4.54s user 0.02s system 99% cpu 4.566 total
> >>
> >> Signed-off-by: Ganesh Ajjanagadde <gajjanag at gmail.com>
> >> ---
> >>  libavcodec/aacenc_utils.h | 33 +++++++++++++++++++++++++--------
> >>  1 file changed, 25 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
> >> index 38636e5..0203b6e 100644
> >> --- a/libavcodec/aacenc_utils.h
> >> +++ b/libavcodec/aacenc_utils.h
> >> @@ -62,18 +62,35 @@ static inline int quant(float coef, const float Q, const float rounding)
> >>      return sqrtf(a * sqrtf(a)) + rounding;
> >>  }
> >>
> >> +static inline float minf(float x, float y) {
> >> +    return x < y ? x : y;
> >> +}
> >> +
> >
> > That's exactly what the FFMIN macro expands to; what's the reason for
> > introducing this function?
>
> There was some compilation difference, in particular this was faster.
> No idea why, maybe some repeated evaluation of qc + rounding?
>
>
"No idea why" is not even remotely a valid excuse to have your own function
which does exactly what FFMIN does.
Also, the opening brace should be on its own line.
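
For reference, here is a minimal sketch of the one place where a function-like
macro and an inline function can differ: how often the argument expression is
evaluated. It is not taken from the patch. FFMIN is redefined locally as a
stand-in that mirrors the libavutil definition, and qc_plus_rounding(),
evals_with_macro() and evals_with_function() are hypothetical helpers invented
for the illustration. The side-effecting counter only makes the textual
duplication visible; for a plain float addition such as qc + rounding, a
compiler would normally be expected to merge the two copies, so this does not
by itself explain the measured difference.

```c
#include <stdio.h>

/* Local stand-in mirroring libavutil's FFMIN, so the sketch builds alone. */
#define FFMIN(a, b) ((a) > (b) ? (b) : (a))

/* The helper from the patch, with the brace placement requested in review. */
static inline float minf(float x, float y)
{
    return x < y ? x : y;
}

/* Counts how many times the argument expression was evaluated. */
static int evals;

/* Hypothetical helper: stands in for "qc + rounding" and records each call. */
static float qc_plus_rounding(float qc, float rounding)
{
    evals++;
    return qc + rounding;
}

/* The macro pastes its first argument twice: once in the comparison and
 * once more in the branch that returns it. */
static int evals_with_macro(void)
{
    float r;

    evals = 0;
    r = FFMIN(qc_plus_rounding(3.2f, 0.4054f), 15.0f);
    (void)r;
    return evals;
}

/* The function receives an already evaluated value, so the expression
 * runs exactly once. */
static int evals_with_function(void)
{
    float r;

    evals = 0;
    r = minf(qc_plus_rounding(3.2f, 0.4054f), 15.0f);
    (void)r;
    return evals;
}
```

Calling evals_with_macro() and evals_with_function() from a small main() and
printing the two counts shows the difference. Whether it matters in
quantize_bands would have to be checked in the generated assembly.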