[FFmpeg-devel] [PATCH v3] aacenc: add SIMD optimizations for abs_pow34 and quantization
Michael Niedermayer
michael at niedermayer.cc
Tue Oct 18 16:51:22 EEST 2016
On Tue, Oct 18, 2016 at 09:02:19AM +0100, Rostislav Pehlivanov wrote:
> On 17 October 2016 at 23:43, Michael Niedermayer <michael at niedermayer.cc> wrote:
>
> > On Mon, Oct 17, 2016 at 10:24:48PM +0100, Rostislav Pehlivanov wrote:
> > > Should fix segfaults on x86-32
> > >
> > > Performance improvements:
> > >
> > > quant_bands:
> > > with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> > > without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
> > > Around 42% for the function
> > >
> > > Twoloop coder:
> > >
> > > abs_pow34:
> > > with/without: 7.82s/8.17s
> > > Around 4% for the entire encoder
> > >
> > > Both:
> > > with/without: 7.15s/8.17s
> > > Around 12% for the entire encoder
> > >
> > > Fast coder:
> > >
> > > abs_pow34:
> > > with/without: 3.40s/3.77s
> > > Around 10% for the entire encoder
> > >
> > > Both:
> > > with/without: 3.02s/3.77s
> > > Around 20% faster for the entire encoder
> > >
> > > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > > ---
> > > libavcodec/aaccoder.c | 27 +++++++------
> > > libavcodec/aaccoder_trellis.h | 2 +-
> > > libavcodec/aaccoder_twoloop.h | 2 +-
> > > libavcodec/aacenc.c | 4 ++
> > > libavcodec/aacenc.h | 6 +++
> > > libavcodec/aacenc_is.c | 6 +--
> > > libavcodec/aacenc_ltp.c | 4 +-
> > > libavcodec/aacenc_pred.c | 6 +--
> > > libavcodec/aacenc_quantization.h | 4 +-
> > > libavcodec/aacenc_utils.h | 4 +-
> > > libavcodec/x86/Makefile | 2 +
> > > libavcodec/x86/aacencdsp.asm | 87 ++++++++++++++++++++++++++++++++++++++++
> > > libavcodec/x86/aacencdsp_init.c | 43 ++++++++++++++++++++
> > > 13 files changed, 170 insertions(+), 27 deletions(-)
> > > create mode 100644 libavcodec/x86/aacencdsp.asm
> > > create mode 100644 libavcodec/x86/aacencdsp_init.c
> >
> > fate passes on linux32/64 x86, mingw32/64 x86
> >
> > build fails on arm:
> >
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffserver_g] Error 1
> > make: *** Waiting for unfinished jobs....
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffprobe_g] Error 1
> > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
> > collect2: ld returned 1 exit status
> > make: *** [ffmpeg_g] Error 1
> >
> > [...]
> >
> Attaching a new version with the fixes from James Almer which should also
> fix non-x86 compilation
> 84d67e14dbd62ef958a52a4027a8dff22f7480b6 0001-aacenc-add-SIMD-optimizations-for-abs_pow34-and-quan.patch
> From d92003e23d82bc40fd85712538983209a7704248 Mon Sep 17 00:00:00 2001
> From: Rostislav Pehlivanov <atomnuker at gmail.com>
> Date: Sat, 8 Oct 2016 15:59:14 +0100
> Subject: [PATCH] aacenc: add SIMD optimizations for abs_pow34 and quantization
>
> Performance improvements:
>
> quant_bands:
> with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
> Around 42% for the function
>
> Twoloop coder:
>
> abs_pow34:
> with/without: 7.82s/8.17s
> Around 4% for the entire encoder
>
> Both:
> with/without: 7.15s/8.17s
> Around 12% for the entire encoder
>
> Fast coder:
>
> abs_pow34:
> with/without: 3.40s/3.77s
> Around 10% for the entire encoder
>
> Both:
> with/without: 3.02s/3.77s
> Around 20% faster for the entire encoder
>
> Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> ---
> libavcodec/aaccoder.c | 27 ++++++------
> libavcodec/aaccoder_trellis.h | 2 +-
> libavcodec/aaccoder_twoloop.h | 2 +-
> libavcodec/aacenc.c | 4 ++
> libavcodec/aacenc.h | 6 +++
> libavcodec/aacenc_is.c | 6 +--
> libavcodec/aacenc_ltp.c | 4 +-
> libavcodec/aacenc_pred.c | 6 +--
> libavcodec/aacenc_quantization.h | 4 +-
> libavcodec/aacenc_utils.h | 2 +-
> libavcodec/x86/Makefile | 2 +
> libavcodec/x86/aacencdsp.asm | 88 ++++++++++++++++++++++++++++++++++++++++
> libavcodec/x86/aacencdsp_init.c | 43 ++++++++++++++++++++
> 13 files changed, 170 insertions(+), 26 deletions(-)
> create mode 100644 libavcodec/x86/aacencdsp.asm
> create mode 100644 libavcodec/x86/aacencdsp_init.c
This still fails to build on arm-qemu.
It looks like you call a function that's just not there on non-x86;
a missing if (ARCH_X86) or #if, I assume:
LD ffmpeg_g
libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
/home/michael/ffmpeg-git/ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to `ff_aac_dsp_init_x86'
collect2: ld returned 1 exit status
make: *** [ffmpeg_g] Error 1
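
The kind of guard suggested above could look roughly like the sketch below. Only ff_aac_dsp_init_x86, ARCH_X86 and aac_encode_init come from this thread; the AACEncDSP struct, aac_dsp_init() and the abs_pow34_c() body are hypothetical placeholders, not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* FFmpeg's generated config.h defines ARCH_X86 as 0 or 1.
 * It is defaulted to 0 here only so the sketch builds on its own. */
#ifndef ARCH_X86
#define ARCH_X86 0
#endif

/* Hypothetical stand-in for the encoder's DSP function table. */
typedef struct AACEncDSP {
    void (*abs_pow34)(float *out, const float *in, int size);
} AACEncDSP;

/* Placeholder C fallback: it only takes the absolute value.
 * The real abs_pow34 computes |x|^(3/4); the math is left out
 * to keep the focus on the guard. */
static void abs_pow34_c(float *out, const float *in, int size)
{
    for (int i = 0; i < size; i++)
        out[i] = in[i] < 0.0f ? -in[i] : in[i];
}

#if ARCH_X86
/* Only built from libavcodec/x86/, so it may only be referenced on x86. */
void ff_aac_dsp_init_x86(AACEncDSP *s);
#endif

/* Hypothetical init helper, standing in for the call site in
 * aac_encode_init(). */
static void aac_dsp_init(AACEncDSP *s)
{
    assert(s != NULL);
    s->abs_pow34 = abs_pow34_c;  /* portable default */
#if ARCH_X86
    ff_aac_dsp_init_x86(s);      /* may override with SIMD versions */
#endif
}
```

FFmpeg code commonly writes this as a plain if (ARCH_X86) and relies on the compiler dropping the dead call; the preprocessor form shown here does not depend on that. Either way, the linker on ARM never sees a reference to the x86-only symbol.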
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
No snowflake in an avalanche ever feels responsible. -- Voltaire