[FFmpeg-devel] [PATCH v3] aacenc: add SIMD optimizations for abs_pow34 and quantization
Rostislav Pehlivanov
atomnuker at gmail.com
Tue Oct 18 18:07:25 EEST 2016
On 18 October 2016 at 14:51, Michael Niedermayer <michael at niedermayer.cc> wrote:
> On Tue, Oct 18, 2016 at 09:02:19AM +0100, Rostislav Pehlivanov wrote:
> > On 17 October 2016 at 23:43, Michael Niedermayer <michael at niedermayer.cc> wrote:
> >
> > > On Mon, Oct 17, 2016 at 10:24:48PM +0100, Rostislav Pehlivanov wrote:
> > > > Should fix segfaults on x86-32
> > > >
> > > > Performance improvements:
> > > >
> > > > quant_bands:
> > > > with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> > > > without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
> > > > Around 42% for the function
> > > >
> > > > Twoloop coder:
> > > >
> > > > abs_pow34:
> > > > with/without: 7.82s/8.17s
> > > > Around 4% for the entire encoder
> > > >
> > > > Both:
> > > > with/without: 7.15s/8.17s
> > > > Around 12% for the entire encoder
> > > >
> > > > Fast coder:
> > > >
> > > > abs_pow34:
> > > > with/without: 3.40s/3.77s
> > > > Around 10% for the entire encoder
> > > >
> > > > Both:
> > > > with/without: 3.02s/3.77s
> > > > Around 20% faster for the entire encoder
> > > >
> > > > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > > > ---
> > > > libavcodec/aaccoder.c | 27 +++++++------
> > > > libavcodec/aaccoder_trellis.h | 2 +-
> > > > libavcodec/aaccoder_twoloop.h | 2 +-
> > > > libavcodec/aacenc.c | 4 ++
> > > > libavcodec/aacenc.h | 6 +++
> > > > libavcodec/aacenc_is.c | 6 +--
> > > > libavcodec/aacenc_ltp.c | 4 +-
> > > > libavcodec/aacenc_pred.c | 6 +--
> > > > libavcodec/aacenc_quantization.h | 4 +-
> > > > libavcodec/aacenc_utils.h | 4 +-
> > > > libavcodec/x86/Makefile | 2 +
> > > > libavcodec/x86/aacencdsp.asm | 87 ++++++++++++++++++++++++++++++++++++++++
> > > > libavcodec/x86/aacencdsp_init.c | 43 ++++++++++++++++++++
> > > > 13 files changed, 170 insertions(+), 27 deletions(-)
> > > > create mode 100644 libavcodec/x86/aacencdsp.asm
> > > > create mode 100644 libavcodec/x86/aacencdsp_init.c
> > >
> > > fate passes on linux32/64 x86, mingw32/64 x86
> > >
> > > build fails on arm:
> > >
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to
> > > `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffserver_g] Error 1
> > > make: *** Waiting for unfinished jobs....
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to
> > > `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffprobe_g] Error 1
> > > libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> > > ffmpeg/arm/src/libavcodec/aacenc.c:1038: undefined reference to
> > > `ff_aac_dsp_init_x86'
> > > collect2: ld returned 1 exit status
> > > make: *** [ffmpeg_g] Error 1
> > >
> > > [...]
> > > --
> > > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> > >
> > > While the State exists there can be no freedom; when there is freedom there
> > > will be no State. -- Vladimir Lenin
> > >
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel at ffmpeg.org
> > > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > >
> > >
> > Attaching a new version with the fixes from James Almer which should also
> > fix non-x86 compilation
>
> > aaccoder.c | 27 +++++++--------
> > aaccoder_trellis.h | 2 -
> > aaccoder_twoloop.h | 2 -
> > aacenc.c | 4 ++
> > aacenc.h | 6 +++
> > aacenc_is.c | 6 +--
> > aacenc_ltp.c | 4 +-
> > aacenc_pred.c | 6 +--
> > aacenc_quantization.h | 4 +-
> > aacenc_utils.h | 2 -
> > x86/Makefile | 2 +
> > x86/aacencdsp.asm | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > x86/aacencdsp_init.c | 43 ++++++++++++++++++++++++
> > 13 files changed, 170 insertions(+), 26 deletions(-)
> > 84d67e14dbd62ef958a52a4027a8dff22f7480b6 0001-aacenc-add-SIMD-optimizations-for-abs_pow34-and-quan.patch
> > From d92003e23d82bc40fd85712538983209a7704248 Mon Sep 17 00:00:00 2001
> > From: Rostislav Pehlivanov <atomnuker at gmail.com>
> > Date: Sat, 8 Oct 2016 15:59:14 +0100
> > Subject: [PATCH] aacenc: add SIMD optimizations for abs_pow34 and quantization
> >
> > Performance improvements:
> >
> > quant_bands:
> > with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> > without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
> > Around 42% for the function
> >
> > Twoloop coder:
> >
> > abs_pow34:
> > with/without: 7.82s/8.17s
> > Around 4% for the entire encoder
> >
> > Both:
> > with/without: 7.15s/8.17s
> > Around 12% for the entire encoder
> >
> > Fast coder:
> >
> > abs_pow34:
> > with/without: 3.40s/3.77s
> > Around 10% for the entire encoder
> >
> > Both:
> > with/without: 3.02s/3.77s
> > Around 20% faster for the entire encoder
> >
> > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > ---
> > libavcodec/aaccoder.c | 27 ++++++------
> > libavcodec/aaccoder_trellis.h | 2 +-
> > libavcodec/aaccoder_twoloop.h | 2 +-
> > libavcodec/aacenc.c | 4 ++
> > libavcodec/aacenc.h | 6 +++
> > libavcodec/aacenc_is.c | 6 +--
> > libavcodec/aacenc_ltp.c | 4 +-
> > libavcodec/aacenc_pred.c | 6 +--
> > libavcodec/aacenc_quantization.h | 4 +-
> > libavcodec/aacenc_utils.h | 2 +-
> > libavcodec/x86/Makefile | 2 +
> > libavcodec/x86/aacencdsp.asm | 88 ++++++++++++++++++++++++++++++++++++++++
> > libavcodec/x86/aacencdsp_init.c | 43 ++++++++++++++++++++
> > 13 files changed, 170 insertions(+), 26 deletions(-)
> > create mode 100644 libavcodec/x86/aacencdsp.asm
> > create mode 100644 libavcodec/x86/aacencdsp_init.c
>
> still fails to build on arm-qemu:
> it looks like you call a function that's just not there on non-x86;
> a missing if (ARCH_X86) or #if, I assume
>
> LD ffmpeg_g
> libavcodec/libavcodec.a(aacenc.o): In function `aac_encode_init':
> /home/michael/ffmpeg-git/ffmpeg/arm/src/libavcodec/aacenc.c:1038:
> undefined reference to `ff_aac_dsp_init_x86'
> collect2: ld returned 1 exit status
> make: *** [ffmpeg_g] Error 1
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> No snowflake in an avalanche ever feels responsible. -- Voltaire
>
Damn, I forgot to amend the patch with that change. The attached version
should finally fix it.
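
For readers following the thread: the link error quoted above appears to
come from aacenc.c calling ff_aac_dsp_init_x86 on every architecture, while
the files under libavcodec/x86/ are only built for x86 targets. A minimal
sketch of the kind of #if guard Michael suggests is shown below. It is not
the code from the attached patch: DemoDSP, demo_dsp_init, demo_dsp_init_x86
and abs_v_c are hypothetical names, the kernel is a plain absolute value
rather than the real abs_pow34, and the fallback definition of ARCH_X86
only stands in for the 0/1 value that FFmpeg's configure normally provides.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for the configure-generated ARCH_X86 macro,
 * which is always defined, as either 0 or 1. */
#ifndef ARCH_X86
#  if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#    define ARCH_X86 1
#  else
#    define ARCH_X86 0
#  endif
#endif

/* Hypothetical DSP context holding one function pointer. */
typedef struct DemoDSP {
    void (*abs_v)(float *out, const float *in, int size);
} DemoDSP;

/* Portable C fallback (placeholder kernel: absolute value only). */
static void abs_v_c(float *out, const float *in, int size)
{
    for (int i = 0; i < size; i++)
        out[i] = in[i] < 0.0f ? -in[i] : in[i];
}

#if ARCH_X86
/* Hypothetical x86 init. It exists only when building for x86, just as
 * the real init lives in a file that only x86 builds compile. A real
 * version would check CPU flags and install SIMD function pointers. */
static void demo_dsp_init_x86(DemoDSP *s)
{
    (void)s; /* keep the C fallback in this sketch */
}
#endif

void demo_dsp_init(DemoDSP *s)
{
    assert(s != NULL);
    s->abs_v = abs_v_c;      /* always install the portable version first */
#if ARCH_X86
    demo_dsp_init_x86(s);    /* referenced only where it is defined */
#endif
}
```

With the guard in place, a non-x86 build never references the x86-only
symbol, so the linker has nothing left unresolved; an x86 build still gets
the chance to override the C pointers.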
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-aacenc-add-SIMD-optimizations-for-abs_pow34-and-quan.patch
Type: text/x-patch
Size: 19976 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20161018/1820aad0/attachment.bin>