[FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD optimizations
Song, Ruiling
ruiling.song at intel.com
Thu May 30 10:29:25 EEST 2019
> -----Original Message-----
> From: Paul B Mahol [mailto:onemda at gmail.com]
> Sent: Thursday, May 30, 2019 3:24 PM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Cc: Song, Ruiling <ruiling.song at intel.com>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD
> optimizations
>
> On 5/30/19, Ruiling Song <ruiling.song at intel.com> wrote:
> > For details of the implementation, please refer to the comments
> > inline in the assembly code. It improves horizontal-pass
> > performance by about 100% in single-threaded mode.
> >
> > Tested overall performance using these commands (AVX2 enabled):
> > ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
> > ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
> > For single thread, the fps improves from 43 to 60, about 40%.
> > For multi-thread, the fps improves from 110 to 130, about 20%.
> >
> > Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> > ---
> > libavfilter/gblur.h | 54 ++++++++++
> > libavfilter/vf_gblur.c | 66 +++++-------
> > libavfilter/x86/Makefile | 2 +
> > libavfilter/x86/vf_gblur.asm | 182 ++++++++++++++++++++++++++++++++
> > libavfilter/x86/vf_gblur_init.c | 36 +++++++
> > 5 files changed, 302 insertions(+), 38 deletions(-)
> > create mode 100644 libavfilter/gblur.h
> > create mode 100644 libavfilter/x86/vf_gblur.asm
> > create mode 100644 libavfilter/x86/vf_gblur_init.c
[...]
> > diff --git a/libavfilter/vf_gblur.c b/libavfilter/vf_gblur.c
> > index b91a8c074a..4e876bca05 100644
> > --- a/libavfilter/vf_gblur.c
> > +++ b/libavfilter/vf_gblur.c
> > @@ -30,29 +30,11 @@
> > #include "libavutil/pixdesc.h"
> > #include "avfilter.h"
> > #include "formats.h"
> > +#include "gblur.h"
> > #include "internal.h"
> > #include "video.h"
> > +#include <immintrin.h>
>
> Is this header really needed?
Oh, this is not needed. I forgot to remove it after experimenting with SSE intrinsics.
I will remove it. Thanks!
Ruiling
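
For reference, the "about 40%" and "about 20%" figures in the quoted commit message follow directly from the fps numbers. A minimal sketch of that arithmetic is below; `speedup_percent` is a hypothetical helper written for illustration only and is not part of the patch.

```c
#include <assert.h>

/*
 * Relative throughput improvement, in percent, when the frame rate
 * goes from `before` fps to `after` fps.
 * Hypothetical helper for illustration; not part of the gblur patch.
 */
static double speedup_percent(double before, double after)
{
    assert(before > 0.0); /* a zero or negative baseline is meaningless */
    return (after - before) / before * 100.0;
}
```

With the quoted figures, `speedup_percent(43.0, 60.0)` gives roughly 39.5 (single thread) and `speedup_percent(110.0, 130.0)` gives roughly 18.2 (multi-thread), consistent with the rounded values in the commit message.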