[FFmpeg-devel] [PATCH] sws/aarch64: add ff_hscale_8_to_15_neon
Ronald S. Bultje
rsbultje at gmail.com
Thu Mar 24 14:35:01 CET 2016
Hi,
On Mar 24, 2016 8:28 AM, "Clément Bœsch" <u at pkh.me> wrote:
>
> From: Clément Bœsch <clement at stupeflix.com>
>
> ./ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
>
> before: t:0.489726 avg:0.489883 max:0.491852 min:0.489482
> after: t:0.256515 avg:0.256458 max:0.256999 min:0.253755
> ---
> Changes:
> - FIX: not using the v8-v15 registers
> - writing directly from the SIMD register (thx Martin)
> - misc reordering
>
> I'm looking at the vscale part now.
> ---
> libswscale/aarch64/Makefile | 6 +++--
> libswscale/aarch64/hscale.S | 59 +++++++++++++++++++++++++++++++++++++++++++
> libswscale/aarch64/swscale.c | 37 +++++++++++++++++++++++++++
> libswscale/swscale.c | 2 ++
> libswscale/swscale_internal.h | 1 +
> libswscale/utils.c | 4 ++-
> 6 files changed, 106 insertions(+), 3 deletions(-)
> create mode 100644 libswscale/aarch64/hscale.S
> create mode 100644 libswscale/aarch64/swscale.c
Do you intend to create special versions for specific filter widths? For
example, x86 has special versions for filter_width=4 and 8, which helped
speed up the default filters (bicubic) a little more.
This version looks OK already for the default case.
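To illustrate what a filter-width specialization means, here is a rough scalar C sketch. It is not the x86 or NEON code; the function names are hypothetical, and the generic loop is only modeled on the usual 8-bit to 15-bit horizontal scaler (multiply-accumulate, shift right by 7, clip to 15 bits). The point is that with a known width the inner loop disappears, which is what a SIMD version can then exploit.

```c
#include <stdint.h>

/* Clip to the 15-bit output range used by the intermediate format. */
static int16_t clip_15(int val)
{
    return (int16_t)(val > 32767 ? 32767 : val);
}

/* Generic path: works for any filter_size (hypothetical name). */
static void hscale_8_to_15_generic(int16_t *dst, int dst_w,
                                   const uint8_t *src, const int16_t *filter,
                                   const int32_t *filter_pos, int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[filter_size * i + j];
        dst[i] = clip_15(val >> 7);
    }
}

/* Specialized path: filter_size is known to be 4, so the inner loop is
 * fully unrolled (hypothetical name). */
static void hscale_8_to_15_w4(int16_t *dst, int dst_w,
                              const uint8_t *src, const int16_t *filter,
                              const int32_t *filter_pos)
{
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];
        const int16_t *f = filter + 4 * i;
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];
        dst[i] = clip_15(val >> 7);
    }
}

/* Dispatch: pick the specialized version when the width matches. */
static void hscale_8_to_15(int16_t *dst, int dst_w,
                           const uint8_t *src, const int16_t *filter,
                           const int32_t *filter_pos, int filter_size)
{
    if (filter_size == 4)
        hscale_8_to_15_w4(dst, dst_w, src, filter, filter_pos);
    else
        hscale_8_to_15_generic(dst, dst_w, src, filter, filter_pos, filter_size);
}
```

In a real build the choice would be made once at init time through a function pointer rather than per call; the per-call branch above is only to keep the sketch short.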
Ronald