[FFmpeg-devel] [PATCH 0/7] ARM NEON optimisations
Mans Rullgard
Sat Dec 6 00:34:51 CET 2008
The following patches add NEON optimised versions of various dsputil
functions. I have improved some of the functions slightly since I last
sent these patches. This time I am leaving out a few patches that can
still be improved further.
I cannot show proof that faster implementations are impossible, so
please do not ask for this. Last time around, some people made vague
suggestions that improvements might be possible without being the
slightest bit helpful. I cannot read people's minds, so if there's
something I've overlooked, please tell me, but try to explain how it
might be done. By this I mean that I want something more than a
suggestion to compare with mmx code and see if any tricks can be
borrowed.
If I get no useful feedback, I will start applying these soon.
Mans Rullgard (7):
  ARM: NEON optimised put_pixels functions
  ARM: NEON optimised simple_idct
  ARM: NEON optimised {put,avg}_h264_chroma_mc[48]
  ARM: NEON optimised H.264 loop filter
  ARM: NEON optimised H.264 8x8 and 16x16 qpel MC
  ARM: NEON optimised h264_idct_add
  ARM: NEON optimised h264_idct_dc_add
 libavcodec/Makefile                  |    6 +
 libavcodec/armv4l/dsputil_arm.c      |   17 +
 libavcodec/armv4l/dsputil_neon.c     |  169 +++++
 libavcodec/armv4l/dsputil_neon_s.S   |  274 +++++++
 libavcodec/armv4l/h264dsp_neon.S     | 1367 ++++++++++++++++++++++++++++++++++
 libavcodec/armv4l/h264idct_neon.S    |   96 +++
 libavcodec/armv4l/simple_idct_neon.S |  402 ++++++++++
 libavcodec/avcodec.h                 |    1 +
 libavcodec/utils.c                   |    1 +
 9 files changed, 2333 insertions(+), 0 deletions(-)
 create mode 100644 libavcodec/armv4l/dsputil_neon.c
 create mode 100644 libavcodec/armv4l/dsputil_neon_s.S
 create mode 100644 libavcodec/armv4l/h264dsp_neon.S
 create mode 100644 libavcodec/armv4l/h264idct_neon.S
 create mode 100644 libavcodec/armv4l/simple_idct_neon.S