[FFmpeg-devel] [PATCH 0/6] Optimize HEVC decoding on ARM (32bit) platform
Shengbin Meng
shengbinmeng at gmail.com
Wed Nov 22 13:12:00 EET 2017
Our tests show that CPU cycle counts are reduced for each optimized module:
~48% for qpel weight
~17% for epel
~71% for sao edge mode
~48% for sao band mode
~60% for idct of 16x16 block
Overall decoding speed improves by 20~30% (measured as an increase in FPS).
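For reference, a cycle reduction of r percent in a module corresponds to a speedup of 100 / (100 - r) for that module alone. Below is a minimal sketch of that arithmetic in C. The function names are hypothetical and not part of the patch set; only the percentages come from the figures above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the patch set): convert a cycle-count
 * reduction given in percent into the speedup factor of that module alone. */
double speedup_from_reduction(double reduction_pct)
{
    /* A reduction of 100% or more would mean zero or negative cycles. */
    assert(reduction_pct >= 0.0 && reduction_pct < 100.0);
    return 100.0 / (100.0 - reduction_pct);
}

/* Print the per-module speedup implied by each reduction quoted above. */
void print_module_speedups(void)
{
    static const struct {
        const char *name;
        double reduction_pct;
    } modules[] = {
        { "qpel weight",   48.0 },
        { "epel",          17.0 },
        { "sao edge mode", 71.0 },
        { "sao band mode", 48.0 },
        { "idct 16x16",    60.0 },
    };
    size_t i;

    for (i = 0; i < sizeof(modules) / sizeof(modules[0]); i++)
        printf("%-14s %4.1f%% fewer cycles -> %.2fx\n",
               modules[i].name, modules[i].reduction_pct,
               speedup_from_reduction(modules[i].reduction_pct));
}
```

Calling print_module_speedups() from a main() prints one line per module. These factors apply to each module in isolation; the overall FPS gain is smaller because these modules account for only part of the total decoding time.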
We also compared the decoded output before and after the optimization
to make sure it is identical.
These patches are based on the n3.4 release.
Meng Wang (5):
  avcodec/hevcdsp: Add NEON optimization for qpel weighted mode
  avcodec/hevcdsp: Add NEON optimization for epel
  avcodec/hevcdsp: Use pre-load (pld) to optimize data loading
  avcodec/hevcdsp: Add NEON optimization for sao
  avcodec/hevcdsp: Add NEON optimization for idct16x16

Shengbin Meng (1):
  avcodec/hevcdsp: Add NEON optimization for whole-pixel interpolation

 libavcodec/arm/Makefile            |    4 +-
 libavcodec/arm/hevcdsp_epel_neon.S | 2078 ++++++++++++++++++++++++++++++++++++
 libavcodec/arm/hevcdsp_idct_neon.S |  241 +++++
 libavcodec/arm/hevcdsp_init_neon.c |  695 ++++++++++++
 libavcodec/arm/hevcdsp_qpel_neon.S |  702 ++++++++++++
 libavcodec/arm/hevcdsp_sao_neon.S  |  181 ++++
 6 files changed, 3900 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/arm/hevcdsp_epel_neon.S
 create mode 100644 libavcodec/arm/hevcdsp_sao_neon.S

--
2.13.6 (Apple Git-96)