[FFmpeg-devel] [PATCH] [WIP] swr: rewrite resample_common/linear_float_sse/avx in yasm.
James Almer
jamrial at gmail.com
Sat Jun 21 02:06:43 CEST 2014
On 19/06/14 9:37 PM, Ronald S. Bultje wrote:
> DO NOT MERGE. Speed not tested, avx not yet tested.
> ---
> configure | 3 +-
> libswresample/resample_template.c | 12 +-
> libswresample/x86/Makefile | 1 +
> libswresample/x86/resample.asm | 327 +++++++++++++++++++++++++++++++++++
> libswresample/x86/resample_mmx.h | 118 -------------
> libswresample/x86/resample_x86_dsp.c | 34 ++--
> 6 files changed, 346 insertions(+), 149 deletions(-)
> create mode 100644 libswresample/x86/resample.asm
[...]
> +.inner_loop:
> + movu m1, [srcptrq+filter_lenq*4]
> + mulps m1, [filterq+filter_lenq*4]
> + addps m0, m1
> + add filter_lenq, mmsize/4
> + js .inner_loop
> +
> +%if cpuflag(avx)
> + vextractf128 xm1, m0, 0x1
> + addps xm0, xm1
> +%endif
> +
> + ; horizontal sum
> + movhlps xm1, xm0
> + addps xm0, xm1
> + movss xm1, xm0
> + shufps xm0, xm0, q0001
You can use the three-operand form here, shufps xm1, xm0, xm0, q0001, and remove the movss.
The same applies to the linear version.
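
To illustrate why the movss is redundant, here is a minimal scalar sketch in C that models the four float lanes of an xmm register. It is not part of the patch. The type and helper names (xmm, shufps3, movhlps, addps, hsum_with_movss, hsum_three_operand) are made up for the example, and the final scalar add is an assumption, since the quoted hunk stops at the shufps. The immediate 0x01 is what q0001 expands to.

```c
/* Scalar model of a 4-lane xmm register; lane[0] is the lowest element. */
typedef struct { float lane[4]; } xmm;

/* Three-operand shufps: the low two lanes are selected from a,
 * the high two lanes from b, using 2-bit fields of imm. */
static xmm shufps3(xmm a, xmm b, unsigned imm)
{
    xmm r;
    r.lane[0] = a.lane[imm & 3];
    r.lane[1] = a.lane[(imm >> 2) & 3];
    r.lane[2] = b.lane[(imm >> 4) & 3];
    r.lane[3] = b.lane[(imm >> 6) & 3];
    return r;
}

/* movhlps dst, src: copy the high two lanes of src into the low two of dst. */
static xmm movhlps(xmm dst, xmm src)
{
    dst.lane[0] = src.lane[2];
    dst.lane[1] = src.lane[3];
    return dst;
}

/* addps: lane-wise addition. */
static xmm addps(xmm a, xmm b)
{
    xmm r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Sequence as quoted in the patch, plus an assumed final scalar add. */
float hsum_with_movss(xmm x0)
{
    xmm x1 = {{0}};                 /* prior contents of xm1 do not matter */
    x1 = movhlps(x1, x0);           /* movhlps xm1, xm0 */
    x0 = addps(x0, x1);             /* addps   xm0, xm1 */
    x1.lane[0] = x0.lane[0];        /* movss   xm1, xm0 */
    x0 = shufps3(x0, x0, 0x01);     /* shufps  xm0, xm0, q0001 */
    return x0.lane[0] + x1.lane[0]; /* assumed: addss xm0, xm1 */
}

/* Suggested sequence: the shuffle writes straight into xm1, no movss. */
float hsum_three_operand(xmm x0)
{
    xmm x1 = {{0}};
    x1 = movhlps(x1, x0);           /* movhlps xm1, xm0 */
    x0 = addps(x0, x1);             /* addps   xm0, xm1 */
    x1 = shufps3(x0, x0, 0x01);     /* shufps  xm1, xm0, xm0, q0001 */
    return x0.lane[0] + x1.lane[0]; /* assumed: addss xm0, xm1 */
}
```

Both functions return the sum of the four input lanes. The second one leaves xm0 untouched by the shuffle, so there is no need to save its low lane first.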