[FFmpeg-devel] [PATCH 3/3] x86/hevc: add ff_hevc_sao_band_filter_{8, 10, 12}_{sse2, avx2}
Christophe Gisquet
christophe.gisquet at gmail.com
Sat Jan 31 11:33:53 CET 2015
Hi,
2015-01-30 19:50 GMT+01:00 James Almer <jamrial at gmail.com>:
> +%macro HEVC_SAO_BAND_FILTER_COMPUTE 3
> + psraw %2, %3, %1-5
> + pcmpeqw m10, %2, m0
> + pcmpeqw m11, %2, m1
> + pcmpeqw m12, %2, m2
> + pcmpeqw %2, m3
> + pand m10, m4
> + pand m11, m5
> + pand m12, m6
> + pand %2, m7
> + por m10, m11
> + por m12, %2
> + por m10, m12
> + paddw %3, m10
> +%endmacro
The shift really does force working on bytes, too bad. Some pshufb
might still be possible using the result, but it would be cumbersome
because the psraw result is in [0, 31], and the offsets might be signed.
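For reference, a rough scalar C sketch of what the quoted macro appears to
compute for one 16-bit lane. The register roles are an assumption inferred
from the macro alone (m0..m3 as the four band indices, m4..m7 as the matching
offsets); the function and parameter names are made up for illustration, and
the final clipping is not in the quoted hunk, so it is left out:

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of HEVC_SAO_BAND_FILTER_COMPUTE.
 * band[] and offset[] stand in for m0..m3 and m4..m7 (assumed layout).
 * sao_band_lane is a hypothetical name, not an FFmpeg function. */
static int16_t sao_band_lane(int16_t pixel, int bitdepth,
                             const int16_t band[4], const int16_t offset[4])
{
    /* psraw: keep the top 5 bits of the sample, i.e. a band index in [0, 31] */
    int16_t idx = (int16_t)(pixel >> (bitdepth - 5));
    int16_t add = 0;

    for (int i = 0; i < 4; i++) {
        /* pcmpeqw: all-ones mask when the lane falls into band i, else zero */
        int16_t mask = (idx == band[i]) ? -1 : 0;
        /* pand + por: select the offset of the matching band, if any */
        add |= (int16_t)(mask & offset[i]);
    }

    /* paddw: apply the selected offset (zero when no band matched) */
    return (int16_t)(pixel + add);
}
```

With bands {3, 4, 5, 6} and an 8-bit sample of 40, the index is 40 >> 3 = 5,
so the third offset is the one that gets added. A sample whose index is
outside the four bands passes through unchanged.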
> +.loop:
> + movu m13, [srcq+widthq]
[...]
> + movu [dstq+widthq], m8
Some of those moves could be aligned, but that needs some work
at the buffer level first, so it's not really part of this patch.
Looks good; any improvement seems like material for an additional patch.
--
Christophe