[FFmpeg-devel] [PATCH 0/6] More H.264 assembly (the sequel)
James Darnley
jdarnley at obe.tv
Thu Dec 1 18:57:43 EET 2016
Some more assembly for review. This time we have the 10-bit h chroma functions.
The intra ones have some strange benchmark results. Overall the improvement
isn't that large, particularly for the 4:2:0 intra function, and the avx version
of that function is slower than the sse2 one by quite a margin. I will definitely
try benchmarking it on my Nehalem after sending these emails.
Suggestions greatly appreciated.
James Darnley (6):
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter
  avcodec/h264: clean up and expand x86 function definitions
  whitespace changes after last commit
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma intra deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma intra deblock/loop
    filter

 libavcodec/x86/h264_deblock_10bit.asm | 213 ++++++++++++++++++++++++++++++++++
 libavcodec/x86/h264dsp_init.c         |  74 ++++++++----
 2 files changed, 262 insertions(+), 25 deletions(-)
--
2.10.2