[FFmpeg-devel] [PATCH v3 2/5] avcodec/v210enc: make 8bit and 10bit function consistent
Michael Niedermayer
michael at niedermayer.cc
Mon Sep 16 22:06:06 EEST 2019
On Sun, Sep 01, 2019 at 09:20:20PM +0800, lance.lmwang at gmail.com wrote:
> From: Limin Wang <lance.lmwang at gmail.com>
>
> I have benchmarked the performance with the C code and haven't seen any
> performance impact.
>
> Signed-off-by: Limin Wang <lance.lmwang at gmail.com>
> ---
> libavcodec/v210enc.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/libavcodec/v210enc.c b/libavcodec/v210enc.c
> index 1b840b2..69a2efe 100644
> --- a/libavcodec/v210enc.c
> +++ b/libavcodec/v210enc.c
> @@ -43,12 +43,7 @@ static void v210_planar_pack_8_c(const uint8_t *y, const uint8_t *u,
> uint32_t val;
> int i;
>
> - /* unroll this to match the assembly */
> - for (i = 0; i < width - 11; i += 12) {
> - WRITE_PIXELS(u, y, v, 8);
> - WRITE_PIXELS(y, u, y, 8);
> - WRITE_PIXELS(v, y, u, 8);
> - WRITE_PIXELS(y, v, y, 8);
> + for (i = 0; i < width - 5; i += 6) {
> WRITE_PIXELS(u, y, v, 8);
> WRITE_PIXELS(y, u, y, 8);
> WRITE_PIXELS(v, y, u, 8);
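I have retested this with START_TIMER/STOP_TIMER, and the more unrolled
loop is consistently faster, by roughly 3% on average over these runs.
Results with this patch applied come first, followed by the existing code ("prev"):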
./ffmpeg -cpuflags 0 -v 99 -i matrixbench_mpeg2.mpg -vcodec v210 -an test.avi
31620 decicycles in TEST, 2096691 runs, 461 skips 0 0 0 0 0 0 0 0 0 0 0 21 13 9 8 7 8 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31509 decicycles in TEST, 2096892 runs, 260 skips 0 0 0 0 0 0 0 0 0 0 0 21 10 9 8 6 7 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0
32069 decicycles in TEST, 2096965 runs, 187 skips 0 0 0 0 0 0 0 0 0 0 0 21 16 10 8 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31522 decicycles in TEST, 2096962 runs, 190 skips 0 0 0 0 0 0 0 0 0 0 0 21 10 9 8 6 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31537 decicycles in TEST, 2096784 runs, 368 skips 0 0 0 0 0 0 0 0 0 0 0 21 12 8 9 7 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0
prev:
30705 decicycles in TEST, 2096875 runs, 277 skips 0 0 0 0 0 0 0 0 0 0 0 21 15 9 9 7 5 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0
30771 decicycles in TEST, 2096907 runs, 245 skips 0 0 0 0 0 0 0 0 0 0 0 21 15 9 8 6 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
30560 decicycles in TEST, 2096904 runs, 248 skips 0 0 0 0 0 0 0 0 0 0 0 21 10 9 9 6 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31020 decicycles in TEST, 2096974 runs, 178 skips 0 0 0 0 0 0 0 0 0 0 0 21 16 9 8 6 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31018 decicycles in TEST, 2096980 runs, 172 skips 0 0 0 0 0 0 0 0 0 0 0 21 16 9 8 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have often repented speaking, but never of holding my tongue.
-- Xenocrates