[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.

Dan Parrot dan.parrot at mail.com
Wed Jun 29 20:09:41 CEST 2016


Here are execution times for the SIMD and non-SIMD versions of each
function. The times were obtained using SystemTap probes at each
function's entry and return points. The workload was the
fate-filter-pixfmts-scale test.
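
As a rough illustration of how per-call entry and return timestamps
reduce to the call count, min, avg, max and total figures reported in
this post, here is a small Python sketch. It is not the SystemTap script
that produced the numbers; the function name summarize and the sample
timestamps are made up for illustration, and the use of integer division
for the average is an assumption.

```python
def summarize(samples):
    """Reduce (entry_ns, return_ns) pairs for one function to summary stats."""
    # Duration of each call is return timestamp minus entry timestamp.
    durations = [ret - ent for ent, ret in samples]
    total = sum(durations)
    return {
        "calls": len(durations),
        "min": min(durations),
        "avg": total // len(durations),  # assumed: average truncated to whole ns
        "max": max(durations),
        "total": total,
    }

# Made-up timestamps, purely for illustration.
stats = summarize([(1000, 3000), (5000, 6900), (9000, 11100)])
print(stats)
```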

SIMD versions have the suffix _vsx. All times are in ns.

===================================================
function        calls    min    avg    max    total
===================================================
yuy2ToY_c_vsx     864   1880   2014  29844  1740366
yuy2ToY_c         864   2326   2451  15950  2118226

yvy2ToUV_c_vsx    288   1891   1989  13644   573038
yvy2ToUV_c        288   2089   2131   2462   613813

rgbaToA_c_vsx    1152   1975   2123  31356  2446276
rgbaToA_c        1152   2368   2448  12496  2820401

uyvyToUV_c_vsx    288   1901   1932   2122   556697
uyvyToUV_c        288   2088   2129   2370   613202

uyvyToY_c_vsx     576   1877   1956  15821  1127222
uyvyToY_c         576   2325   2408  15332  1387168

nv12ToUV_c_vsx    144   1869   2006  15480   288867
nv12ToUV_c        144   2101   2273  19774   327432

abgrToA_c_vsx    1152   1949   2060  15496  2373206
abgrToA_c        1152   2374   2471  52452  2847044

yuy2ToUV_c_vsx    288   1873   1972  16608   568154
yuy2ToUV_c        288   2087   2123   2252   611621

nv21ToUV_c_vsx    144   1879   2019  14290   290860
nv21ToUV_c        144   2098   2233  14750   321692
===================================================
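
To make the comparison easier to read, the average times above can be
turned into a speedup ratio per function. The following Python sketch
does that, using only the avg figures from this post. The dictionary
keys are shorthand for each _c / _c_vsx pair, and the helper name
speedup is made up for illustration.

```python
# Average per-call times in ns, copied from the measurements above:
# name -> (avg of the _c_vsx version, avg of the plain _c version)
AVG_NS = {
    "yuy2ToY": (2014, 2451),
    "yvy2ToUV": (1989, 2131),
    "rgbaToA": (2123, 2448),
    "uyvyToUV": (1932, 2129),
    "uyvyToY": (1956, 2408),
    "nv12ToUV": (2006, 2273),
    "abgrToA": (2060, 2471),
    "yuy2ToUV": (1972, 2123),
    "nv21ToUV": (2019, 2233),
}

def speedup(name):
    """Ratio of C average time to VSX average time (>1 means VSX is faster)."""
    vsx_avg, c_avg = AVG_NS[name]
    return c_avg / vsx_avg

for name in AVG_NS:
    print(f"{name:10s} {speedup(name):.2f}x")
```

The ratio uses averages rather than max values because the max column
varies widely between runs of the same function.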



