[FFmpeg-devel] [PATCH] PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 VSX SIMD.
Dan Parrot
dan.parrot at mail.com
Mon Jul 4 21:18:48 EEST 2016
On Mon, 2016-07-04 at 16:30 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot <dan.parrot <at> mail.com> writes:
>
> > > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero...
> > > showed different results?
> > > I believe this should be both easier and faster to test.
> >
> > Sorry, I don't understand what that command line just above
> > is trying to achieve. Could you elaborate?
>
> Instead of running the whole fate suite that takes long and
> does not test libswscale for most commands, just test an
> ffmpeg command line that only tests libswscale:
> $ ffmpeg -benchmark -f rawvideo -pix_fmt rgb24
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -
> vs
>
> $ ffmpeg -cpuflags 0 -benchmark -f rawvideo -pix_fmt rgb24
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -
>
Ok. Thanks for the explanation. I will run those commands and post the
reported results.
> [...]
>
> > Surprisingly, gcc is producing some badly suboptimal assembly.
>
> Just to make sure I don't misunderstand:
> Does this mean intrinsics are suboptimal to write assembly
> code?
Here's what I mean: All variables below are of type "vector int"
1. v0 = v2 * v3
2. v0 = v4 * v5 + v6 * v7 + v8 * v9
The first statement produces 1 multiply, 1 multiply-sum and 1 addition
instruction in assembly.
The second produces 6 multiply, 6 multiply-sum, and 10 addition
instructions in assembly! I expected three times the counts from (1),
i.e. 3 multiply, 3 multiply-sum and 3 addition instructions, plus 2
more additions to sum the three products.
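
For anyone wanting to reproduce the comparison, a minimal sketch of the
two statements follows. It is an assumption-laden stand-in, not the code
from the patch: it uses the generic GCC/Clang vector_size extension
instead of AltiVec's "vector int" so that it also builds off PowerPC,
and the names v4si, mul_one and mul_three are hypothetical.

```c
#include <stdint.h>

/* Generic GCC/Clang vector type: four 32-bit ints. This stands in for
 * AltiVec's "vector int" (assumption) so the sketch builds elsewhere. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Statement 1: a single element-wise product. */
v4si mul_one(v4si v2, v4si v3)
{
    return v2 * v3;
}

/* Statement 2: three element-wise products, summed element-wise. */
v4si mul_three(v4si v4, v4si v5, v4si v6, v4si v7, v4si v8, v4si v9)
{
    return v4 * v5 + v6 * v7 + v8 * v9;
}
```

Compiling this with something like "gcc -O3 -S" (plus -mcpu=power8 -mvsx
on a POWER8 machine) and counting the instructions emitted for each
function should show whether the second expression really costs more
than three times the first.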
>
> > > Can you confirm with START_TIMER / STOP_TIMER that there is no
> > > gain?
> >
> > SystemTap probes provide identical functionality by measuring
> > deltas between function entry and function return.
>
> Sorry, I don't understand:
> Did you test with both methods to verify that they provide
> the same results?
>
> Note that if it turns out that START_TIMER / STOP_TIMER
> cannot be used on ppc64 (le) this would be important
> information for us.
>
I'll insert these macros and report the results, provided the code
compiles and runs.
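
The entry-to-return delta that both approaches measure can be sketched
in plain C as below. This is only an illustration of the idea under
stated assumptions: it uses the standard clock() call rather than
FFmpeg's START_TIMER / STOP_TIMER macros or SystemTap, and work() and
average_call_seconds() are hypothetical names, not functions from
libswscale.

```c
#include <time.h>

/* Hypothetical stand-in for the function being profiled. */
static long work(long n)
{
    long sum = 0;
    for (long i = 0; i < n; i++)
        sum += i * i;
    return sum;
}

/* Hypothetical helper: calls work() `runs` times and returns the
 * average CPU time per call in seconds. The last result is stored
 * through `result` so the call is not optimized away. */
double average_call_seconds(long n, int runs, long *result)
{
    clock_t total = 0;
    for (int r = 0; r < runs; r++) {
        clock_t start = clock();   /* timestamp at function entry */
        *result = work(n);
        total += clock() - start;  /* delta at function return */
    }
    return (double)total / CLOCKS_PER_SEC / runs;
}
```

Since clock() is coarse, a short function needs many runs before the
average means anything; a cycle counter gives finer resolution where
one is available.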