[FFmpeg-devel] [Updated PATCH 3/3] vc-1: Optimise parser (with special attention to ARM)
Ben Avison
bavison at riscosopen.org
Wed Apr 23 15:59:19 CEST 2014
On Wed, 23 Apr 2014 03:26:22 +0100, Michael Niedermayer <michaelni at gmx.at> wrote:
> is it faster to do all the steps intermingled ?
> I am asking because the code should be simpler if it just uses
> the optimized start code search and optimized header parsing
> while maintaining the current structure
>
> for example the header parsing could be optimized like below:
OK, I've tried out your patch, and I also tried converting the start code
searches in find_next_marker and vc1_find_frame_end to use the fast
search function. The times (filtered to include only VC-1 functions) look
like this:
        Before         MN version     MN + fast search   BA version
M2TS    250.0 ± 11.2   160.9 ± 7.3    47.6 ± 9.0         27.2 ± 3.4
MKV     149.0 ± 12.8   70.8 ± 11.2    17.6 ± 4.7         1.7 ± 0.8
In other words, yes, there still seems to be a significant speed
improvement from mixing the steps together. I suspect this comes down to
the fact that the buffers used with real-world streams tend to be
bigger than even the L2 cache on the ARM11, so a second pass over the
same data has to fetch it from memory again.
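
To illustrate what "mixing the steps together" means, here is a minimal
sketch. It is not the code from either patch: the function names
find_start_code and count_frames_one_pass are made up for this example,
and the search is a plain C byte-skipping loop rather than the ARM-optimised
routine. The only VC-1 specific assumption is that a frame start code is
the prefix 00 00 01 followed by the byte 0x0D. The point is that the byte
after each prefix is examined as soon as the prefix is found, while that
part of the buffer is still in cache, instead of recording offsets and
walking the buffer a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first 00 00 01 prefix in buf, or len if none.
 * Looks at the third byte of each candidate first, so most positions are
 * rejected with one comparison and a skip of three bytes. */
static size_t find_start_code(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);

    while (i + 3 <= len) {
        if (buf[i + 2] > 1) {
            /* A prefix cannot start at i, i+1 or i+2. */
            i += 3;
        } else if (buf[i + 2] == 1) {
            if (buf[i] == 0 && buf[i + 1] == 0)
                return i;
            /* The 01 byte rules out prefixes starting at i+1 and i+2. */
            i += 3;
        } else {
            /* buf[i + 2] == 0: a prefix may start at i+1 or i+2. */
            i += 1;
        }
    }
    return len;
}

/* Single pass: search and classification are interleaved. Counts start
 * codes whose suffix byte is 0x0D (assumed here to mark a VC-1 frame). */
static size_t count_frames_one_pass(const uint8_t *buf, size_t len)
{
    size_t frames = 0;
    size_t pos = 0;

    while (pos < len) {
        size_t off = find_start_code(buf + pos, len - pos);
        if (off == len - pos)
            break;                  /* no further start codes */
        pos += off + 3;             /* step over the 00 00 01 prefix */
        if (pos < len && buf[pos] == 0x0D)
            frames++;               /* inspect the suffix immediately */
    }
    return frames;
}
```

A two-pass variant would call find_start_code over the whole buffer first
and only then go back to read the suffix bytes; with buffers larger than
the cache, that second visit would be expected to miss.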
Ben