[FFmpeg-devel] [PATCH 1/2] swresample: Refactor resample asm and port it to yasm
James Almer
jamrial at gmail.com
Thu Mar 20 05:18:05 CET 2014
On 20/03/14 12:29 AM, Michael Niedermayer wrote:
> On Thu, Mar 20, 2014 at 04:04:02AM +0100, Michael Niedermayer wrote:
>> On Wed, Mar 19, 2014 at 10:16:17PM -0300, James Almer wrote:
>>> On 19/03/14 9:08 PM, Michael Niedermayer wrote:
>>>> On Wed, Mar 19, 2014 at 06:45:03PM -0300, James Almer wrote:
>>>>> This reduces code duplication and makes it easier to implement new asm
>>>>> functions in the future
>>>>>
>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>> ---
>>>>> libswresample/resample.c | 96 ++++++++++---------------------------
>>>>> libswresample/resample_template.c | 49 +++++++------------
>>>>> libswresample/swresample_internal.h | 24 ++++++++++
>>>>> libswresample/x86/Makefile | 1 +
>>>>> libswresample/x86/resample.asm | 64 +++++++++++++++++++++++++
>>>>> libswresample/x86/resample_mmx.h | 74 ----------------------------
>>>>> libswresample/x86/swresample_x86.c | 16 +++++++
>>>>> 7 files changed, 148 insertions(+), 176 deletions(-)
>>>>> create mode 100644 libswresample/x86/resample.asm
>>>>> delete mode 100644 libswresample/x86/resample_mmx.h
>>>>
>>>> benchmark:
>>>>
>>>> before: 253482 decicycles in resample, 1024 runs, 0 skips
>>>> after:  356545 decicycles in resample, 1024 runs, 0 skips
>>>>
>>>> tested using ffplay HAYLEY\ WESTENRA-WHISPERS\ IN\ A\ DREAM.webm -af aformat=s32,aresample=48000,aformat=s32
>>>>
>>>>
>>>
>>> Where did you put the timer.h macros? I put them at the beginning and end of
>>> the swri_resample_<sampleformat> function/macro in resample_template.c
>>
>> i had them in the loop in multiple_resample()
>>
>>
>>> And what about 16-bit 44100 Hz to 16-bit 22050 Hz (using the sse2 code), which
>>> is the one i tried and where i noticed a boost?
>>
>> i didnt try that one
>
> with a random mp3
>
> before: 47905 decicycles in resample, 1023 runs, 1 skips
> after: 55962 decicycles in resample, 1023 runs, 1 skips
>
> and a pcm s16 wav results in similar values
>
> maybe gcc generates some really dumb code out of the inline asm
> for you
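
Placing the timer macros in the multiple_resample() loop reversed the result
for this case on my end: the yasm version is slower there. So the new code
only looks faster when timed inside the function itself, and it gets slower
the further up the call chain the timers are placed.
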
You could try placing the macros inside the relevant resample_template.c
function and test this one case again just to be sure it's not my gcc.
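
To illustrate why the two timer placements can disagree, here is a minimal
sketch. It does not use the START_TIMER/STOP_TIMER macros from
libavutil/timer.h; now_ns(), kernel() and run_channels() are hypothetical
stand-ins (a monotonic clock, a plain dot product in place of
swri_resample_<sampleformat>, and a per-channel loop in place of the one in
multiple_resample()). It only shows that the outer measurement also counts
call overhead and loop bookkeeping that the inner one never sees.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical replacement for the timer macros: monotonic time in ns. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical stand-in for swri_resample_<sampleformat>: a dot product.
 * Placement A: the timer brackets only the work done inside the kernel. */
static int32_t kernel(const int16_t *src, const int16_t *filter, int len,
                      int64_t *inner_ns)
{
    int64_t t0 = now_ns();
    int32_t acc = 0;
    for (int i = 0; i < len; i++)
        acc += src[i] * filter[i];
    *inner_ns += now_ns() - t0;
    return acc;
}

/* Hypothetical stand-in for the loop in multiple_resample().
 * Placement B: the timer brackets the calls, so it also includes call
 * overhead and whatever the compiler does around the kernel. */
static int64_t run_channels(const int16_t *src, const int16_t *filter,
                            int len, int channels,
                            int64_t *inner_ns, int64_t *outer_ns)
{
    assert(len > 0 && channels > 0);
    int64_t sum = 0;
    int64_t t0 = now_ns();
    for (int ch = 0; ch < channels; ch++)
        sum += kernel(src, filter, len, inner_ns);
    *outer_ns += now_ns() - t0;
    return sum;
}
```

Comparing the two accumulated totals for the old and the new code shows
whether a gain measured inside the kernel survives once the surrounding
call is counted as well.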
Anyway, patch dropped for the time being. It evidently needs some better
refactoring.
I'll send a patch for the float SSE version, in inline asm form, in a moment.