[FFmpeg-devel] [FFmpeg-commits] Implement a SIMD version of emulated_edge_mc() for x86.
Ronald S. Bultje
rsbultje
Tue Feb 8 20:30:58 CET 2011
Hi,
On Mon, Feb 7, 2011 at 7:51 AM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> On Mon, Feb 7, 2011 at 2:18 AM, Daniel Verkamp <daniel at drv.nu> wrote:
>> On Mon, Jan 31, 2011 at 7:01 PM, Ronald S. Bultje <git at ffmpeg.org> wrote:
>>> Module: ffmpeg
>>> Branch: master
>>> Commit: 81f2a3f4ffcc6935b8b8ada4954700b3f333ae4f
>>>
>>> Author: Ronald S. Bultje <rsbultje at gmail.com>
>>> Date: Mon Jan 31 20:55:56 2011 -0500
>>>
>>> Implement a SIMD version of emulated_edge_mc() for x86.
>>
>> This crashes on a mingw-w64 build run on Win7 x64:
>
> I'm not 100% surprised.
>
> If we want to continue to support win64, I need a win64 ssh login with
> complete mingw64+mingw32 installed, and I need a fate system for each
> also. Could you consider setting that up?
>
> As soon as I have a SSH login, I'll see if I can fix it. Almost
> certainly, the v_extend_15 is >128bytes. Switching some instructions
> or registers will fix it.
Indeed:
(gdb) disass 0x0000000000800a80
Dump of assembler code for function ff_emu_edge_core_sse.emuedge_v_extend_15:
0x0000000000800a80 <+0>: test %r9,%r9
0x0000000000800a83 <+3>: je 0x800aab <ff_emu_edge_core_sse.emuedge_copy_body_15_loop>
0x0000000000800a85 <+5>: movq (%rdx),%mm0
0x0000000000800a88 <+8>: movd 0x8(%rdx),%mm1
0x0000000000800a8c <+12>: mov 0xc(%rdx),%r10w
0x0000000000800a91 <+17>: mov 0xe(%rdx),%al
[..]
Dump of assembler code for function ff_emu_edge_core_sse.emuedge_extend_bottom_15_loop:
0x0000000000800aeb <+0>: movq %mm0,(%rcx)
0x0000000000800aee <+3>: movd %mm1,0x8(%rcx)
0x0000000000800af2 <+7>: mov %r9w,0xc(%rcx)
0x0000000000800af7 <+12>: mov %al,0xe(%rcx)
0x0000000000800afa <+15>: add %r8,%rcx
0x0000000000800afd <+18>: dec %rsi
0x0000000000800b00 <+21>: jne 0x800aeb <ff_emu_edge_core_sse.emuedge_extend_bottom_15_loop>
[..]
Dump of assembler code for function ff_emu_edge_core_sse.emuedge_v_extend_end_15:
0x0000000000800b02 <+0>: retq
It's 3 bytes too big. I'll have to find a way to make it a couple of
bytes smaller, and then it should be OK. Using a different register in
place of r9w will probably be sufficient, if I can free up another one
for that task.
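
For reference, a rough sketch of the arithmetic behind the 3-byte
figure, using only the addresses visible in the dump above. It assumes
the block runs from the first instruction of emuedge_v_extend_15
through the retq in emuedge_v_extend_end_15, that the limit is the 128
bytes mentioned in the quoted text, and that retq has its usual 1-byte
encoding. The variable names are purely illustrative.

```python
# Addresses copied from the gdb dump above.
start = 0x800a80      # first instruction of emuedge_v_extend_15
retq_addr = 0x800b02  # retq in emuedge_v_extend_end_15
RETQ_LEN = 1          # retq is a single byte (0xc3)
LIMIT = 128           # assumed size limit, from the ">128bytes" remark

size = retq_addr + RETQ_LEN - start
excess = size - LIMIT
print(f"block size: {size} bytes, over the limit by {excess}")

# Instruction lengths in emuedge_extend_bottom_15_loop, taken as the
# difference between consecutive addresses in the dump. The last
# address is the retq that immediately follows the loop.
loop_addrs = [0x800aeb, 0x800aee, 0x800af2, 0x800af7,
              0x800afa, 0x800afd, 0x800b00, 0x800b02]
loop_lens = [b - a for a, b in zip(loop_addrs, loop_addrs[1:])]
print("loop instruction lengths:", loop_lens)
```

By this count the 16-bit store from r9w is the longest instruction in
the loop at 5 bytes. Registers r8 to r15 need a REX prefix byte on top
of the 16-bit operand-size prefix, so the same store from a legacy
register would presumably be one byte shorter.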
Ronald