[FFmpeg-devel] Subject: Re: swscale-test segfault with 64-bit icc 11.1
Winterton, Richard
richard.winterton
Wed Jul 21 01:08:32 CEST 2010
Hi,
I believe I was able to reproduce the segmentation fault described with a small snippet. I checked with a compiler engineer, who replied with the following:
The compiler is unable to detect which stack locations the user uses in inline asm, so it cannot avoid them. As a workaround, you can use -mno-red-zone to disable the optimization where we use the area just below RSP in leaf functions, but this disables the red zone for all other leaf functions as well, and may cost performance.
I can look into a modification of the assembly to work around the problem if you still have the issue.
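As a rough idea of what such a modification could look like (a minimal sketch, not a tested patch for swscale): the asm can move the stack pointer past the red zone before it issues the call, so the pushed return address cannot land on anything the compiler stored at negative offsets from %rsp. The 128-byte size comes from the System V x86-64 ABI. The function names and the local label standing in for the called code are made up for illustration. The asm must not use %rsp-relative memory operands while the stack pointer is shifted, which is why the only operand here is a register.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: call_inside_asm() is a made-up name, not FFmpeg code. */
uint64_t call_inside_asm(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ volatile(
        "add  $-128, %%rsp \n\t" /* step over the 128-byte red zone */
        "call 1f           \n\t" /* return address lands below the red zone */
        "jmp  2f           \n\t"
        "1:                \n\t" /* stand-in for the code being called */
        "add  $1, %0       \n\t"
        "ret               \n\t"
        "2:                \n\t"
        "sub  $-128, %%rsp \n\t" /* restore the stack pointer */
        : "+r"(x)
        :
        : "cc", "memory");
#else
    x += 1; /* portable fallback so the sketch builds on other targets */
#endif
    return x;
}

/* Usage check: the round trip through the call must return x + 1. */
void check_call_inside_asm(void)
{
    assert(call_inside_asm(41) == 42);
}
```

Using add/sub with -128 rather than sub/add with 128 keeps the immediate in the one-byte signed range. Unlike -mno-red-zone, this only costs two instructions around the call and leaves every other leaf function alone.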
Thanks
Rich
> On Sat, Jul 17, 2010 at 04:50:10PM -0300, Ramiro Polla wrote:
> Hi,
>
> swscale-test segfaults when built with 64-bit icc 11.1 (20100414). The
> function that fails is hyscale_fast_MMX2(). Here's a disassembly of
> the function:
> a4b0: 53 push %rbx
> a4b1: 48 8b 87 c8 30 00 00 mov 0x30c8(%rdi),%rax
> a4b8: 4c 8b 9f a8 30 00 00 mov 0x30a8(%rdi),%r11
> a4bf: 48 89 74 24 d8 mov %rsi,-0x28(%rsp)
> a4c4: 45 89 ca mov %r9d,%r10d
> a4c7: 48 89 54 24 e0 mov %rdx,-0x20(%rsp)
> a4cc: 41 f7 da neg %r10d
> a4cf: 83 bf 10 31 00 00 00 cmpl $0x0,0x3110(%rdi)
> a4d6: 48 89 4c 24 e8 mov %rcx,-0x18(%rsp)
> a4db: 48 89 44 24 d0 mov %rax,-0x30(%rsp)
> a4e0: 48 8b 87 00 31 00 00 mov 0x3100(%rdi),%rax
> a4e7: 4c 89 5c 24 f0 mov %r11,-0x10(%rsp)
> a4ec: 48 89 44 24 f8 mov %rax,-0x8(%rsp)
> a4f1: 0f 84 05 01 00 00 je a5fc <hyscale_fast_MMX2+0x14c>
> a4f7: 0f ef ff pxor %mm7,%mm7
> a4fa: 48 8b 4c 24 e8 mov -0x18(%rsp),%rcx
> a4ff: 48 8b 7c 24 d8 mov -0x28(%rsp),%rdi
> a504: 48 8b 54 24 f0 mov -0x10(%rsp),%rdx
> a509: 48 8b 5c 24 d0 mov -0x30(%rsp),%rbx
> a50e: 48 31 c0 xor %rax,%rax
> a511: 0f 18 01 prefetchnta (%rcx)
> a514: 0f 18 41 20 prefetchnta 0x20(%rcx)
> a518: 0f 18 41 40 prefetchnta 0x40(%rcx)
> a51c: 8b 33 mov (%rbx),%esi
> a51e: ff 54 24 f8 callq *-0x8(%rsp)
> a522: 8b 34 03 mov (%rbx,%rax,1),%esi
> a525: 48 01 f1 add %rsi,%rcx
> a528: 48 01 c7 add %rax,%rdi
> a52b: 48 31 c0 xor %rax,%rax
> a52e: 8b 33 mov (%rbx),%esi
> a530: ff 54 24 f8 callq *-0x8(%rsp)
> [...]
>
> Since no functions are being called in C inside hyscale_fast_MMX2(),
> the compiler decides it's ok to use -0x8(%rsp) instead of properly
> sub'ing rsp, as it supposedly won't get overwritten. But in this case
> we call the mmx2 code inside asm, overwriting -0x8(%rsp). The second
> callq goes to a522, and when run again, it tries to run some random
> code that was the next pointer on the stack. gcc does the same thing,
> but it seems it leaves -0x8(%rsp) alone and uses the stack -0x10(%rsp)
> and below.
>
> Is this a compiler bug (as in should it detect a call inside asm)?
> Could (or should) we hint to the compiler that a call is being made
> inside the asm block (I don't even know if this is possible)?
I would suggest that you ask Intel (after checking the manual).
It's surely possible to work around this in various ways, but this
feels unclean.
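For illustration, one such workaround might be to save the 8 bytes at -8(%rsp) before the call and put them back afterwards, since that is the slot the pushed return address overwrites. This is only a sketch under assumptions: the names are hypothetical, the local label stands in for the called code, and it is only safe if the called code pushes nothing further onto the stack. The saved value is kept in a register here so the compiler cannot place it in the very slot being protected.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: call_with_saved_slot() is a made-up name, not FFmpeg code. */
uint64_t call_with_saved_slot(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t saved; /* holds the 8 bytes that the call will overwrite */
    __asm__ volatile(
        "mov  -8(%%rsp), %1 \n\t" /* save the slot the call will clobber */
        "call 1f            \n\t" /* writes a return address to -8(%rsp) */
        "jmp  2f            \n\t"
        "1:                 \n\t" /* stand-in for the code being called */
        "add  $1, %0        \n\t"
        "ret                \n\t"
        "2:                 \n\t"
        "mov  %1, -8(%%rsp) \n\t" /* put the original contents back */
        : "+r"(x), "=&r"(saved)
        :
        : "cc", "memory");
#else
    x += 1; /* portable fallback so the sketch builds on other targets */
#endif
    return x;
}

/* Usage check: the value must come back incremented by one. */
void check_call_with_saved_slot(void)
{
    assert(call_with_saved_slot(41) == 42);
}
```

If the called code clobbers registers, the saved value would have to live in memory that is not addressed relative to %rsp instead.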
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides