[FFmpeg-devel] swscale-test segfault with 64-bit icc 11.1
Ramiro Polla
ramiro.polla
Mon Jul 19 23:48:03 CEST 2010
On Sat, Jul 17, 2010 at 11:24 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Sat, Jul 17, 2010 at 04:50:10PM -0300, Ramiro Polla wrote:
>> swscale-test segfaults when built with 64-bit icc 11.1 (20100414). The
>> function that fails is hyscale_fast_MMX2(). Here's a disassembly of
>> the function:
>>     a4b0:       53                      push   %rbx
>>     a4b1:       48 8b 87 c8 30 00 00    mov    0x30c8(%rdi),%rax
>>     a4b8:       4c 8b 9f a8 30 00 00    mov    0x30a8(%rdi),%r11
>>     a4bf:       48 89 74 24 d8          mov    %rsi,-0x28(%rsp)
>>     a4c4:       45 89 ca                mov    %r9d,%r10d
>>     a4c7:       48 89 54 24 e0          mov    %rdx,-0x20(%rsp)
>>     a4cc:       41 f7 da                neg    %r10d
>>     a4cf:       83 bf 10 31 00 00 00    cmpl   $0x0,0x3110(%rdi)
>>     a4d6:       48 89 4c 24 e8          mov    %rcx,-0x18(%rsp)
>>     a4db:       48 89 44 24 d0          mov    %rax,-0x30(%rsp)
>>     a4e0:       48 8b 87 00 31 00 00    mov    0x3100(%rdi),%rax
>>     a4e7:       4c 89 5c 24 f0          mov    %r11,-0x10(%rsp)
>>     a4ec:       48 89 44 24 f8          mov    %rax,-0x8(%rsp)
>>     a4f1:       0f 84 05 01 00 00       je     a5fc <hyscale_fast_MMX2+0x14c>
>>     a4f7:       0f ef ff                pxor   %mm7,%mm7
>>     a4fa:       48 8b 4c 24 e8          mov    -0x18(%rsp),%rcx
>>     a4ff:       48 8b 7c 24 d8          mov    -0x28(%rsp),%rdi
>>     a504:       48 8b 54 24 f0          mov    -0x10(%rsp),%rdx
>>     a509:       48 8b 5c 24 d0          mov    -0x30(%rsp),%rbx
>>     a50e:       48 31 c0                xor    %rax,%rax
>>     a511:       0f 18 01                prefetchnta (%rcx)
>>     a514:       0f 18 41 20             prefetchnta 0x20(%rcx)
>>     a518:       0f 18 41 40             prefetchnta 0x40(%rcx)
>>     a51c:       8b 33                   mov    (%rbx),%esi
>>     a51e:       ff 54 24 f8             callq  *-0x8(%rsp)
>>     a522:       8b 34 03                mov    (%rbx,%rax,1),%esi
>>     a525:       48 01 f1                add    %rsi,%rcx
>>     a528:       48 01 c7                add    %rax,%rdi
>>     a52b:       48 31 c0                xor    %rax,%rax
>>     a52e:       8b 33                   mov    (%rbx),%esi
>>     a530:       ff 54 24 f8             callq  *-0x8(%rsp)
>> [...]
>>
>> Since no functions are being called in C inside hyscale_fast_MMX2(),
>> the compiler decides it's ok to use -0x8(%rsp) instead of properly
>> sub'ing rsp, as it supposedly won't get overwritten. But in this case
>> we call the mmx2 code inside asm, overwriting -0x8(%rsp). The second
>> callq goes to a522, and when run again, it tries to run some random
>> code that was the next pointer on the stack. gcc does the same thing,
>> but it seems it leaves -0x8(%rsp) alone and uses the stack -0x10(%rsp)
>> and below.
>>
>> Is this a compiler bug (as in should it detect a call inside asm)?
>> Could (or should) we hint to the compiler that a call is being made
>> inside the asm block (I don't even know if this is possible)?
>
> I would suggest that you ask Intel (after checking the manual).
> It's surely possible to work around this in various ways, but this
> feels unclean.

I was able to reproduce the bug with gcc, but the gcc developers haven't
been very helpful, nor have they acknowledged it as a bug:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44975

I vaguely remember an Intel compiler developer being on this list (or
something similar); maybe he could comment on the issue?

Otherwise I think the best solution is to switch to yasm like mans
suggested, but that would involve work and I'm not volunteering =).
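
For reference, here is a minimal sketch of one possible workaround. It is
not what swscale does, and it is not a tested patch. The area the compiler
is using below %rsp is the 128-byte "red zone" that the x86-64 System V ABI
grants to leaf functions, and a call inside an asm block pushes its return
address right into it. The sketch moves %rsp past the red zone before
calling and restores it afterwards. The names tiny_routine and call_twice
are made up for illustration; tiny_routine merely stands in for the
runtime-generated MMX2 code. The function pointer is passed in a register
("r"), because an %rsp-relative memory operand would point to the wrong
place once %rsp has been adjusted.

```c
#if defined(__x86_64__) && defined(__GNUC__)

/* Hypothetical stand-in for the generated scaler code: adds 1 to %rax. */
__asm__(
    ".text\n"
    "tiny_routine:\n"
    "    addq $1, %rax\n"
    "    ret\n"
);
extern void tiny_routine(void) __asm__("tiny_routine");

/* Calls tiny_routine twice from inside one asm block. */
long call_twice(long x)
{
    void (*fn)(void) = tiny_routine;

    __asm__ volatile(
        "sub  $128, %%rsp\n\t"  /* step over the red zone             */
        "call *%1\n\t"          /* return address lands below it      */
        "call *%1\n\t"
        "add  $128, %%rsp\n\t"  /* restore the original stack pointer */
        : "+a"(x)               /* x travels in %rax                  */
        : "r"(fn)               /* pointer in a register, not on stack */
        : "cc", "memory"
    );
    return x;
}

#else

/* Fallback for other targets: same result, no asm involved. */
long call_twice(long x)
{
    return x + 2;
}

#endif
```

With this, call_twice(40) should give 42. The other obvious option is to
build the affected file with -mno-red-zone, which tells gcc not to use the
red zone at all; I don't know whether icc has an equivalent switch.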
Ramiro Polla