[FFmpeg-devel] Patch: Inline asm fixes for Intel compiler on Windows
Michael Niedermayer
michaelni at gmx.at
Sun Apr 6 15:29:39 CEST 2014
On Sun, Apr 06, 2014 at 12:21:54AM +1100, Matt Oliver wrote:
> > If you do not initialize "bit", making it an output-only paramenter,
> > you should use =&q instead of +q as constraint.
> > However neither is correct as-is, setae sets only a single byte,
> > thus not initializing it will result in uninitialized data,
> > IOW the code as is is broken.
> > If you want to do that kind of change, you would have to use uint8_t
> > instead as type.
>
>
> Interestingly both compilers I tested with didnt show any errors and had
> the upper bits zeroed. However as the function returns a 32b int and any
> code that calls this function would also assume 32b then a xor to set to 0
> initially is still probably the way to go. Using uint8 would save that
> instruction but would add one later to clear the upper bytes should it ever
> be used for anything larger than 1B. So fixed to add back the set to 0.
> vp56_arith.h | 13 +++++--------
> 1 file changed, 5 insertions(+), 8 deletions(-)
> 7a259454956d08f2c3e8375715c253a3327780d4 Remove-leal-op-to-fix-icl-inline-asm.patch
> From 995c066d7b48d2a6603bb46649badba74e757dd5 Mon Sep 17 00:00:00 2001
> From: Matt Oliver <protogonoi at gmail.com>
> Date: Sun, 6 Apr 2014 00:11:21 +1100
> Subject: [PATCH] Remove leal op to fix icl inline asm.
>
> ---
> libavcodec/x86/vp56_arith.h | 13 +++++--------
> 1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/libavcodec/x86/vp56_arith.h b/libavcodec/x86/vp56_arith.h
> index e71dbf8..6cc97dd 100644
> --- a/libavcodec/x86/vp56_arith.h
> +++ b/libavcodec/x86/vp56_arith.h
> @@ -29,24 +29,21 @@
> static av_always_inline int vp56_rac_get_prob(VP56RangeCoder *c, uint8_t prob)
> {
> unsigned int code_word = vp56_rac_renorm(c);
> - unsigned int high = c->high;
> - unsigned int low = 1 + (((high - 1) * prob) >> 8);
> + unsigned int low = 1 + (((c->high - 1) * prob) >> 8);
> unsigned int low_shift = low << 16;
> int bit = 0;
> + c->code_word = code_word;
>
> __asm__(
> "subl %4, %1 \n\t"
> "subl %3, %2 \n\t"
> - "leal (%2, %3), %3 \n\t"
> "setae %b0 \n\t"
> "cmovb %4, %1 \n\t"
> - "cmovb %3, %2 \n\t"
> - : "+q"(bit), "+r"(high), "+r"(code_word), "+r"(low_shift)
> - : "r"(low)
> + "cmovb %5, %2 \n\t"
> + : "+q"(bit), "+r"(c->high), "+r"(c->code_word)
> + : "r"(low_shift), "r"(low), "r"(code_word)
> );
>
> - c->high = high;
> - c->code_word = code_word;
this patch breaks fate-vp6a-skip_alpha, fate-vp6f and fate-vp6a
possibly also others, didnt test with -k
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20140406/e9d9163a/attachment.asc>
More information about the ffmpeg-devel
mailing list