[FFmpeg-devel] [PATCH] lavc/x86/videodsp: Fix clobbered FPU state on x86-32

Mon Nov 11 11:02:57 EET 2024

On 10/11/2024 23:57, James Almer wrote:
> On 11/10/2024 3:38 PM, Frank Plowman wrote:
>> These assembly optimisations can use MMX.  They failed to reset the
>> floating-point state when they are done, hence subsequent floating-point
>> operations return nonsense values.
>>
>> This fixes the FATE failure for vvc-output-ref on x86-32, e.g.
>> https://fate.ffmpeg.org/report.cgi?slot=x86_32-uubuntu-mingw32-gcc&time=20241110053421
>>
>> Signed-off-by: Frank Plowman <post at frankplowman.com>
>> ---
>>   libavcodec/x86/videodsp.asm | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/libavcodec/x86/videodsp.asm b/libavcodec/x86/videodsp.asm
>> index 3cc07878d3..6144f13fca 100644
>> --- a/libavcodec/x86/videodsp.asm
>> +++ b/libavcodec/x86/videodsp.asm
>> @@ -313,6 +313,7 @@ cglobal emu_edge_vfix %+ %%n, 1, 5, 1, dst, src, start_y, end_y, bh
>>       jnz .bottom_loop                            ; }
>>     .end:
>> +    emms
> 
> This needs to be added only for the MMX version, not the SSE one, so
> wrap it in a %if mmsize == 8 check.
> 
I thought the same before submitting this patch and tried adding the
line conditionally on mmsize, but neither a) wrapping it in an
mmsize == 8 block nor b) wrapping it in an mmsize != 8 block worked on
its own.  Seemingly the emms cannot be made conditional on mmsize.  I'm
not quite sure of the mechanism behind this; there are a lot of
preprocessor macros to untangle in {READ,WRITE}_NUM_BYTES.
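
For concreteness, variant a) has roughly the shape below.  This is only
a sketch: in the tree, mmsize is set by x86inc's INIT_MMX/INIT_XMM, so
it is defined by hand here to let the fragment assemble on its own, and
the label emms_if_mmx is a made-up name for illustration, not something
in videodsp.asm.

```nasm
; Sketch only. In FFmpeg, x86inc sets mmsize (8 after INIT_MMX,
; 16 after INIT_XMM); it is hard-coded here as an assumption so
; that the fragment is self-contained.
%define mmsize 8

section .text
global emms_if_mmx              ; hypothetical label, not from videodsp.asm

emms_if_mmx:
%if mmsize == 8                 ; true only when assembling the MMX version
    emms                        ; mark the x87 registers empty again after MMX use
%endif
    ret
```

With mmsize defined as 16 the %if body is dropped at assembly time, so
no emms is emitted for that instantiation at all.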

Cheers,
Frank