[FFmpeg-devel] [PATCH] lavu/tx: make 32-bit fixed-point transforms more bitexact

Tue Jun 20 11:53:29 EEST 2023

On Tue, 20 Jun 2023, Lynne wrote:

> Using the sqrt/cos/sin approximations we have, the only parts left
> which may be inexact are multiplies and divisions in some transforms.

This seems to help somewhat, but there are still cases of inexactness
somewhere.

The content of the tables that are initialized here does become bitexact
(at least across some of the configs that otherwise disagree with the
reference output), but despite that, the transform output still differs.
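
For anyone who wants to repeat the table comparison, a minimal sketch of
one way to do it is below: hash each table on every machine and compare
the printed values. The helper name table_hash is hypothetical and not
part of lavu/tx; it assumes the table is available as an array of int32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debugging helper, not part of lavu/tx.
 * FNV-1a over the values of an int32_t table. Each value is fed byte by
 * byte in a fixed (least significant first) order, so the result does not
 * depend on host endianness and can be compared across machines. */
static uint32_t table_hash(const int32_t *tab, size_t len)
{
    uint32_t h = 2166136261u;            /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        uint32_t v = (uint32_t)tab[i];
        for (int b = 0; b < 4; b++) {
            h ^= (v >> (8 * b)) & 0xff;
            h *= 16777619u;              /* FNV 32-bit prime */
        }
    }
    return h;
}
```

Printing table_hash(tab, len) right after table init on two configs and
getting equal values, while the final output still differs, would point
at the transform arithmetic rather than the table setup.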

With the test references generated on linux/x86_64 compiled with GCC, run 
on an Intel CPU, I get the following set of machines that either agree or 
disagree with the reference:

matching
- linux x86_64 gcc11 Intel
- linux aarch64 gcc12 on Apple M1
- linux aarch64 clang10 Neoverse N1
- linux aarch64 gcc9 Neoverse N1
- linux armv7 gcc9 Neoverse N1

disagreeing
- macos x86_64 clang Xcode14 Intel
- mingw x86_64 clang trunk Dragonboard
- macos aarch64 clang Xcode12 Apple M1
- macos aarch64 clang Xcode14 Apple M1
- linux i686 gcc11 Intel
- mingw aarch64 clang trunk Dragonboard
- linux aarch64 gcc7 Dragonboard
- mingw armv7 clang trunk Dragonboard
- mingw i686 clang trunk Intel
- mingw i686 clang trunk -march=i686 Intel

The configs that are easiest to reproduce are probably the ones on macOS 
on Apple M1, or macOS on x86_64 if you happen to have access to that, or 
GCC/i686 on Linux (just configure with --extra-cflags=-m32 
--extra-ldflags=-m32).
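
One thing that might be worth checking on such a build (this is only a
guess at a contributing factor, not something I have confirmed) is how
the compiler evaluates floating-point expressions. A small sketch using
the standard C99 FLT_EVAL_METHOD macro from <float.h>; the helper names
are made up for illustration:

```c
#include <float.h>
#include <stdio.h>

/* Describe a FLT_EVAL_METHOD value as defined by C99 <float.h>.
 * Hypothetical helper for illustration only. */
static const char *eval_method_desc(int m)
{
    switch (m) {
    case 0:  return "operations evaluated in the precision of their type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}

/* Print the evaluation method the compiler chose for this build. */
static void print_eval_method(void)
{
    printf("FLT_EVAL_METHOD = %d (%s)\n",
           (int)FLT_EVAL_METHOD, eval_method_desc((int)FLT_EVAL_METHOD));
}
```

Calling print_eval_method() from a small test program built with the
same flags as the failing config shows whether intermediate results are
kept in a wider type than on the reference build.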

// Martin