[FFmpeg-devel] [PATCH 8/9] x86: simple_idct: 12bits versions

Tue Oct 13 16:03:09 CEST 2015

On Mon, Oct 12, 2015 at 07:37:49PM +0200, Christophe Gisquet wrote:
> On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
> C:         78902 decicycles in idct,  262071 runs,     73 skips
> avx:       32478 decicycles in idct,  262045 runs,     99 skips
> 
> Difference between the 2:
> stddev:    0.39 PSNR:104.47 MAXDIFF:    2
> 
> This is unavoidable and due to the scale factors used in the x86
> version, which cannot match the C ones.
> 
> In addition, the trick of adding an initial bias to the input of a
> pass can overflow, as the input coefficients are already 15 bits,
> which is the maximum this function can handle.
> 
> Overall, however, the omse on 12 bits samples goes from 0.16916 to
> 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
> overflows.
> ---
>  libavcodec/x86/idctdsp_init.c    | 22 ++++++++++++++++++++--
>  libavcodec/x86/simple_idct.h     |  6 ++++++
>  libavcodec/x86/simple_idct10.asm | 16 ++++++++++++++++
>  3 files changed, 42 insertions(+), 2 deletions(-)

applied

thanks

[..]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.