[FFmpeg-devel] [PATCH 8/9] x86: simple_idct: 12bits versions
Michael Niedermayer
michael at niedermayer.cc
Tue Oct 13 16:03:09 CEST 2015
On Mon, Oct 12, 2015 at 07:37:49PM +0200, Christophe Gisquet wrote:
> On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
> C: 78902 decicycles in idct, 262071 runs, 73 skips
> avx: 32478 decicycles in idct, 262045 runs, 99 skips
>
> Difference between the 2:
> stddev: 0.39 PSNR:104.47 MAXDIFF: 2
>
> This is unavoidable and due to the scale factors used in the x86
> version, which cannot match the C ones.
>
> In addition, the trick of adding an initial bias to the input of a
> pass can overflow, as the input coefficients are already 15bits,
> which is the maximum this function can handle.
>
> Overall, however, the omse on 12 bits samples goes from 0.16916 to
> 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
> overflows.
> ---
> libavcodec/x86/idctdsp_init.c | 22 ++++++++++++++++++++--
> libavcodec/x86/simple_idct.h | 6 ++++++
> libavcodec/x86/simple_idct10.asm | 16 ++++++++++++++++
> 3 files changed, 42 insertions(+), 2 deletions(-)
applied
thanks
[..]
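On the bias overflow mentioned in the quoted description, here is a small sketch of the arithmetic, not the code in simple_idct10.asm. It assumes the coefficients sit in signed 16-bit lanes and reads "15bits" as 15 magnitude bits plus sign, so the input may already span the whole int16_t range. Both points are assumptions. The helper names bias_fits_int16 and max_safe_bias are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if coeff + bias is still representable in a signed 16-bit lane.
 * The sum is formed in 64 bits so the check itself cannot overflow. */
static bool bias_fits_int16(int16_t coeff, int32_t bias)
{
    int64_t sum = (int64_t)coeff + (int64_t)bias;
    return sum >= INT16_MIN && sum <= INT16_MAX;
}

/* Largest non-negative bias that can be added to coeff without leaving
 * the int16_t range. For coeff == INT16_MAX the headroom is zero. */
static int32_t max_safe_bias(int16_t coeff)
{
    return (int32_t)INT16_MAX - (int32_t)coeff;
}
```

With a full-range input there is no headroom at all: max_safe_bias(INT16_MAX) is 0, so any positive rounding bias added before the pass would wrap or saturate, depending on the instruction used.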
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
You can kill me, but you cannot change the truth.