[FFmpeg-devel] [PATCH 5/9] x86: simple_idct10_template: fix overflow in pass
Michael Niedermayer
michael at niedermayer.cc
Tue Oct 13 13:10:28 CEST 2015
On Tue, Oct 13, 2015 at 09:01:44AM +0200, Christophe Gisquet wrote:
> Hi,
>
> 2015-10-13 2:26 GMT+02:00 Michael Niedermayer <michael at niedermayer.cc>:
> > On Mon, Oct 12, 2015 at 07:37:46PM +0200, Christophe Gisquet wrote:
> >> When the input of a pass has 15 or 16 bits of precision (in particular
> >> the column pass), the addition of a bias to W4 may lead to overflows
> >> in the input to pmaddwd.
> >>
> >> This requires postponing the addition of the bias until after the first
> >> butterfly. To do so, the fact that m15 is zeroed but otherwise unused is
> >> exploited. In case the pass is safe, an address can be directly used,
> >> and the number of xmm regs can be decreased. Otherwise, the 32-bit bias
> >> is loaded into it.
> >> ---
> >> libavcodec/x86/proresdsp.asm | 8 ++++----
> >> libavcodec/x86/simple_idct10_template.asm | 13 ++++++++++++-
> >> 2 files changed, 16 insertions(+), 5 deletions(-)
> >
> > how can i reproduce these overflows ?
>
> Generate the vsynth3-dnxhd-1080i-10bit.mov added after another patch.
>
> Decode it first using faani (you could miss the error).
>
> Now, for the parameters that fail. You know how
> (1<<(%pass_bitdepth-1))/W4 is added to the first butterfly. The macro
> allows passing the right pw_ to it (essentially times 4 dw
> 1<<(%pass_bitdepth-1-14)), or "", in which case it expects to find a
> pd_round_%pass_bitdepth (essentially times 4 dd
> 1<<(%pass_bitdepth-1)). This is indicated in the comments of the
> template: "Adding 1<<(%2-1) for >=15 bits values".
hmm, I am a bit concerned that adding the rounder (which effectively is
0.5) causes an overflow; that would, if I am not mistaken, imply that
things are very close to overflowing already without it.
But either way this patch is needed for the 10bit IDCT code,
so applied
thanks
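
For illustration only, a minimal C sketch of why the placement of the
rounder matters. It models just the W4 term of a single pmaddwd lane and
takes W4 = 1<<14 from the expressions quoted above. The function names are
hypothetical and do not appear in the template, and the assumption that the
early bias behaves like a wrapping 16-bit add (as paddw does) is mine.

```c
#include <stdint.h>

/* Assumption from the quoted mail: W4 == 1 << 14, so a dword rounder of
 * 1 << (shift - 1) equals a word bias of 1 << (shift - 1 - 14) that is
 * added to the 16-bit input before the multiply. Requires shift >= 15. */
#define W4_LOG2 14
#define W4      (1 << W4_LOG2)

/* Wrap a value to signed 16 bits, as a packed 16-bit add would. */
static int32_t wrap16(int32_t x)
{
    uint32_t v = (uint32_t)x & 0xFFFFu;
    return v >= 0x8000u ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Early bias: rounder folded into the 16-bit input, then multiplied by W4.
 * The biased input can wrap when the input uses 15 or 16 bits. */
static int32_t dc_bias_before_mul(int16_t in, int shift)
{
    int32_t word_bias = 1 << (shift - 1 - W4_LOG2);
    return wrap16((int32_t)in + word_bias) * W4;
}

/* Late bias: multiply first, then add the 32-bit rounder to the product. */
static int32_t dc_bias_after_mul(int16_t in, int shift)
{
    return (int32_t)in * W4 + (1 << (shift - 1));
}

/* Largest positive input for which the early bias does not wrap. */
static int32_t max_word_input_for_early_bias(int shift)
{
    return INT16_MAX - (1 << (shift - 1 - W4_LOG2));
}
```

With a hypothetical pass shift of 19 the word bias is 16, so only inputs
above 32751 wrap when biased early: dc_bias_before_mul(32760, 19) turns
negative while dc_bias_after_mul(32760, 19) stays positive. For a small
input such as 100 both functions return the same value. The early bias
therefore only fails in the top few values of the 16-bit range.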
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire