[FFmpeg-devel] [PATCH 2/5] lavfi/gradfun: fix rounding in MMX code.
Clément Bœsch
ubitux at gmail.com
Tue Dec 18 02:31:24 CET 2012
On Fri, Dec 07, 2012 at 11:49:40PM +0000, Loren Merritt wrote:
> On Fri, 7 Dec 2012, Reimar Döffinger wrote:
> > On 7 Dec 2012, at 16:28, Loren Merritt <lorenm at u.washington.edu> wrote:
> >> On Fri, 7 Dec 2012, Reimar Döffinger wrote:
> >>> "Clément Bÿÿsch" <ubitux at gmail.com> wrote:
> >>>
> >>>> Current code divide before increasing precision.
> >>>
> >>> Are you sure the shift can _never_ overflow? It is definitely very tight.
> >>
> >> I expect it can overflow, but the case where it does so also has m=0 for
> >> sane filter strengths. So the appropriate fix would be to limit strength,
> >> not change the asm.
> >
> > Since the proposal is to change the asm, do you mean the patch is right, it just should be combined with limiting the strength?
>
> Yes.
>
> > I have no objections about that, I just kind of assumed there was a good reason for doing things in this order.
>
> The original order probably was about avoiding overflow, but limiting
> strength is better.
>
Just to make sure I understand the issue correctly:
In the ASM, delta is expressed as u16, so delta<<2 will overflow for
values > 0x3fff. In these cases we want that m, defined by:
int m = abs(delta) * thresh >> 16; (1)
m = FFMAX(0, 127 - m);
...to be 0, so whatever the value of delta, the expression:
m = m * m * delta >> 14;
...will always be 0.
To do so, we just need to make sure that abs(delta) * thresh >> 16 to be
always >= 127.
We currently have the threshold in the range from (1<<15)/0.51 to
(1<<15)/255, with [0.51;255] being the range the user can specify. The
higher the user specified threshold, the closer we get to 127 (from +oo).
We need to solve:
delta * ((1<<15)/x) >> 16 >= 127 (based on (1)),
with delta > 0x3fff and x >= 0.51
=> delta * (1<<15)/x / (1<<16) >= 127
delta / (2x) >= 127
so we have 0.51 <= x <= 2 * delta / 127
The smallest overflowing delta possible is 0x3fff, so x must be <=
2*0x3fff/127 = 258.
258 looks in the current [0.51;255] user range.
What am I missing?
Note: it's highly probable I failed at math again.
--
Clément B.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20121218/e360e281/attachment.asc>
More information about the ffmpeg-devel
mailing list