[FFmpeg-devel] [PATCH] avcodec/x86/hevc: fix luma 12b overflow

Ronald S. Bultje rsbultje at gmail.com
Mon Feb 26 01:00:10 EET 2024


Hi,

On Sun, Feb 25, 2024 at 5:30 PM Henrik Gramner via ffmpeg-devel <
ffmpeg-devel at ffmpeg.org> wrote:

> On Sun, Feb 25, 2024 at 5:42 PM Ronald S. Bultje <rsbultje at gmail.com>
> wrote:
> > +    mova            m13, [pw_8]
> > +    paddw           m10, m12, m12
> > +    paddw           m12, m10 ; 9 * (q0 - p0) - 3 * ( q1 - p1 )
> >      paddw           m12, m13; + 8
>
> Memory operand: use [pw_8] directly in the paddw instead of loading it
> into m13 first.
>
> > +    paddw           m10, m13, m13
> > +    paddw           m13, m10 ; abs(9 * (q0 - p0) - 3 * ( q1 - p1 ))
> > +    paddw           m13, [pw_8]
> [...]
> > +    paddw           m13, m12, m12
> > +    paddw           m13, m12 ; 3*abs(m12)
> > +    paddw           m13, [pw_8]
>
> Another minor improvement would be to reorder the adds like (x + x) +
> (x + 8) instead of ((x + x) + x) + 8 to allow for more
> instruction-level parallelism.
>

New version attached.
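
For readers following the reordering suggestion quoted above, here is a minimal scalar sketch in C of one 16-bit lane. It is not code from the patch. The helper names (paddw_lane, mul3_plus8_serial, mul3_plus8_reordered, verify_orderings_match) are hypothetical, and the model assumes paddw behaves as a wrapping 16-bit add per lane. It only illustrates that (x + x) + (x + 8) gives the same result as ((x + x) + x) + 8 while shortening the dependency chain from three adds to two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of paddw (wrapping add).
 * The final cast to int16_t relies on the usual two's-complement
 * conversion that mainstream compilers implement. */
static int16_t paddw_lane(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Serial form: ((x + x) + x) + 8. Each add waits on the previous one. */
static int16_t mul3_plus8_serial(int16_t x)
{
    int16_t t = paddw_lane(x, x);   /* 2x                  */
    t = paddw_lane(t, x);           /* 3x, depends on 2x   */
    return paddw_lane(t, 8);        /* 3x + 8, depends on 3x */
}

/* Reordered form: (x + x) + (x + 8). The first two adds do not depend
 * on each other, so they can issue in the same cycle. */
static int16_t mul3_plus8_reordered(int16_t x)
{
    int16_t a = paddw_lane(x, x);   /* 2x                        */
    int16_t b = paddw_lane(x, 8);   /* x + 8, independent of 2x  */
    return paddw_lane(a, b);        /* 3x + 8                    */
}

/* Exhaustively check that both orderings agree for every 16-bit input,
 * including inputs where the intermediate sums wrap. */
static void verify_orderings_match(void)
{
    for (int32_t x = INT16_MIN; x <= INT16_MAX; x++)
        assert(mul3_plus8_serial((int16_t)x) ==
               mul3_plus8_reordered((int16_t)x));
}
```

Because wrapping addition is associative and commutative, the two forms agree on every input; the gain is purely in scheduling, not in the value computed.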

Ronald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-hevc-x86-deblock-fix-12bit-overflow.patch
Type: application/octet-stream
Size: 2299 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240225/ad931d9f/attachment.obj>