[FFmpeg-devel] [PATCH] av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line()
Pascal Massimino
pascal.massimino at gmail.com
Tue Sep 9 21:58:31 CEST 2014
James,
On Tue, Sep 9, 2014 at 10:31 AM, James Almer <jamrial at gmail.com> wrote:
> On 09/09/14 9:52 AM, Pascal Massimino wrote:
> > + mova m2, m_sum
> > +%if mmsize == 16
> > + psrldq m2, 4
> > + paddd m_sum, m2
> > + psrldq m2, 4
> > + paddd m_sum, m2
> > + psrldq m2, 4
> > + paddd m_sum, m2
> > +%else
> > + psrlq m2, 32
> > + paddd m_sum, m2
> > +%endif
>
> The SSE2 version is using three instructions more than necessary here.
> You could use the HADDD macro to replace the code above, which expands
> to a more optimized SSE2 version.
>
> And now that i check the old stuff again, you could also use it in the
> IDET_FILTER_LINE macro. It will be one less instruction for the mmxext
> version.
>
Oh, right! Let me send you a patch for that...
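
For reference, here is a rough sketch in C with SSE2 intrinsics of the two ways to sum the four 32-bit lanes discussed above. It is not the HADDD macro itself and not the code from the patch. The function names hsum_shift and hsum_fold are made up for illustration, and the sketch assumes an x86 target with SSE2 enabled. The first function mirrors the three psrldq/paddd pairs from the quoted code; the second folds the register in halves, which needs only two shuffle/add pairs.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Shift-and-add reduction, mirroring the three psrldq/paddd pairs
 * in the quoted SSE2 code. The sum ends up in the low 32-bit lane. */
static int32_t hsum_shift(__m128i v)
{
    __m128i t = v;
    t = _mm_srli_si128(t, 4); v = _mm_add_epi32(v, t);
    t = _mm_srli_si128(t, 4); v = _mm_add_epi32(v, t);
    t = _mm_srli_si128(t, 4); v = _mm_add_epi32(v, t);
    return _mm_cvtsi128_si32(v);
}

/* Fold-in-halves reduction (hypothetical stand-in for what a
 * horizontal-add helper could do): add the high 64 bits onto the
 * low 64 bits, then add lane 1 onto lane 0. */
static int32_t hsum_fold(__m128i v)
{
    /* Swap the two 64-bit halves: lanes become [2, 3, 0, 1]. */
    __m128i hi = _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2));
    v = _mm_add_epi32(v, hi);      /* lane0 = v0+v2, lane1 = v1+v3 */

    /* Swap the two low 32-bit lanes via a 16-bit word shuffle. */
    __m128i odd = _mm_shufflelo_epi16(v, _MM_SHUFFLE(1, 0, 3, 2));
    v = _mm_add_epi32(v, odd);     /* lane0 = v0+v1+v2+v3 */

    return _mm_cvtsi128_si32(v);
}
```

Both functions leave the total in the low lane, so either result can be read with a single movd-style extract. The upper lanes hold partial sums and should be treated as junk.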