[FFmpeg-devel] [PATCH] SSE RDFT
Jason Garrett-Glaser
darkshikari
Mon Mar 15 01:53:14 CET 2010
On Sun, Mar 14, 2010 at 3:23 PM, Alex Converse <alex.converse at gmail.com> wrote:
> I'm sure I've made some embarrassingly amateurish mistakes here.
> Feedback is more than welcome.
>
> --Alex
In the interests of getting away from discussions about yasm and into
actually reviewing the asm...
+///sign mask of RDFT sine terms
Three slashes?
Looking at the asm overall, it looks like there's a huge amount of
moving stuff around and very little actual calculation. Is there no
better way to organize it?
+ "movlps (%4,%0,4), %%xmm4 \n\t"
+ "unpcklps %%xmm4, %%xmm4 \n\t"
+ "movlps (%5,%0,4), %%xmm3 \n\t"
+ "unpcklps %%xmm3, %%xmm3 \n\t"
This looks like a candidate for movsldup in an SSE3 version.
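For anyone following along, here is a rough scalar model in C of the two lane patterns involved. It is only a sketch: the function names dup_unpcklps and dup_movsldup are made up for illustration and are not part of the patch. It models the instructions rather than executing them. The movlps + unpcklps pair reads two floats and produces {a, a, b, b}, while movsldup reads four floats and duplicates the even-indexed ones, producing {s0, s0, s2, s2}.

```c
#include <stddef.h>

/* Scalar model of the pair from the patch:
 *   movlps   (mem), %xmm   -> low two lanes = {a, b}
 *   unpcklps %xmm,  %xmm   -> {a, a, b, b}
 * Only two floats are read from src. */
void dup_unpcklps(const float *src, float out[4])
{
    out[0] = src[0];
    out[1] = src[0];
    out[2] = src[1];
    out[3] = src[1];
}

/* Scalar model of SSE3 movsldup with a 128-bit source:
 * duplicates the even-indexed lanes -> {s0, s0, s2, s2}.
 * Four floats are read from src. */
void dup_movsldup(const float *src, float out[4])
{
    out[0] = src[0];
    out[1] = src[0];
    out[2] = src[2];
    out[3] = src[2];
}
```

With an input of {1, 2, 3, 4} the first gives {1, 1, 2, 2} and the second gives {1, 1, 3, 3}. So, as far as I can tell, movsldup would not be a one-for-one swap for the pair above; the tables or the code consuming them would presumably have to be arranged around the even-lane pattern. Also note that movsldup with a memory operand needs a 16-byte aligned address.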
Dark Shikari