[FFmpeg-devel] [PATCH] SPARC VIS simple_idct
Michel Lespinasse
walken
Sat Aug 25 08:02:29 CEST 2007
On Sat, Aug 25, 2007 at 01:20:45AM +0200, Balatoni Denes wrote:
> ps: there is a half as fast version of this idct, but that's
> accurate (32 bit multiplies) - I am wondering if maybe that would
> make more sense in ffmpeg. Or there could be three sparc idcts: one
> slow (but faster than the C version) and accurate, one faster but
> less accurate, and then the mlib (fastest, very inaccurate, mpeg4
> routinely turns pink while viewing). Maybe it's not a good idea.
One can make an accurate-enough IDCT using VIS; the 8x16-bit multiplies
make that very hard, but not quite impossible.
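To illustrate why 8x16 multiplies are awkward, here is a minimal C sketch
of building the high half of a 16x16 product from two 8x16 partial
products. The helper names (mul8su_x16, mul8ul_x16, mul16_hi) are
hypothetical, and the rounding is only my assumption of how the VIS
partial multiplies behave; this is not the code from the attachment. Each
partial product is rounded separately, so the result can be off by one
from the exact (a*b)>>16, which is where the accuracy trouble comes from.

```c
#include <stdint.h>

/* Hypothetical model of an 8x16 "signed upper byte" multiply:
 * the signed high byte of a times the signed 16-bit b, with the
 * 24-bit product rounded down to 16 bits.
 * Assumes >> on negative ints is an arithmetic shift. */
static int16_t mul8su_x16(int16_t a, int16_t b)
{
    int32_t hi = a >> 8;                 /* signed upper 8 bits of a */
    return (int16_t)((hi * b + 0x80) >> 8);
}

/* Hypothetical model of an 8x16 "unsigned lower byte" multiply:
 * the unsigned low byte of a times the signed 16-bit b, keeping
 * only the rounded contribution to the high 16 bits. */
static int16_t mul8ul_x16(int16_t a, int16_t b)
{
    int32_t lo = a & 0xff;               /* unsigned lower 8 bits of a */
    return (int16_t)((lo * b + 0x8000) >> 16);
}

/* Approximate (a * b) >> 16 from the two partial products.
 * Both halves are rounded independently, so the sum may differ
 * from the exact result by +/-1. */
static int16_t mul16_hi(int16_t a, int16_t b)
{
    return (int16_t)(mul8su_x16(a, b) + mul8ul_x16(a, b));
}
```

For example, mul16_hi(-32768, 32767) gives -16383 in this model, while the
exact (a*b)>>16 is -16384; an IDCT has to keep such errors from piling up.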
I finally dug out the old test code I wrote in September 2003. At the
time David Miller was interested in converting it to asm, but then he
got busy with other things. It works on columns, then transposes the
table and does another pass. David thought he could make it faster
than the VIS one, but hey, talk is cheap :) The test code is in C, but
I believe the muls/mulu functions match what VIS implements. idct() is
what you'd want to be fast; idct_vis() is the C function I used to hook
this into my IEEE1180 test framework. It reorders the input coefficients
and preshifts them by 4 (libmpeg2 prescales the IDCT during stream parsing).
I don't know if there is any interest, or how it compares with the
simple-idct derived code, but it is (barely) IEEE1180 compliant and does
not use 32-bit multiplies.
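The "columns, transpose, columns again" structure can be sketched as
below. This is a floating-point reference version written only to show
the pass ordering; it is not the attached idct_vis.c, which uses 16-bit
fixed point. All names here (cos_k16, idct_cols, transpose, idct_2d) are
made up for the example, and the final transpose is my own addition to
restore the orientation (the real code may fold that into its input
reordering instead).

```c
#define N 8

/* cos(k * pi / 16) for any k >= 0, from a small constant table,
 * so the sketch does not depend on libm. */
static double cos_k16(int k)
{
    static const double tab[9] = {
        1.0,                0.9807852804032304, 0.9238795325112867,
        0.8314696123025452, 0.7071067811865476, 0.5555702330196022,
        0.3826834323650898, 0.1950903220161283, 0.0
    };
    k %= 32;                      /* cosine has period 2*pi        */
    if (k > 16)
        k = 32 - k;               /* cos(2*pi - x) == cos(x)       */
    return (k > 8) ? -tab[16 - k] : tab[k];  /* cos(pi - x) == -cos(x) */
}

/* 1-D inverse DCT applied to every column of the block. */
static void idct_cols(double blk[N][N])
{
    for (int c = 0; c < N; c++) {
        double out[N];
        for (int x = 0; x < N; x++) {
            double sum = 0.0;
            for (int u = 0; u < N; u++) {
                /* orthonormal scaling: sqrt(1/8) for DC, 1/2 otherwise */
                double cu = (u == 0) ? 0.35355339059327373 : 0.5;
                sum += cu * blk[u][c] * cos_k16((2 * x + 1) * u);
            }
            out[x] = sum;
        }
        for (int x = 0; x < N; x++)
            blk[x][c] = out[x];
    }
}

/* In-place transpose of the 8x8 block. */
static void transpose(double blk[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = i + 1; j < N; j++) {
            double t = blk[i][j];
            blk[i][j] = blk[j][i];
            blk[j][i] = t;
        }
}

/* 2-D IDCT: column pass, transpose, column pass again. */
static void idct_2d(double blk[N][N])
{
    idct_cols(blk);
    transpose(blk);
    idct_cols(blk);
    transpose(blk);   /* restore orientation (assumption, see above) */
}
```

With only the DC coefficient set to 8, every output sample comes out as
1.0, which is a quick sanity check on the scaling.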
Hope this helps,
--
Michel "Walken" Lespinasse
"Bill Gates is a monocle and a Persian cat away from being the villain
in a James Bond movie." -- Dennis Miller
-------------- next part --------------
A non-text attachment was scrubbed...
Name: idct_vis.c
Type: text/x-csrc
Size: 6416 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070824/70fe61dc/attachment.c>