[MPlayer-dev-eng] MS-ADPCM/Stereo Works
Michael Niedermayer
michaelni at gmx.at
Fri Dec 28 04:05:07 CET 2001
Hi
On Friday 28 December 2001 03:36, Mike Melanson wrote:
> On Fri, 28 Dec 2001, Michael Niedermayer wrote:
> > > but they work on small, fixed-size fragments (blocks, chunks), don't they?
> > > so they could be decoded in parallel.
> >
> > i thought about the same ... interleave a few blocks so that mmx has only
> > independent stuff ...
>
> I hadn't thought of this. True, in the IMA variant, blocks are
> only 32 bytes long, though there are still 64 samples to be decoded. But
> they could be interleaved in such a way as to parallelize the decode of 4
> separate blocks. But that's just for IMA. MS ADPCM has 256-byte blocks and
> the decoder I'm working on tonight has 2048-byte blocks. Realistically,
> though, I doubt that the overhead of (de-)interleaving would defeat the
> purpose.
interleaving bytes is pretty speedy with mmx (only 1 instruction to
interleave 2x4 bytes into 8)
deinterleaving is not quite as fast but still fast (3 instructions, if i counted
correctly, to reverse it)
well ... adpcm works on 4-bit stuff ... that shouldn't be that difficult either
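
As an illustration of the byte interleave described above, here is a minimal scalar C sketch. The function names are hypothetical and not taken from MPlayer. On MMX the interleave direction corresponds to a single punpcklbw instruction; the loops below only show the data movement, not the fast path.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave 4 bytes from a and 4 bytes from b into out as
 * a0 b0 a1 b1 a2 b2 a3 b3 (the same layout punpcklbw produces). */
static void interleave_2x4(const uint8_t a[4], const uint8_t b[4],
                           uint8_t out[8])
{
    size_t i;
    for (i = 0; i < 4; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Reverse the operation: split 8 interleaved bytes back into the
 * two 4-byte streams they came from. */
static void deinterleave_8(const uint8_t in[8], uint8_t a[4], uint8_t b[4])
{
    size_t i;
    for (i = 0; i < 4; i++) {
        a[i] = in[2 * i];
        b[i] = in[2 * i + 1];
    }
}
```

Interleaving the input of two blocks this way would put corresponding bytes of each block next to each other, which is the layout a SIMD decoder would want to consume.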
>
> Still, it's an interesting thought experiment, just to understand
> SIMD instructions.
perhaps it is even possible to do SIMD with ANSI C here :)
all variables seem to fit into 16 bits, so it might be possible to
handle 2 at once (from 2 blocks or 2 channels ... they are independent?)
and the blahblah[channel] thing isn't the fastest either (unnecessary array)
unrolling would perhaps speed it up, if unrolling is possible at all here ...
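
The "handle 2 at once" idea above can be sketched in plain C by packing two 16-bit values into one 32-bit word and adding both lanes with a single operation. This is only a rough sketch under stated assumptions: the helper names are made up for illustration, <stdint.h> is C99 rather than strict ANSI C, and a real ADPCM decoder would also need signed arithmetic and clamping, which this does not attempt.

```c
#include <stdint.h>

/* Pack two independent 16-bit values (e.g. one per channel) into one word. */
static uint32_t pack_2x16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

static uint16_t lane_lo(uint32_t v) { return (uint16_t)(v & 0xFFFFu); }
static uint16_t lane_hi(uint32_t v) { return (uint16_t)(v >> 16); }

/* Add both 16-bit lanes at once, each wrapping modulo 65536.
 * The top bit of each lane is masked off before the add so that a carry
 * can never cross from the low lane into the high lane; the top bits are
 * then restored with XOR. */
static uint32_t add_2x16(uint32_t x, uint32_t y)
{
    uint32_t sum = (x & 0x7FFF7FFFu) + (y & 0x7FFF7FFFu);
    return sum ^ ((x ^ y) & 0x80008000u);
}
```

Whether this actually beats two separate 16-bit additions depends on the compiler and CPU; the point is only that the two lanes stay independent.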
cool, we are trying to optimize an adpcm decoder which needs perhaps 1% of the
cpu time
Michael