[FFmpeg-devel] [PATCH 2/3] Indeo 5 decoder: common DSP functions
Kostya
kostya.shishkov
Sun Jan 17 15:58:23 CET 2010
On Sun, Jan 17, 2010 at 02:35:16PM +0100, Michael Niedermayer wrote:
> On Sun, Jan 17, 2010 at 03:06:40PM +0200, Kostya wrote:
> > On Sun, Jan 10, 2010 at 01:22:17PM +0200, Kostya wrote:
> > > On Sat, Jan 09, 2010 at 05:43:40PM +0200, Kostya wrote:
> > > > On Sat, Jan 09, 2010 at 03:47:39PM +0100, Michael Niedermayer wrote:
> > > > > On Sat, Jan 09, 2010 at 04:40:30PM +0200, Kostya wrote:
> > > > > > On Fri, Jan 08, 2010 at 11:41:23PM +0100, Michael Niedermayer wrote:
> > > > > > > On Sun, Jan 03, 2010 at 12:56:36PM +0200, Kostya wrote:
> > > > > > > [...]
> > > > > > > > void ff_ivi_recompose53(const IVIPlaneDesc *plane, uint8_t *dst,
> > > > > > [function body skipped]
> > > > > > >
> > > > > > > is this mess faster than some more readable variant?
> > > > > >
> > > > > > Here's more readable variant by me, checked to be bitexact but it's
> > > > > > significantly slower (> 10%), I'd rather leave old one.
> > > > >
> > > > > I also prefer speed, what about an implementation using lifting?
> > > >
> > > > I'll try to implement it.
> > >
> > > Hmm, after some experiments I'd rather leave original version.
> > > Even grouping variables together in array gives significant performance
> > > drop. And pure lifting transform is not applicable here either because
> > > band data is grouped and it will take at least two passes (hor/vert)
> > > with conditions for missing bands and requires an additional temp
> > > buffer.
> >
> > I've tried reusing Snow wavelet composing there. It was several percents
> > slower which is not surprising because it needs an intermediate buffer
> > there and Indeo5 code does not need to check for odd dimensions.
>
> the intermediate buffer is avoidable, it can be done as part of the transform
> between horizontal & vertical transform.
> is it faster without that transform?
It's not avoidable - this scaling cannot modify its input, since the
input is used for further decoding, and the output is uint8_t, which is
simply not wide enough to hold intermediate values.
In theory it should be faster, but unfortunately it is not directly
applicable here.
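
To illustrate the point about intermediate values, here is a minimal
one-dimensional sketch. It uses the generic LeGall 5/3 lifting scheme,
which is an assumption made for illustration and not the actual Indeo 5
filter; the names fwd53 and inv53 are hypothetical. In a two-pass
(horizontal, then vertical) recomposition, the result of the first pass
still consists of coefficients like low[] and high[] below, which are
signed and can fall outside 0..255.

```c
#include <stdint.h>

/* Generic LeGall 5/3 lifting, NOT the Indeo 5 filter.
 * n must be even and >= 2. Right shifts of negative values are
 * assumed to be arithmetic. */

/* Forward step: split n samples into n/2 low and n/2 high coefficients. */
static void fwd53(const uint8_t *x, int16_t *low, int16_t *high, int n)
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        /* predict: odd sample minus the mean of its even neighbours
         * (the right neighbour is mirrored at the edge) */
        int right = 2 * i + 2 < n ? x[2 * i + 2] : x[2 * i];
        high[i] = x[2 * i + 1] - ((x[2 * i] + right) >> 1);
    }
    for (int i = 0; i < half; i++) {
        /* update: even sample plus a quarter of the neighbouring details
         * (the left detail is mirrored at the edge) */
        int left = high[i > 0 ? i - 1 : 0];
        low[i] = x[2 * i] + ((left + high[i] + 2) >> 2);
    }
}

/* Inverse step: undo the update, then undo the predict. The coefficients
 * are signed 16-bit; only the final samples fit into uint8_t again. */
static void inv53(const int16_t *low, const int16_t *high, uint8_t *x, int n)
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        int left = high[i > 0 ? i - 1 : 0];
        x[2 * i] = low[i] - ((left + high[i] + 2) >> 2);
    }
    for (int i = 0; i < half; i++) {
        int right = 2 * i + 2 < n ? x[2 * i + 2] : x[2 * i];
        x[2 * i + 1] = high[i] + ((x[2 * i] + right) >> 1);
    }
}
```

For the input {255, 0, 255, 0} the high-band coefficients are negative,
so they could not be kept in a uint8_t plane between the two passes.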
> [...]
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB