[FFmpeg-devel] [PATCH 10/12] WMAPRO: use vector_clipf_interleave()
Måns Rullgård
mans at mansr.com
Tue Sep 29 22:52:29 CEST 2009
Sascha Sommer <saschasommer at freenet.de> writes:
> Hi,
>
> On Sonntag, 27. September 2009, Mans Rullgard wrote:
>> ---
>> libavcodec/wmaprodec.c | 20 ++++++++------------
>> 1 files changed, 8 insertions(+), 12 deletions(-)
>>
>> diff --git a/libavcodec/wmaprodec.c b/libavcodec/wmaprodec.c
>> index a489047..ac559a4 100644
>> --- a/libavcodec/wmaprodec.c
>> +++ b/libavcodec/wmaprodec.c
>> @@ -221,6 +221,7 @@ typedef struct WMAProDecodeCtx {
>> WMAProChannelGrp chgroup[WMAPRO_MAX_CHANNELS]; ///< channel group information
>>
>> WMAProChannelCtx channel[WMAPRO_MAX_CHANNELS]; ///< per channel data
>> + const float *channel_ptr[WMAPRO_MAX_CHANNELS];
>
> In other places the star follows directly after the data type.
> const float* channel_ptr.
True. I don't like that style as it is highly misleading, but I'll
change it to maintain consistency.
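
The usual argument against attaching the star to the type, which is
presumably the one meant here, is that the star binds to the declarator
and not to the type. A minimal illustration, with made-up variable
names that do not appear in wmaprodec.c:

```c
/* Hypothetical names, for illustration only. */

/* Reads as "two float pointers", but only star_a is a pointer:
 * the star belongs to the declarator star_a, not to the type. */
static float* star_a, star_b;

/* Writing the star next to the name makes the real types visible. */
static float *name_a, name_b;
```

In both lines the second variable is a plain float.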
> Also a doxygen comment could be added.
Yes, it could...
>> } WMAProDecodeCtx;
>>
>>
>> @@ -443,6 +444,9 @@ static av_cold int decode_init(AVCodecContext *avctx)
>> for (i = 0; i < 33; i++)
>> sin64[i] = sin(i*M_PI / 64.0);
>>
>> + for (i = 0; i < WMAPRO_MAX_CHANNELS; i++)
>> + s->channel_ptr[i] = s->channel[i].out;
>> +
>> if (avctx->debug & FF_DEBUG_BITSTREAM)
>> dump_context(s);
>>
>> @@ -1331,19 +1335,11 @@ static int decode_frame(WMAProDecodeCtx *s)
>> }
>>
>> /** interleave samples and write them to the output buffer */
>> - for (i = 0; i < s->num_channels; i++) {
>> - float* ptr;
>> - int incr = s->num_channels;
>> - float* iptr = s->channel[i].out;
>> - int x;
>> -
>> - ptr = s->samples + i;
>> -
>> - for (x = 0; x < s->samples_per_frame; x++) {
>> - *ptr = av_clipf(*iptr++, -1.0, 32767.0 / 32768.0);
>> - ptr += incr;
>> - }
>> + s->dsp.vector_clipf_interleave(s->samples, s->channel_ptr,
>> + -1.0, 32767.0 / 32768.0,
>> + s->samples_per_frame, s->num_channels);
>>
>> + for (i = 0; i < s->num_channels; i++) {
>> /** reuse second half of the IMDCT output for the next frame */
>> memcpy(&s->channel[i].out[0],
>> &s->channel[i].out[s->samples_per_frame],
>
> Ok.
Assuming the addition of this function to dsputil is ok, that is.
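
For reference, a plain C sketch of what the call is expected to do,
going by the loop it replaces: clip each planar channel to [min, max]
and store the result interleaved. The name clipf_interleave_ref and the
helper clipf are made up for this example; this is not the dsputil
code, only the argument order is taken from the call in the patch.

```c
/* Hypothetical scalar reference, not the dsputil implementation. */

/* Clamp v to the range [lo, hi]. */
static float clipf(float v, float lo, float hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Clip every sample of each planar channel src[c] to [min, max] and
 * write it interleaved: dst[x * channels + c] = clip(src[c][x]). */
static void clipf_interleave_ref(float *dst, const float *const *src,
                                 float min, float max,
                                 int len, int channels)
{
    for (int c = 0; c < channels; c++) {
        const float *in  = src[c];
        float       *out = dst + c;

        for (int x = 0; x < len; x++) {
            *out = clipf(in[x], min, max);
            out += channels;
        }
    }
}
```

With two channels of three samples each, dst ends up as
L0 R0 L1 R1 L2 R2, each value clipped.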
> P.S: If you want a faster decoder you can try to output int16_t again.
> I don't know if such a patch would be acceptable for ffmpeg, however.
The trend seems to be to have floating-point output directly. Speed
could be improved by optimising audioconvert.c, which is presently
totally devoid of any SIMD. Decoding wmapro to int16 on Cortex-A8
spends, after my patches, 40% of the time there.
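
The per-sample work in a float to int16 conversion is roughly scale,
round and saturate. A scalar sketch under that assumption follows; the
function names are invented here and this is not the audioconvert.c
code, just the kind of loop that SIMD would speed up.

```c
#include <stdint.h>

/* Hypothetical helper: convert one float sample in [-1.0, 1.0) to
 * signed 16-bit, saturating anything outside that range. */
static int16_t float_to_s16(float v)
{
    float scaled = v * 32768.0f;

    if (scaled >= 32767.0f)
        return INT16_MAX;
    if (scaled <= -32768.0f)
        return INT16_MIN;
    /* round to nearest, halfway cases away from zero */
    return (int16_t)(int)(scaled < 0 ? scaled - 0.5f : scaled + 0.5f);
}

/* Hypothetical scalar loop over a buffer of len samples. */
static void convert_flt_to_s16(int16_t *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = float_to_s16(src[i]);
}
```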
> Also the decoder currently always copies the frame data to a tmp buffer to
> avoid problems with damaged streams that might cause overreads.
> This copy should only be needed when a frame crosses a packet boundary.
This should probably be done, although the benefit will be smaller
since it's a simple copy of compressed data. The huge speedups I've
been getting with these patches are due to gcc being exceptionally bad
at floating-point maths.
--
Måns Rullgård
mans at mansr.com