[FFmpeg-devel] [PATCH] avcodec/alac: also use a temp buffer for 24bit samples
Paul B Mahol
onemda at gmail.com
Tue Oct 6 22:33:42 CEST 2015
On 10/6/15, James Almer <jamrial at gmail.com> wrote:
> On 10/6/2015 4:40 PM, Paul B Mahol wrote:
>> On 10/6/15, James Almer <jamrial at gmail.com> wrote:
>>> Since AVFrame.extended_data is apparently not padded, simd functions
>>> could in some cases overread, so make the decoder use a temp buffer
>>> unconditionally.
>>>
>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>> ---
>>> libavcodec/alac.c | 18 +++++-------------
>>> 1 file changed, 5 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/libavcodec/alac.c b/libavcodec/alac.c
>>> index 146668e..394bd19 100644
>>> --- a/libavcodec/alac.c
>>> +++ b/libavcodec/alac.c
>>> @@ -80,7 +80,6 @@ typedef struct ALACContext {
>>> int extra_bits; /**< number of extra bits beyond 16-bit */
>>> int nb_samples; /**< number of samples in the current frame */
>>>
>>> - int direct_output;
>>> int extra_bit_bug;
>>>
>>> ALACDSPContext dsp;
>>> @@ -278,10 +277,6 @@ static int decode_element(AVCodecContext *avctx,
>>> AVFrame *frame, int ch_index,
>>> return AVERROR_INVALIDDATA;
>>> }
>>> alac->nb_samples = output_samples;
>>> - if (alac->direct_output) {
>>> - for (ch = 0; ch < channels; ch++)
>>> - alac->output_samples_buffer[ch] = (int32_t
>>> *)frame->extended_data[ch_index + ch];
>>> - }
>>>
>>> if (is_compressed) {
>>> int16_t lpc_coefs[2][32];
>>> @@ -393,8 +388,9 @@ static int decode_element(AVCodecContext *avctx,
>>> AVFrame
>>> *frame, int ch_index,
>>> break;
>>> case 24: {
>>> for (ch = 0; ch < channels; ch++) {
>>> + int32_t *outbuffer = (int32_t
>>> *)frame->extended_data[ch_index +
>>> ch];
>>> for (i = 0; i < alac->nb_samples; i++)
>>> - alac->output_samples_buffer[ch][i] <<= 8;
>>> + *outbuffer++ = alac->output_samples_buffer[ch][i] << 8;
>>> }}
>>> break;
>>> }
>>> @@ -468,8 +464,7 @@ static av_cold int alac_decode_close(AVCodecContext
>>> *avctx)
>>> int ch;
>>> for (ch = 0; ch < FFMIN(alac->channels, 2); ch++) {
>>> av_freep(&alac->predict_error_buffer[ch]);
>>> - if (!alac->direct_output)
>>> - av_freep(&alac->output_samples_buffer[ch]);
>>> + av_freep(&alac->output_samples_buffer[ch]);
>>> av_freep(&alac->extra_bits_buffer[ch]);
>>> }
>>>
>>> @@ -491,11 +486,8 @@ static int allocate_buffers(ALACContext *alac)
>>> FF_ALLOC_OR_GOTO(alac->avctx, alac->predict_error_buffer[ch],
>>> buf_size, buf_alloc_fail);
>>>
>>> - alac->direct_output = alac->sample_size > 16;
>>> - if (!alac->direct_output) {
>>> - FF_ALLOC_OR_GOTO(alac->avctx,
>>> alac->output_samples_buffer[ch],
>>> - buf_size, buf_alloc_fail);
>>> - }
>>> + FF_ALLOC_OR_GOTO(alac->avctx, alac->output_samples_buffer[ch],
>>> + buf_size, buf_alloc_fail);
>>>
>>> FF_ALLOC_OR_GOTO(alac->avctx, alac->extra_bits_buffer[ch],
>>> buf_size, buf_alloc_fail);
>>> --
>>> 2.5.2
>>>
>>> _______________________________________________
>>> ffmpeg-devel mailing list
>>> ffmpeg-devel at ffmpeg.org
>>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>
>>
>> it should be padded and not introduce slowdown
>
> If you mean the temp buffers, they will be padded alongside the simd
> functions
> once i commit them.
> But If you mean the avframe.extended_data buffer, could you take care of
> that?
> I'm not familiar enough with avframe to change the relevant alloc functions.
>
> running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -f null -"
> (implicit
> pcm_s16le output)
>
> Before
> real 0m0.596s
> user 0m0.000s
> sys 0m0.000s
>
> After
> real 0m0.575s
> user 0m0.000s
> sys 0m0.000s
>
>
> running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -c:a pcm_s24le -f
> null -"
>
> Before
> real 0m0.618s
> user 0m0.000s
> sys 0m0.000s
>
> After
> real 0m0.618s
> user 0m0.000s
> sys 0m0.000s
>
> With a ~1 minute 24 bit 44.1kh stereo sample. Curious that it's faster when
> the
> output is s16.
> You'll probably have to do the same for the tak decoder before you commit
> your
> decorrelate simd patch, btw. It also uses avframe.extended_data buffer
> directly
> for 24bit samples.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
you set aligned number of samples before calling get_buffer and after
that changes
frame->nb_samples to actual number of samples.
Alternatively IIRC default 16 byte alignment could be increased.
More information about the ffmpeg-devel
mailing list