[FFmpeg-devel] [PATCH] avcodec/alac: also use a temp buffer for 24bit samples
James Almer
jamrial at gmail.com
Tue Oct 6 22:17:43 CEST 2015
On 10/6/2015 4:40 PM, Paul B Mahol wrote:
> On 10/6/15, James Almer <jamrial at gmail.com> wrote:
>> Since AVFrame.extended_data is apparently not padded, simd functions
>> could in some cases overread, so make the decoder use a temp buffer
>> unconditionally.
>>
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>> libavcodec/alac.c | 18 +++++-------------
>> 1 file changed, 5 insertions(+), 13 deletions(-)
>>
>> diff --git a/libavcodec/alac.c b/libavcodec/alac.c
>> index 146668e..394bd19 100644
>> --- a/libavcodec/alac.c
>> +++ b/libavcodec/alac.c
>> @@ -80,7 +80,6 @@ typedef struct ALACContext {
>> int extra_bits; /**< number of extra bits beyond 16-bit */
>> int nb_samples; /**< number of samples in the current frame */
>>
>> - int direct_output;
>> int extra_bit_bug;
>>
>> ALACDSPContext dsp;
>> @@ -278,10 +277,6 @@ static int decode_element(AVCodecContext *avctx,
>> AVFrame *frame, int ch_index,
>> return AVERROR_INVALIDDATA;
>> }
>> alac->nb_samples = output_samples;
>> - if (alac->direct_output) {
>> - for (ch = 0; ch < channels; ch++)
>> - alac->output_samples_buffer[ch] = (int32_t
>> *)frame->extended_data[ch_index + ch];
>> - }
>>
>> if (is_compressed) {
>> int16_t lpc_coefs[2][32];
>> @@ -393,8 +388,9 @@ static int decode_element(AVCodecContext *avctx, AVFrame
>> *frame, int ch_index,
>> break;
>> case 24: {
>> for (ch = 0; ch < channels; ch++) {
>> + int32_t *outbuffer = (int32_t *)frame->extended_data[ch_index +
>> ch];
>> for (i = 0; i < alac->nb_samples; i++)
>> - alac->output_samples_buffer[ch][i] <<= 8;
>> + *outbuffer++ = alac->output_samples_buffer[ch][i] << 8;
>> }}
>> break;
>> }
>> @@ -468,8 +464,7 @@ static av_cold int alac_decode_close(AVCodecContext
>> *avctx)
>> int ch;
>> for (ch = 0; ch < FFMIN(alac->channels, 2); ch++) {
>> av_freep(&alac->predict_error_buffer[ch]);
>> - if (!alac->direct_output)
>> - av_freep(&alac->output_samples_buffer[ch]);
>> + av_freep(&alac->output_samples_buffer[ch]);
>> av_freep(&alac->extra_bits_buffer[ch]);
>> }
>>
>> @@ -491,11 +486,8 @@ static int allocate_buffers(ALACContext *alac)
>> FF_ALLOC_OR_GOTO(alac->avctx, alac->predict_error_buffer[ch],
>> buf_size, buf_alloc_fail);
>>
>> - alac->direct_output = alac->sample_size > 16;
>> - if (!alac->direct_output) {
>> - FF_ALLOC_OR_GOTO(alac->avctx, alac->output_samples_buffer[ch],
>> - buf_size, buf_alloc_fail);
>> - }
>> + FF_ALLOC_OR_GOTO(alac->avctx, alac->output_samples_buffer[ch],
>> + buf_size, buf_alloc_fail);
>>
>> FF_ALLOC_OR_GOTO(alac->avctx, alac->extra_bits_buffer[ch],
>> buf_size, buf_alloc_fail);
>> --
>> 2.5.2
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>
> it should be padded and not introduce slowdown
If you mean the temp buffers, they will be padded alongside the simd functions
once i commit them.
But If you mean the avframe.extended_data buffer, could you take care of that?
I'm not familiar enough with avframe to change the relevant alloc functions.
running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -f null -" (implicit
pcm_s16le output)
Before
real 0m0.596s
user 0m0.000s
sys 0m0.000s
After
real 0m0.575s
user 0m0.000s
sys 0m0.000s
running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -c:a pcm_s24le -f null -"
Before
real 0m0.618s
user 0m0.000s
sys 0m0.000s
After
real 0m0.618s
user 0m0.000s
sys 0m0.000s
With a ~1 minute 24 bit 44.1kh stereo sample. Curious that it's faster when the
output is s16.
You'll probably have to do the same for the tak decoder before you commit your
decorrelate simd patch, btw. It also uses avframe.extended_data buffer directly
for 24bit samples.
More information about the ffmpeg-devel
mailing list