[FFmpeg-devel] [PATCH+RFC] AVFrame for audio

Wed Oct 27 03:31:13 CEST 2010

Michael Niedermayer wrote:

> On Sun, Oct 17, 2010 at 05:22:54PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>
>>> On Sat, Oct 16, 2010 at 04:12:26PM -0400, Justin Ruggles wrote:
>>>> Michael Niedermayer wrote:
>>>>
>>>>> On Fri, Oct 15, 2010 at 06:35:01PM -0400, Justin Ruggles wrote:
>>>>>> Justin Ruggles wrote:
>>>>>>
>>>>>>> Michael Niedermayer wrote:
>>>>>>>
>>>>>>>> On Wed, Oct 13, 2010 at 07:52:12PM -0400, Justin Ruggles wrote:
>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Oct 06, 2010 at 11:05:26AM -0400, Justin Ruggles wrote:
>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 05, 2010 at 04:55:12PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 29, 2010 at 09:20:04PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Peter Ross wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Sep 02, 2010 at 07:11:37PM -0400, Justin Ruggles wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>> [...]
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>> @@ -644,29 +677,49 @@
>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> +#if LIBAVCODEC_VERSION_MAJOR < 53
>>>>>>>>>>>>>>>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>>>>>>>>>>>>>>>                           int *frame_size_ptr,
>>>>>>>>>>>>>>>                           AVPacket *avpkt)
>>>>>>>>>>>>>>>  {
>>>>>>>>>>>>>>> +    AVFrame frame;
>>>>>>>>>>>>>>> +    int ret, got_frame = 0;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +    avcodec_get_frame_defaults(&frame);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +    ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +    if (ret >= 0 && got_frame) {
>>>>>>>>>>>>>>> +        *frame_size_ptr = frame.nb_samples * avctx->channels *
>>>>>>>>>>>>>>> +                          (av_get_bits_per_sample_format(avctx->sample_fmt) / 8);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +        /* ensure data will fit in the output buffer */
>>>>>>>>>>>>>>> +        if (*frame_size_ptr > AVCODEC_MAX_AUDIO_FRAME_SIZE) {
>>>>>>>>>>>>>>> +            av_log(avctx, AV_LOG_WARNING, "avcodec_decode_audio3 samples "
>>>>>>>>>>>>>>> +                   "truncated to AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
>>>>>>>>>>>>>>> +            *frame_size_ptr = AVCODEC_MAX_AUDIO_FRAME_SIZE;
>>>>>>>>>>>>>>> +        }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +        memcpy(samples, frame.data[0], *frame_size_ptr);
>>>>>>>>>>>>>>  the default get_buffer() should return the appropriate
>>>>>>>>>>>>>> buffer for this case.
>>>>>>>>>>>>> I'm sorry, I don't understand your comment.
>>>>>>>>>>>> i mean (non functional psseudocode below to explain the idea)
>>>>>>>>>>>> avcodec_decode_audio3(){
>>>>>>>>>>>>     avctx->foobar= samples;
>>>>>>>>>>>>     ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>>>>>>>>>>>>     ...
>>>>>>>>>>>>     assert(samples == frame.data[0]);
>>>>>>>>>>>>
>>>>>>>>>>>> -----
>>>>>>>>>>>> default_get_buffer(){
>>>>>>>>>>>>     if(avctx->foobar)
>>>>>>>>>>>>         return avctx->foobar;
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> (and yes this cannot work for all theoretical decoders)
>>>>>>>>>>> I think I get it.  So avctx->foobar would be an optional user-supplied
>>>>>>>>>>> buffer (avctx->user_buffer?) that default_get_buffer() would return if
>>>>>>>>>>> it is non-NULL, right?
>>>>>>>>>> yes
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The problem I see is this:
>>>>>>>>>>> avcodec_decode_audio3() would use avcodec_decode_audio4().
>>>>>>>>>>> avcodec_decode_audio4() allocates as large a buffer as is needed through
>>>>>>>>>>> get_buffer(), but the avcodec_decode_audio3() API only requires the
>>>>>>>>>>> user-supplied buffer to be AVCODEC_MAX_AUDIO_FRAME_SIZE.  Couldn't this
>>>>>>>>>>> lead to the decoder writing past the end of a user-supplied buffer if it
>>>>>>>>>>> isn't large enough?  I guess we could also add a field
>>>>>>>>>>> avctx->user_buffer_size?
>>>>>>>>>> yes, of course
>>>>>>>>>> it was just a rough idea ...
>>>>>>>>> I'm running into some questions trying to implement the rough idea.  The
>>>>>>>>> only way I can see this working smoothly is if avcodec_decode_audio3()
>>>>>>>>> always sets get/release_buffer to default.  Also, either all audio
>>>>>>>>> decoders will have to support CODEC_CAP_DR1 (i.e. they always use
>>>>>>>>> get/release_buffer) or there needs to be a fallback that will memcpy
>>>>>>>>> into the user buffer if CODEC_CAP_DR1 is not supported.
>>>>>>>> old API decoders surely dont need to copy with old API.
>>>>>>>> old API decoders surely dont need to copy with new API if the api can provide
>>>>>>>> a buffer to the decoder (this can be through function argument like its done
>>>>>>>> currently)
>>>>>>>> new API decoders surely dont need to copy with new API because otherwise the
>>>>>>>> API sucks and needs more work
>>>>>>>> whats left is new API decoders and used with old API and for this get_buffer()
>>>>>>>> should return the user supplied buffer if its large enough and fail if its not
>>>>>>>> large enough.
>>>>>>>> The case where the user overrides get_buffer() and supplies a user specified
>>>>>>>> buffer which its own code doesnt use is a case that id consider user error.
>>>>>>> I think I might have been misinterpreting the API.  For video decoders,
>>>>>>> what does it mean as far as buffer allocation when CODEC_CAP_DR1 is not set?
>>>>>> So I think I have this worked out and I don't see how we can avoid a
>>>>>> memcpy with the old API when CODEC_CAP_DR1 is not set.  There would be
>>>>>> no other way to get the data into the correct output buffer.
>>>>>>
>>>>>> other questions:
>>>>>>
>>>>>> 1. Should AVCodecContext.user_buffer be supported for video decoders?
>>>>> possible but this is seperate, lets not entangle this patch with too many
>>>>> other changes
>>>>>
>>>>>
>>>>>> If so, should it be user_buffer[4] and user_buffer_size[4]?
>>>>> possible
>>>>>
>>>>>
>>>>>
>>>>>> 2. avcodec_default_get_buffer() supports allocating multiple internal
>>>>>> buffers.  How should that be handled if the buffer is supplied by the
>>>>>> user?  Don't support multiple buffers?  Use the user-supplied buffer
>>>>>> just for the first one?
>>>>> there are buffer hints (grep for hint in avcodec.h) that specify if a buffer
>>>>> will be reused/read/preserved/blah
>>>>> the user supplied buffer is likely just valid for this call and cannot be used
>>>>> for some cases of the hints. For what remains using the buffer on the first
>>>>> call only seems ok
>>>> I think I've implemented it in a way that will work even when the
>>>> various buffer hints are set.  This implementation will not use memcpy
>>>> in avcodec_decode_audio3() in the most common case of the decoder
>>>> supporting CODEC_CAP_DR1, only needing 1 buffer, and not needing a
>>>> buffer larger than AVCODEC_MAX_AUDIO_FRAME_SIZE.
>>>>
>>>> One thing I'm unsure of is whether to truncate output if it is too large
>>>> for avcodec_decode_audio3() (which is done in this patch) or to return
>>>> an error instead.
>>> I think its better to tell the user straight through an error that there is a
>>> problem instead of generating output that contains randomly truncated packets
>> Ok. New patch.
>>
>> -Justin

[...]
>> @@ -2763,7 +2813,7 @@ typedef struct AVCodec {
>>      int (*init)(AVCodecContext *);
>>      int (*encode)(AVCodecContext *, uint8_t *buf, int buf_size, void *data);
>>      int (*close)(AVCodecContext *);
>> -    int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);
>> +    int (*decode)(AVCodecContext *, void *outdata, int *got_output_ptr, AVPacket *avpkt);
>>      /**
>>       * Codec capabilities.
>>       * see CODEC_CAP_*
> 
> cosmetic

yeah yeah. I do want to change it though. :)  The size won't be needed
by anything after the audio API is changed.  And I still don't know why
the video decoders set it to sizeof(AVFrame) or sizeof(AVPicture).

> 
> [...]
>>  int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>      int i;
>>      int w= s->width;
>>      int h= s->height;
>> +    int is_video = (s->codec_type == AVMEDIA_TYPE_VIDEO);
>>      InternalBuffer *buf;
>>      int *picture_number;
>>  
>> @@ -235,7 +258,8 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>          return -1;
>>      }
>>  
>> -    if(av_image_check_size(w, h, 0, s))
>> +    if(( is_video && av_image_check_size(w, h, 0, s)) ||
>> +       (!is_video && audio_check_size(s->channels, pic->nb_samples, s->sample_fmt)))
>>          return -1;
>>  
>>      if(s->internal_buffer==NULL){
>> @@ -253,17 +277,30 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>      picture_number= &(((InternalBuffer*)s->internal_buffer)[INTERNAL_BUFFER_SIZE]).last_pic_num; //FIXME ugly hack
>>      (*picture_number)++;
>>  
>> -    if(buf->base[0] && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
>> +    if (buf->base[0]) {
>> +        if (is_video && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
>>          for(i=0; i<4; i++){
>>              av_freep(&buf->base[i]);
>>              buf->data[i]= NULL;
>>          }
>> +        } else if (!is_video && (buf->channels   != s->channels     ||
>> +                                 buf->nb_samples != pic->nb_samples ||
>> +                                 buf->sample_fmt != s->sample_fmt)) {
>> +            if (buf->base[0] == s->user_buffer) {
>> +                s->user_buffer_in_use = 0;
>> +                buf->base[0] = NULL;
>> +            } else {
>> +                av_freep(&buf->base[0]);
>> +            }
>> +            buf->data[0] = NULL;
>> +        }
>>      }
>>  
>>      if(buf->base[0]){
>>          pic->age= *picture_number - buf->last_pic_num;
>>          buf->last_pic_num= *picture_number;
>>      }else{
>> +        if (is_video) {
>>          int h_chroma_shift, v_chroma_shift;
>>          int size[4] = {0};
>>          int tmpsize;
>> @@ -327,6 +364,28 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>>          buf->height = s->height;
>>          buf->pix_fmt= s->pix_fmt;
>>          pic->age= 256*256*256*64;
>> +        } else { /* audio */
>> +            int buf_size;
>> +
>> +            buf->last_pic_num = -256*256*256*64;
>> +
>> +            buf_size = pic->nb_samples * s->channels *
>> +                       (av_get_bits_per_sample_format(s->sample_fmt) / 8);
>> +
>> +            if (s->user_buffer && !s->user_buffer_in_use && s->user_buffer_size >= buf_size) {
>> +                buf->base[0] = s->user_buffer;
>> +                s->user_buffer_in_use = 1;
>> +            } else {
>> +                buf->base[0] = av_mallocz(buf_size);
>> +                if (!buf->base[0])
>> +                    return AVERROR(ENOMEM);
>> +            }
>> +
>> +            buf->data[0]    = buf->base[0];
>> +            buf->channels   = s->channels;
>> +            buf->nb_samples = pic->nb_samples;
>> +            buf->sample_fmt = s->sample_fmt;
>> +        }
>>      }
>>      pic->type= FF_BUFFER_TYPE_INTERNAL;
>>
> 
>> @@ -360,9 +419,15 @@ void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic){
>>      }
>>      assert(i < s->internal_buffer_count);
>>      s->internal_buffer_count--;
>> +    if (buf->base[0] == s->user_buffer) {
>> +        assert(s->user_buffer_in_use);
>> +        s->user_buffer_in_use = 0;
>> +        buf->base[0] = NULL;
>> +    } else {
>>      last = &((InternalBuffer*)s->internal_buffer)[s->internal_buffer_count];
>>  
>>      FFSWAP(InternalBuffer, *buf, *last);
>> +    }
>>  
>>      for(i=0; i<4; i++){
>>          pic->data[i]=NULL;
> 
> i dont see how this could work
> the buffer used and returned by the previous decode() is put in a que by the
> user app and user_buffer is set to a new buffer.
> also you appear to end up calling av_free()
> on user supplied buffers

Well, I meant to disallow that, but the documentation I put just says
the user cannot free or change the data while user_buffer_in_use is set.
 I didn't consider the user replacing it with a new buffer.  But at any
rate, if that should be allowed, things get more complicated.  I'll need
to add a flag or something to indicate each user-supplied buffer.  I'll
work on it.

> 
>> @@ -401,9 +466,14 @@ int avcodec_default_reget_buffer(AVCodecContext *s, AVFrame *pic){
>>      /* Allocate new frame */
>>      if (s->get_buffer(s, pic))
>>          return -1;
>> -    /* Copy image data from old buffer to new buffer */
>> +    /* Copy frame data from old buffer to new buffer */
>> +    if (s->codec_type == AVMEDIA_TYPE_VIDEO) {
>>      av_picture_copy((AVPicture*)pic, (AVPicture*)&temp_pic, s->pix_fmt, s->width,
>>               s->height);
>> +    } else if (s->codec_type == AVMEDIA_TYPE_AUDIO) {
>> +        memcpy(pic->data[0], temp_pic.data[0], s->channels * pic->nb_samples *
>> +               (av_get_bits_per_sample_format(s->sample_fmt) / 8));
>> +    }
>>      s->release_buffer(s, &temp_pic); // Release old frame
>>      return 0;
>>  }
>> @@ -498,7 +568,6 @@ int attribute_align_arg avcodec_open(AVCodecContext *avctx, AVCodec *codec)
>>          avcodec_set_dimensions(avctx, 0, 0);
>>      }
>>  
>> -#define SANE_NB_CHANNELS 128U
>>      if (avctx->channels > SANE_NB_CHANNELS) {
>>          ret = AVERROR(EINVAL);
>>          goto free_and_end;
>> @@ -642,31 +711,60 @@ int attribute_align_arg avcodec_decode_audio2(AVCodecContext *avctx, int16_t *sa
>>  
>>      return avcodec_decode_audio3(avctx, samples, frame_size_ptr, &avpkt);
>>  }
>> -#endif
>>  
>>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>>                           int *frame_size_ptr,
>>                           AVPacket *avpkt)
>>  {
>> -    int ret;
>> +    AVFrame frame;
>> +    int ret, got_frame = 0;
>>  
>> -    if((avctx->codec->capabilities & CODEC_CAP_DELAY) || avpkt->size){
>> -        //FIXME remove the check below _after_ ensuring that all audio check that the available space is enough
>> -        if(*frame_size_ptr < AVCODEC_MAX_AUDIO_FRAME_SIZE){
>> -            av_log(avctx, AV_LOG_ERROR, "buffer smaller than AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
>> -            return -1;
>> -        }
>> -        if(*frame_size_ptr < FF_MIN_BUFFER_SIZE ||
>> -        *frame_size_ptr < avctx->channels * avctx->frame_size * sizeof(int16_t)){
>> -            av_log(avctx, AV_LOG_ERROR, "buffer %d too small\n", *frame_size_ptr);
>> -            return -1;
> 
>> +    avcodec_get_frame_defaults(&frame);
> 
> is this needed?
> if yes what speed effect does it have with small frames

No, it seems it is not needed.

Thanks,
Justin