[FFmpeg-devel] [PATCH+RFC] AVFrame for audio

Sun Oct 24 02:21:43 CEST 2010

On Sun, Oct 17, 2010 at 05:22:54PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> 
> > On Sat, Oct 16, 2010 at 04:12:26PM -0400, Justin Ruggles wrote:
> >> Michael Niedermayer wrote:
> >>
> >>> On Fri, Oct 15, 2010 at 06:35:01PM -0400, Justin Ruggles wrote:
> >>>> Justin Ruggles wrote:
> >>>>
> >>>>> Michael Niedermayer wrote:
> >>>>>
> >>>>>> On Wed, Oct 13, 2010 at 07:52:12PM -0400, Justin Ruggles wrote:
> >>>>>>> Michael Niedermayer wrote:
> >>>>>>>
> >>>>>>>> On Wed, Oct 06, 2010 at 11:05:26AM -0400, Justin Ruggles wrote:
> >>>>>>>>> Michael Niedermayer wrote:
> >>>>>>>>>
> >>>>>>>>>> On Tue, Oct 05, 2010 at 04:55:12PM -0400, Justin Ruggles wrote:
> >>>>>>>>>>> Michael Niedermayer wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Sep 29, 2010 at 09:20:04PM -0400, Justin Ruggles wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Peter Ross wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Sep 02, 2010 at 07:11:37PM -0400, Justin Ruggles wrote:
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>> [...]
> >>>>>>>>>>>> [...]
> >>>>>>>>>>>>> @@ -644,29 +677,49 @@
> >>>>>>>>>>>>>  }
> >>>>>>>>>>>>>  #endif
> >>>>>>>>>>>>>  
> >>>>>>>>>>>>> +#if LIBAVCODEC_VERSION_MAJOR < 53
> >>>>>>>>>>>>>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
> >>>>>>>>>>>>>                           int *frame_size_ptr,
> >>>>>>>>>>>>>                           AVPacket *avpkt)
> >>>>>>>>>>>>>  {
> >>>>>>>>>>>>> +    AVFrame frame;
> >>>>>>>>>>>>> +    int ret, got_frame = 0;
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +    avcodec_get_frame_defaults(&frame);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +    ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +    if (ret >= 0 && got_frame) {
> >>>>>>>>>>>>> +        *frame_size_ptr = frame.nb_samples * avctx->channels *
> >>>>>>>>>>>>> +                          (av_get_bits_per_sample_format(avctx->sample_fmt) / 8);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +        /* ensure data will fit in the output buffer */
> >>>>>>>>>>>>> +        if (*frame_size_ptr > AVCODEC_MAX_AUDIO_FRAME_SIZE) {
> >>>>>>>>>>>>> +            av_log(avctx, AV_LOG_WARNING, "avcodec_decode_audio3 samples "
> >>>>>>>>>>>>> +                   "truncated to AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
> >>>>>>>>>>>>> +            *frame_size_ptr = AVCODEC_MAX_AUDIO_FRAME_SIZE;
> >>>>>>>>>>>>> +        }
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +        memcpy(samples, frame.data[0], *frame_size_ptr);
> >>>>>>>>>>>>  the default get_buffer() should return the appropriate
> >>>>>>>>>>>> buffer for this case.
> >>>>>>>>>>> I'm sorry, I don't understand your comment.
> >>>>>>>>>> i mean (non functional psseudocode below to explain the idea)
> >>>>>>>>>> avcodec_decode_audio3(){
> >>>>>>>>>>     avctx->foobar= samples;
> >>>>>>>>>>     ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
> >>>>>>>>>>     ...
> >>>>>>>>>>     assert(samples == frame.data[0]);
> >>>>>>>>>>
> >>>>>>>>>> -----
> >>>>>>>>>> default_get_buffer(){
> >>>>>>>>>>     if(avctx->foobar)
> >>>>>>>>>>         return avctx->foobar;
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> (and yes this cannot work for all theoretical decoders)
> >>>>>>>>> I think I get it.  So avctx->foobar would be an optional user-supplied
> >>>>>>>>> buffer (avctx->user_buffer?) that default_get_buffer() would return if
> >>>>>>>>> it is non-NULL, right?
> >>>>>>>> yes
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> The problem I see is this:
> >>>>>>>>> avcodec_decode_audio3() would use avcodec_decode_audio4().
> >>>>>>>>> avcodec_decode_audio4() allocates as large a buffer as is needed through
> >>>>>>>>> get_buffer(), but the avcodec_decode_audio3() API only requires the
> >>>>>>>>> user-supplied buffer to be AVCODEC_MAX_AUDIO_FRAME_SIZE.  Couldn't this
> >>>>>>>>> lead to the decoder writing past the end of a user-supplied buffer if it
> >>>>>>>>> isn't large enough?  I guess we could also add a field
> >>>>>>>>> avctx->user_buffer_size?
> >>>>>>>> yes, of course
> >>>>>>>> it was just a rough idea ...
> >>>>>>> I'm running into some questions trying to implement the rough idea.  The
> >>>>>>> only way I can see this working smoothly is if avcodec_decode_audio3()
> >>>>>>> always sets get/release_buffer to default.  Also, either all audio
> >>>>>>> decoders will have to support CODEC_CAP_DR1 (i.e. they always use
> >>>>>>> get/release_buffer) or there needs to be a fallback that will memcpy
> >>>>>>> into the user buffer if CODEC_CAP_DR1 is not supported.
> >>>>>> old API decoders surely dont need to copy with old API.
> >>>>>> old API decoders surely dont need to copy with new API if the api can provide
> >>>>>> a buffer to the decoder (this can be through function argument like its done
> >>>>>> currently)
> >>>>>> new API decoders surely dont need to copy with new API because otherwise the
> >>>>>> API sucks and needs more work
> >>>>>> whats left is new API decoders and used with old API and for this get_buffer()
> >>>>>> should return the user supplied buffer if its large enough and fail if its not
> >>>>>> large enough.
> >>>>>> The case where the user overrides get_buffer() and supplies a user specified
> >>>>>> buffer which its own code doesnt use is a case that id consider user error.
> >>>>> I think I might have been misinterpreting the API.  For video decoders,
> >>>>> what does it mean as far as buffer allocation when CODEC_CAP_DR1 is not set?
> >>>> So I think I have this worked out and I don't see how we can avoid a
> >>>> memcpy with the old API when CODEC_CAP_DR1 is not set.  There would be
> >>>> no other way to get the data into the correct output buffer.
> >>>>
> >>>> other questions:
> >>>>
> >>>> 1. Should AVCodecContext.user_buffer be supported for video decoders?
> >>> possible but this is seperate, lets not entangle this patch with too many
> >>> other changes
> >>>
> >>>
> >>>> If so, should it be user_buffer[4] and user_buffer_size[4]?
> >>> possible
> >>>
> >>>
> >>>
> >>>> 2. avcodec_default_get_buffer() supports allocating multiple internal
> >>>> buffers.  How should that be handled if the buffer is supplied by the
> >>>> user?  Don't support multiple buffers?  Use the user-supplied buffer
> >>>> just for the first one?
> >>> there are buffer hints (grep for hint in avcodec.h) that specify if a buffer
> >>> will be reused/read/preserved/blah
> >>> the user supplied buffer is likely just valid for this call and cannot be used
> >>> for some cases of the hints. For what remains using the buffer on the first
> >>> call only seems ok
> >> I think I've implemented it in a way that will work even when the
> >> various buffer hints are set.  This implementation will not use memcpy
> >> in avcodec_decode_audio3() in the most common case of the decoder
> >> supporting CODEC_CAP_DR1, only needing 1 buffer, and not needing a
> >> buffer larger than AVCODEC_MAX_AUDIO_FRAME_SIZE.
> >>
> > 
> >> One thing I'm unsure of is whether to truncate output if it is too large
> >> for avcodec_decode_audio3() (which is done in this patch) or to return
> >> an error instead.
> > 
> > I think its better to tell the user straight through an error that there is a
> > problem instead of generating output that contains randomly truncated packets
> 
> Ok. New patch.
> 
> -Justin
> 

>  doc/APIchanges       |    9 +++
>  libavcodec/avcodec.h |  117 ++++++++++++++++++++++++++++++++++++++++++--
>  libavcodec/pcm.c     |   41 ++++++++++++---
>  libavcodec/utils.c   |  134 ++++++++++++++++++++++++++++++++++++++++++++-------
>  4 files changed, 270 insertions(+), 31 deletions(-)
> 4f0b691de15d4dd07d105eee23ebda4415e31652  avcodec_decode_audio4.patch
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 4155d32..a39d9fd 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -13,6 +13,15 @@ libavutil:   2009-03-08
>  
>  API changes, most recent first:
>  
> +2010-XX-XX - rXXXXX - lavc 52.92.0 - AVFrame and avcodec_decode_audio
> +  Add nb_samples field to AVFrame.
> +  Add user_buffer, user_buffer_size, and user_buffer_in_use fields to AVCodecContext.
> +  Deprecate AVCODEC_MAX_AUDIO_FRAME_SIZE.
> +  Deprecate avcodec_decode_audio3() in favor of avcodec_decode_audio4().
> +  avcodec_decode_audio4() writes output samples to an AVFrame, which gives the
> +  audio decoders the ability to use get/release/reget_buffer() to get an
> +  output buffer.
> +
>  2010-10-10 - r25441 - lavfi 1.49.0 - AVFilterLink.time_base
>    Add time_base field to AVFilterLink.
>  
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 4bddbaa..9196cda 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -31,7 +31,7 @@
>  #include "libavutil/cpu.h"
>  
>  #define LIBAVCODEC_VERSION_MAJOR 52
> -#define LIBAVCODEC_VERSION_MINOR 92
> +#define LIBAVCODEC_VERSION_MINOR 93
>  #define LIBAVCODEC_VERSION_MICRO  0
>  
>  #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
> @@ -467,8 +467,10 @@ enum SampleFormat {
>                                            CH_FRONT_LEFT_OF_CENTER|CH_FRONT_RIGHT_OF_CENTER)
>  #define CH_LAYOUT_STEREO_DOWNMIX    (CH_STEREO_LEFT|CH_STEREO_RIGHT)
>  
> +#if FF_API_AUDIO_OLD
>  /* in bytes */
>  #define AVCODEC_MAX_AUDIO_FRAME_SIZE 192000 // 1 second of 48khz 32bit audio
> +#endif
>  
>  /**
>   * Required number of additionally allocated bytes at the end of the input bitstream for decoding.
> @@ -988,7 +990,13 @@ typedef struct AVPanScan{
>       * - decoding: Set by libavcodec\
>       */\
>      void *hwaccel_picture_private;\
> -
> +\
> +    /**\
> +     * number of audio samples (per channel) described by this frame\
> +     * - encoding: Set by user.\
> +     * - decoding: Set by libavcodec.\
> +     */\
> +    int nb_samples;\
>  
>  #define FF_QSCALE_TYPE_MPEG1 0
>  #define FF_QSCALE_TYPE_MPEG2 1
> @@ -2744,6 +2752,48 @@ typedef struct AVCodecContext {
>       * - decoding: unused
>       */
>      int lpc_passes;
> +
> +    /**
> +     * User-allocated audio decoder output buffer & buffer size
> +     * If user_buffer is non-NULL and is large enough,
> +     * avcodec_default_get_buffer() may user it as an internal buffer instead
> +     * of allocating its own. This only works with decoders that support
> +     * CODEC_CAP_DR1. If user_buffer is currently in use by libavcodec,
> +     * user_buffer_in_use will be set to non-zero.
> +     *
> +     * @note The user may not use this field when using avcodec_decode_audio3()
> +     *       or avcodec_decode_audio2().
> +     *
> +     * - encoding: unused
> +     * - decoding: Set/allocated/freed by user
> +     */
> +    uint8_t *user_buffer;
> +
> +    /**
> +     * Size, in bytes, of AVCodecContext.user_buffer.
> +     *
> +     * @note The user may not use this field when using avcodec_decode_audio3()
> +     *       or avcodec_decode_audio2().
> +     *
> +     * - encoding: unused
> +     * - decoding: Set by user
> +     */
> +    int user_buffer_size;
> +
> +    /**
> +     * Indicates whether AVCodecContext.user_buffer is currently being used
> +     * by libavcodec.
> +     * If non-zero, the user may neither free nor modify the data in
> +     * AVCodecContext.user_buffer.
> +     *
> +     * @note When using avcodec_decode_audio3() or avcodec_decode_audio2(),
> +     *       the user may neither free nor modify the data in
> +     *       AVCodecContext.user_buffer no matter what value this field has.
> +     *
> +     * - encoding: unused
> +     * - decoding: Set by libavcodec
> +     */
> +    int user_buffer_in_use;
>  } AVCodecContext;
>  
>  /**

> @@ -2763,7 +2813,7 @@ typedef struct AVCodec {
>      int (*init)(AVCodecContext *);
>      int (*encode)(AVCodecContext *, uint8_t *buf, int buf_size, void *data);
>      int (*close)(AVCodecContext *);
> -    int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);
> +    int (*decode)(AVCodecContext *, void *outdata, int *got_output_ptr, AVPacket *avpkt);
>      /**
>       * Codec capabilities.
>       * see CODEC_CAP_*

cosmetic

[...]
>  int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>      int i;
>      int w= s->width;
>      int h= s->height;
> +    int is_video = (s->codec_type == AVMEDIA_TYPE_VIDEO);
>      InternalBuffer *buf;
>      int *picture_number;
>  
> @@ -235,7 +258,8 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>          return -1;
>      }
>  
> -    if(av_image_check_size(w, h, 0, s))
> +    if(( is_video && av_image_check_size(w, h, 0, s)) ||
> +       (!is_video && audio_check_size(s->channels, pic->nb_samples, s->sample_fmt)))
>          return -1;
>  
>      if(s->internal_buffer==NULL){
> @@ -253,17 +277,30 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>      picture_number= &(((InternalBuffer*)s->internal_buffer)[INTERNAL_BUFFER_SIZE]).last_pic_num; //FIXME ugly hack
>      (*picture_number)++;
>  
> -    if(buf->base[0] && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
> +    if (buf->base[0]) {
> +        if (is_video && (buf->width != w || buf->height != h || buf->pix_fmt != s->pix_fmt)){
>          for(i=0; i<4; i++){
>              av_freep(&buf->base[i]);
>              buf->data[i]= NULL;
>          }
> +        } else if (!is_video && (buf->channels   != s->channels     ||
> +                                 buf->nb_samples != pic->nb_samples ||
> +                                 buf->sample_fmt != s->sample_fmt)) {
> +            if (buf->base[0] == s->user_buffer) {
> +                s->user_buffer_in_use = 0;
> +                buf->base[0] = NULL;
> +            } else {
> +                av_freep(&buf->base[0]);
> +            }
> +            buf->data[0] = NULL;
> +        }
>      }
>  
>      if(buf->base[0]){
>          pic->age= *picture_number - buf->last_pic_num;
>          buf->last_pic_num= *picture_number;
>      }else{
> +        if (is_video) {
>          int h_chroma_shift, v_chroma_shift;
>          int size[4] = {0};
>          int tmpsize;
> @@ -327,6 +364,28 @@ int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic){
>          buf->height = s->height;
>          buf->pix_fmt= s->pix_fmt;
>          pic->age= 256*256*256*64;
> +        } else { /* audio */
> +            int buf_size;
> +
> +            buf->last_pic_num = -256*256*256*64;
> +
> +            buf_size = pic->nb_samples * s->channels *
> +                       (av_get_bits_per_sample_format(s->sample_fmt) / 8);
> +
> +            if (s->user_buffer && !s->user_buffer_in_use && s->user_buffer_size >= buf_size) {
> +                buf->base[0] = s->user_buffer;
> +                s->user_buffer_in_use = 1;
> +            } else {
> +                buf->base[0] = av_mallocz(buf_size);
> +                if (!buf->base[0])
> +                    return AVERROR(ENOMEM);
> +            }
> +
> +            buf->data[0]    = buf->base[0];
> +            buf->channels   = s->channels;
> +            buf->nb_samples = pic->nb_samples;
> +            buf->sample_fmt = s->sample_fmt;
> +        }
>      }
>      pic->type= FF_BUFFER_TYPE_INTERNAL;
>

> @@ -360,9 +419,15 @@ void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic){
>      }
>      assert(i < s->internal_buffer_count);
>      s->internal_buffer_count--;
> +    if (buf->base[0] == s->user_buffer) {
> +        assert(s->user_buffer_in_use);
> +        s->user_buffer_in_use = 0;
> +        buf->base[0] = NULL;
> +    } else {
>      last = &((InternalBuffer*)s->internal_buffer)[s->internal_buffer_count];
>  
>      FFSWAP(InternalBuffer, *buf, *last);
> +    }
>  
>      for(i=0; i<4; i++){
>          pic->data[i]=NULL;

i dont see how this could work
the buffer used and returned by the previous decode() is put in a que by the
user app and user_buffer is set to a new buffer.
also you appear to end up calling av_free()
on user supplied buffers

> @@ -401,9 +466,14 @@ int avcodec_default_reget_buffer(AVCodecContext *s, AVFrame *pic){
>      /* Allocate new frame */
>      if (s->get_buffer(s, pic))
>          return -1;
> -    /* Copy image data from old buffer to new buffer */
> +    /* Copy frame data from old buffer to new buffer */
> +    if (s->codec_type == AVMEDIA_TYPE_VIDEO) {
>      av_picture_copy((AVPicture*)pic, (AVPicture*)&temp_pic, s->pix_fmt, s->width,
>               s->height);
> +    } else if (s->codec_type == AVMEDIA_TYPE_AUDIO) {
> +        memcpy(pic->data[0], temp_pic.data[0], s->channels * pic->nb_samples *
> +               (av_get_bits_per_sample_format(s->sample_fmt) / 8));
> +    }
>      s->release_buffer(s, &temp_pic); // Release old frame
>      return 0;
>  }
> @@ -498,7 +568,6 @@ int attribute_align_arg avcodec_open(AVCodecContext *avctx, AVCodec *codec)
>          avcodec_set_dimensions(avctx, 0, 0);
>      }
>  
> -#define SANE_NB_CHANNELS 128U
>      if (avctx->channels > SANE_NB_CHANNELS) {
>          ret = AVERROR(EINVAL);
>          goto free_and_end;
> @@ -642,31 +711,60 @@ int attribute_align_arg avcodec_decode_audio2(AVCodecContext *avctx, int16_t *sa
>  
>      return avcodec_decode_audio3(avctx, samples, frame_size_ptr, &avpkt);
>  }
> -#endif
>  
>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>                           int *frame_size_ptr,
>                           AVPacket *avpkt)
>  {
> -    int ret;
> +    AVFrame frame;
> +    int ret, got_frame = 0;
>  
> -    if((avctx->codec->capabilities & CODEC_CAP_DELAY) || avpkt->size){
> -        //FIXME remove the check below _after_ ensuring that all audio check that the available space is enough
> -        if(*frame_size_ptr < AVCODEC_MAX_AUDIO_FRAME_SIZE){
> -            av_log(avctx, AV_LOG_ERROR, "buffer smaller than AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
> -            return -1;
> -        }
> -        if(*frame_size_ptr < FF_MIN_BUFFER_SIZE ||
> -        *frame_size_ptr < avctx->channels * avctx->frame_size * sizeof(int16_t)){
> -            av_log(avctx, AV_LOG_ERROR, "buffer %d too small\n", *frame_size_ptr);
> -            return -1;

> +    avcodec_get_frame_defaults(&frame);

is this needed?
if yes what speed effect does it have with small frames

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I know you won't believe me, but the highest form of Human Excellence is
to question oneself and others. -- Socrates
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101024/3487a5a4/attachment.pgp>