[FFmpeg-devel] [PATCH+RFC] AVFrame for audio

Wed Oct 6 17:05:26 CEST 2010

Michael Niedermayer wrote:

> On Tue, Oct 05, 2010 at 04:55:12PM -0400, Justin Ruggles wrote:
>> Michael Niedermayer wrote:
>>
>>> On Wed, Sep 29, 2010 at 09:20:04PM -0400, Justin Ruggles wrote:
>>>> Hi,
>>>>
>>>> Peter Ross wrote:
>>>>
>>>>> On Thu, Sep 02, 2010 at 07:11:37PM -0400, Justin Ruggles wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Peter Ross wrote:
>>>>>>
>>>>>>> To prototype use of audio in AVFrame, I have modified the PCM encoder/decoder
>>>>>>> and addded new public lavc functions.
>>>>>>>
>>>>>>> Existing SAMPLE_FMTs are mapping onto (AVFrame*)->data[0]. The obvious
>>>>>>> next step is to support planar audio, and splitting FF_COMMON_FRAME into
>>>>>>> common, audio- and video-specific parts.
>>>>>>>
>>>>>>> avctx->reget_audio_buffer() is implemened using a simple realloc. It
>>>>>>> probably makes sense to reuse the video InternalBuffer stuff, but i'm not
>>>>>>> sure.
>>>>>> Any progress on this?  I really like the idea.
>>>>> Unfortunately not. If you or anyone else wants to run with this patch, please do.
>>>> Here is a new patch+rfc to only change the audio decoding to behave like
>>>> video decoding.
>>>>
>>>> - adds nb_sample to AVFrame
>>>> - adapts get/reget_buffer to support video or audio media types
>>>> - example implementation for pcm decoder
>>>>
>>>> Once this part is reviewed and agreed upon, I'll work on the user-side
>>>> usage of avcodec_decode_audio4() and the tedious work of changing all
>>>> the audio decoders to the new API.
>>>>
>>>> Planar audio would definitely be nice to add later, but I'm not sure how
>>>> to go about it. maybe something like...
>>>>
>>>> - a new AVFrame field **audio_data
>>> it would be possible to seperate the planes by linesize[0] bytes
>>> iam not saying i prefer it just that it is possible
>> Do you mean using something analogous to av_image_fill_pointers() but
>> for audio?
> 
> i meant data[0] + channel*linesize[0] instead of audio_data
> but as said i dont know if this is a good idea

Yes, I understand.  My first thought was that you would have to
calculate all channel pointers in the decoders that process more than
one channel at a time, which would be redundant.  But if you have a
shared function to fill the channel pointers based on data[0],
linesize[0], channels, nb_samples, and sample_fmt it wouldn't be so bad.

> 
>>> [...]
>>>> @@ -235,7 +239,7 @@
>>>>          return -1;
>>>>      }
>>>>  
>>>> -    if(av_image_check_size(w, h, 0, s))
>>>> +    if (is_video && av_image_check_size(w, h, 0, s))
>>>>          return -1;
>>> audio parameters should be checked similar to image here
>> Ok, I'll add that.  Do you think there should be a public function
>> similar to av_image_check_size()?
> 
> non public is easier as we dont need to bikeshed the API

ok

> 
>>> [...]
>>>> @@ -644,29 +677,49 @@
>>>>  }
>>>>  #endif
>>>>  
>>>> +#if LIBAVCODEC_VERSION_MAJOR < 53
>>>>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>>>>                           int *frame_size_ptr,
>>>>                           AVPacket *avpkt)
>>>>  {
>>>> +    AVFrame frame;
>>>> +    int ret, got_frame = 0;
>>>> +
>>>> +    avcodec_get_frame_defaults(&frame);
>>>> +
>>>> +    ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>>>> +
>>>> +    if (ret >= 0 && got_frame) {
>>>> +        *frame_size_ptr = frame.nb_samples * avctx->channels *
>>>> +                          (av_get_bits_per_sample_format(avctx->sample_fmt) / 8);
>>>> +
>>>> +        /* ensure data will fit in the output buffer */
>>>> +        if (*frame_size_ptr > AVCODEC_MAX_AUDIO_FRAME_SIZE) {
>>>> +            av_log(avctx, AV_LOG_WARNING, "avcodec_decode_audio3 samples "
>>>> +                   "truncated to AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
>>>> +            *frame_size_ptr = AVCODEC_MAX_AUDIO_FRAME_SIZE;
>>>> +        }
>>>> +
>>>> +        memcpy(samples, frame.data[0], *frame_size_ptr);
>>>  the default get_buffer() should return the appropriate
>>> buffer for this case.
>> I'm sorry, I don't understand your comment.
> 
> i mean (non functional psseudocode below to explain the idea)
> avcodec_decode_audio3(){
>     avctx->foobar= samples;
>     ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
>     ...
>     assert(samples == frame.data[0]);
> 
> -----
> default_get_buffer(){
>     if(avctx->foobar)
>         return avctx->foobar;
> 
> 
> (and yes this cannot work for all theoretical decoders)

I think I get it.  So avctx->foobar would be an optional user-supplied
buffer (avctx->user_buffer?) that default_get_buffer() would return if
it is non-NULL, right?

The problem I see is this:
avcodec_decode_audio3() would use avcodec_decode_audio4().
avcodec_decode_audio4() allocates as large a buffer as is needed through
get_buffer(), but the avcodec_decode_audio3() API only requires the
user-supplied buffer to be AVCODEC_MAX_AUDIO_FRAME_SIZE.  Couldn't this
lead to the decoder writing past the end of a user-supplied buffer if it
isn't large enough?  I guess we could also add a field
avctx->user_buffer_size?

-Justin