[FFmpeg-devel] [PATCH] avcodec_decode_audio3 and multiple frames in a packet

Wed Sep 16 21:58:16 CEST 2009

On Wed, Sep 16, 2009 at 09:24:24PM +0200, Sascha Sommer wrote:
> Hi,
> 
> On Mittwoch, 16. September 2009, Michael Niedermayer wrote:
> > On Wed, Sep 16, 2009 at 05:51:03PM +0200, Sascha Sommer wrote:
> > > Hi,
> > >
> > > On Mittwoch, 16. September 2009, Michael Niedermayer wrote:
> > > > On Wed, Sep 16, 2009 at 02:52:29PM +0200, Sascha Sommer wrote:
> > > > > Hi,
> > > > >
> > > > > On Samstag, 12. September 2009, Justin Ruggles wrote:
> > > > > > Michael Niedermayer wrote:
> > > > > > > On Fri, Sep 11, 2009 at 12:13:02PM +0200, Sascha Sommer wrote:
> > > > > > > [...]
> > > > > > >
> > > > > > >> Previously a value of 0 meant that no frame was decoded.
> > > > > > >
> > > > > > > no
> > > > > > > you read the docs backward, it says if no frame was decoded
> > > > > > > return 0 it does not say that 0 means no frames have been
> > > > > > > decoded, it could equally well mean 0 bytes used
> > > > > >
> > > > > > Ah, good.  So, although the current text is technically correct if
> > > > > > interpreted that way, it is ambiguous.  Why do we need to have a 0
> > > > > > return value also possibly mean no frames have been decoded?   If
> > > > > > frame_size_ptr is set to 0, that always means no frames have been
> > > > > > decoded, without regard to the return value.  And a return value of
> > > > > > 0 should mean zero bytes were used, without regard to what
> > > > > > frame_size_ptr is set to.  They seem mutually exclusive to me...
> > > > >
> > > > > I agree. The return value controls the number of input bytes,
> > > > > frame_size_ptr the number of output bytes. I don't see why 0 needs to
> > > > > be returned when no frame was outputted.
> > > >
> > > > what exactly did the decoder then do with the data?
> > > > and what was that data it did not decode?
> > >
> > > Maybe it skipped it? For example when the packet contained only DSE
> > > syntax elements in AAC. I did not check the spec to see whether this
> > > can happen or what these Data Stream Elements are actually used for,
> > > but as we already found out, ffmpeg does
> > >                 avpkt.data += ret;
> > >                 avpkt.size -= ret;
> > > So with a return value of 0 this will always decode the same data.
> >
> > yes
> > a packet that is input to a decoder MUST in general contain 1 frame.
> 
> 1 frame that can be decoded by the decoder, right?
> But can we under all circumstances know what the decoder produces out of this 
> 1 input frame?

no it can have an error and return a negative value representing the type
of error
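
A minimal sketch of the return-value convention under discussion, assuming a
toy bitstream and a hypothetical toy_decode() that stands in for the real
decoder call (it is not the libavcodec API). The return value counts input
bytes consumed, a negative value signals an error, and a separate output-size
argument reports how much was produced. The caller advances by the return
value, as in the avpkt.data += ret loop quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a decoder call; NOT the libavcodec API.
 * Toy bitstream: one length byte N followed by N payload bytes.
 * N == 0 models a chunk that carries no samples: it is consumed,
 * but nothing is output.
 * Returns the number of input bytes consumed, or a negative value on error.
 * *out_size receives the number of output bytes (0 if nothing was produced). */
static int toy_decode(const uint8_t *buf, int buf_size,
                      uint8_t *out, int *out_size)
{
    assert(buf_size >= 0);
    *out_size = 0;
    if (buf_size == 0)
        return 0;                   /* nothing to consume */
    int n = buf[0];
    if (n > buf_size - 1)
        return -1;                  /* truncated chunk: report an error */
    memcpy(out, buf + 1, (size_t)n); /* "decoding" is just a copy here */
    *out_size = n;
    return 1 + n;
}

/* Drain a packet that may hold several chunks.
 * Returns the total number of output bytes, or a negative value on error. */
static int drain_packet(const uint8_t *data, int size,
                        uint8_t *out, int out_cap)
{
    int total = 0;
    while (size > 0) {
        uint8_t frame[255];
        int got = 0;
        int ret = toy_decode(data, size, frame, &got);
        if (ret < 0)
            return ret;             /* propagate the decoder's error code */
        if (ret == 0 && got == 0)
            break;                  /* no progress: stop instead of
                                       decoding the same data forever */
        if (total + got > out_cap)
            return -2;              /* output buffer too small */
        memcpy(out + total, frame, (size_t)got);
        total += got;
        data += ret;                /* advance by the bytes consumed */
        size -= ret;
    }
    return total;
}
```

In this sketch a chunk without samples still returns a positive byte count, so
the loop moves past it. A return of 0 with no output would leave data and size
unchanged, which is why the loop has to stop in that case.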

> 
> > Now, there are formats that use inseparable frames intermingled where
> > a decoder has to be fed with more than 1 frame.
> > Packets that contain no frame at all would have a duration of 0, would
> > have dts equal to the next packet, and would _not_ have a pts because
> > they are not presented; practically no container could store them ...
> 
> Still there could be a more or less "raw" format X that contains only the 
> information that it contains an audio frame of format Y with len Z bytes. 
> Audio format Y could have variable samples per frame, allowing even 0 samples 

We could also consider changing the API once such an X-Y-Z turns up in reality.
Besides, my definition of a frame contains a positive number of samples, not 0
and not negative.

> (these frames could for example contain some decode tables for the next 
> frames)
> As format X is meant to be generic it does not know what audio format Y stores 
> in its frames and can't concatenate the frames so that always one frame is 
> outputted.
> 
> >
> > please don't change the API to support this, at least not without first
> > explaining to me how it could work on the muxer side or even how the
> > demuxer side should deal with all the special cases for timestamps; it
> > surely does not currently
> 
> I think when the format is decoded, we only need to think about what comes out 
> of the decoder and here, when a new sample is outputted, the sum of all 
> previously outputted samples can be used to calculate the pts.

not in reality. No sound recording device has a perfect clock; the sample rate
is not exact, nor is it perfectly constant.
You might be lucky, of course, that it's good enough or that it has been
resampled before muxing, but if not, your sum of samples can differ from the
correct pts.
Of course it's good enough for a single packet, but possibly not for a whole file.
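
To put a rough number on this, here is a small sketch, assuming a device that
is nominally 48000 Hz but actually runs at a constant 48005 Hz (both figures
and both helper names are made up for illustration). A pts computed purely
from the running sample count then drifts by about 104 microseconds per
second, which adds up to 0.375 seconds after one hour.

```c
#include <assert.h>
#include <stdint.h>

/* Timestamp in microseconds derived purely from a running sample count,
 * assuming the nominal sample rate is exact. */
static int64_t pts_from_samples_us(int64_t samples, int nominal_rate)
{
    assert(nominal_rate > 0);
    return samples * 1000000 / nominal_rate;
}

/* Difference (in microseconds) between the sample-count pts and real
 * elapsed time, for a device that actually produced actual_rate samples
 * per second during elapsed_s seconds. Illustrative model only: a real
 * clock would also wander over time. */
static int64_t drift_us(int nominal_rate, int actual_rate, int64_t elapsed_s)
{
    int64_t samples = (int64_t)actual_rate * elapsed_s;
    return pts_from_samples_us(samples, nominal_rate) - elapsed_s * 1000000;
}
```

For example, drift_us(48000, 48005, 3600) evaluates to 375000, while
drift_us(48000, 48005, 1) is only 104, which matches the point that the error
is negligible within one packet but not necessarily over a whole file.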

> Input packets 
> with duration 0 do not need to result in an output packet, or how are they 
> currently handled?

like a bug in the demuxer. Besides, duration==0 means unknown duration in the
code, not zero duration.

> Muxing the compressed data is another story but please let's figure out how to 
> do the decoding first.

both muxing and decoding have worked in ffmpeg for a few years; a change to the
API will have to keep both functioning. So what is it that you meant?

(yes, I do feel a little upset about the API change discussion prior to
 ANY explanation of why the current API would be worse than the proposed one)
I would really prefer it if you would first describe a _real_ situation in
which the current API is insufficient.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus