[FFmpeg-devel] [PATCH] avformat/dv: fix timestamps of audio packets in case of dropped corrupt audio frames

Mon Nov 2 16:47:58 EET 2020

On Sun, Nov 01, 2020 at 09:58:43PM +0100, Marton Balint wrote:
> 
> 
> On Sun, 1 Nov 2020, Michael Niedermayer wrote:
> 
> > On Sat, Oct 31, 2020 at 05:56:24PM +0100, Marton Balint wrote:
> > > Fixes out of sync timestamps in ticket #8762.
> > > 
> > > Signed-off-by: Marton Balint <cus at passwd.hu>
> > > ---
> > >  libavformat/dv.c       | 16 ++--------------
> > >  tests/ref/seek/lavf-dv | 18 +++++++++---------
> > >  2 files changed, 11 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/libavformat/dv.c b/libavformat/dv.c
> > > index 3e0d12c0e3..26a78139f5 100644
> > > --- a/libavformat/dv.c
> > > +++ b/libavformat/dv.c
> > > @@ -49,7 +49,6 @@ struct DVDemuxContext {
> > >      uint8_t           audio_buf[4][8192];
> > >      int               ach;
> > >      int               frames;
> > > -    uint64_t          abytes;
> > >  };
> > > 
> > >  static inline uint16_t dv_audio_12to16(uint16_t sample)
> > > @@ -258,7 +257,7 @@ static int dv_extract_audio_info(DVDemuxContext *c, const uint8_t *frame)
> > >              c->ast[i] = avformat_new_stream(c->fctx, NULL);
> > >              if (!c->ast[i])
> > >                  break;
> > > -            avpriv_set_pts_info(c->ast[i], 64, 1, 30000);
> > > +            avpriv_set_pts_info(c->ast[i], 64, c->sys->time_base.num, c->sys->time_base.den);
> > >              c->ast[i]->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
> > >              c->ast[i]->codecpar->codec_id   = AV_CODEC_ID_PCM_S16LE;
> > > 
> > > @@ -387,8 +386,7 @@ int avpriv_dv_produce_packet(DVDemuxContext *c, AVPacket *pkt,
> > >      for (i = 0; i < c->ach; i++) {
> > >          c->audio_pkt[i].pos  = pos;
> > >          c->audio_pkt[i].size = size;
> > > -        c->audio_pkt[i].pts  = c->abytes * 30000 * 8 /
> > > -                               c->ast[i]->codecpar->bit_rate;
> > > +        c->audio_pkt[i].pts  = (c->sys->height == 720) ? (c->frames & ~1) : c->frames;
> > >          ppcm[i] = c->audio_buf[i];
> > >      }
> > >      if (c->ach)
> > > @@ -401,10 +399,7 @@ int avpriv_dv_produce_packet(DVDemuxContext *c, AVPacket *pkt,
> > >              c->audio_pkt[2].size = c->audio_pkt[3].size = 0;
> > >          } else {
> > >              c->audio_pkt[0].size = c->audio_pkt[1].size = 0;
> > > -            c->abytes           += size;
> > >          }
> > > -    } else {
> > > -        c->abytes += size;
> > >      }
> > > 
> > >      /* Now it's time to return video packet */
> > 
> > Please correct me if iam wrong but
> > in cases where no audio is missing or damaged, this would also ignore how much
> > audio is in each packet. So you could have lets say a timestamp difference
> > of excatly 1 second between 2 packets while their is actually not exactly
> > 1 second worth of audio samples between them.
> 
> This is true, by using the frame counter (and the video time base) for
> audio, we lose some audio packet timestamp precision inherently. However I
> don't consider this a problem, audio timestamps do not have to be sample
> accurate, for most formats they are not. 

> Also it is not practical to keep
> track of how many samples are there in the packets, for example when you do
> seeking, obviously you can't read all the audio data before the seek point
> to get a precise sample accurate timestamp.

Its true that with seeking there is not enough information for sample precisse
timestamps. But from packet to packet as long as no seek happened there is.
My concern was more about something like significant frame to frame
differences in audio sample numbers. 
Because if some hw or sw generates this we would produce packets of
identical duration which differ substantially in number of samples and
that would not be handled well in any scenario that accepted the timestamps
and durations as exact.

Maybe this never occurs and in that case your patch should be a good idea
but if it does happen then some code would be needed to deal with that.
It is detectable when sample counts do not match what is expected.

That said, i would consider a fix for #8762 to produce correct audio in
all cases including wav/pcm/mov/... output and not just when the output
can store "corrupted"/"sparse" audio.
Also to me returning the data from the input file which would represent audio
if it was not corrupt seems to be somehow the "correct" thing to do.
Maybe this never contains any useful data then it doesnt matter in
reality but still it feels a bit odd to fix just the timestamps.
But maybe that was not what you intended anyway?

thx
[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If the United States is serious about tackling the national security threats 
related to an insecure 5G network, it needs to rethink the extent to which it
values corporate profits and government espionage over security.-Bruce Schneier
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20201102/8bb51acb/attachment.sig>