[FFmpeg-devel] Timestamp problems when transcoding asf/wmav2/wmv3 to ts/aac/h264
Anders Rein
are at vizrt.com
Fri Dec 6 18:03:22 CET 2013
After transcoding an asf file with wmav2/wmv3 to a ts file with aac/h264
using the ffmpeg executable, the audio packet timestamps are wrong when
I demux the resulting file again with libavformat. I've tested with both
the ffmpeg aac encoder and libfdk_aac, and the problem remains the same.
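For reference, the transcode was done with a command line roughly like
this (file names made up):

    ffmpeg -i input.asf -c:v libx264 -c:a aac output.ts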
When I transcode, I get these warning log messages:
[aac @ 0xa55ee0] Queue input is backward in time
[mpegts @ 0xa56be0] Non-monotonous DTS in output stream 0:1; previous:
1089818, current: 1087511; changing to 1089819. This may result in
incorrect timestamps in the output file.
[mpegts @ 0xa56be0] Non-monotonous DTS in output stream 0:1; previous:
1089819, current: 1089431; changing to 1089820. This may result in
incorrect timestamps in the output file.
What is happening, as far as I understand, is that the wmav2 packets
have slightly wrong timestamps, so that sometimes the dts gap is much
smaller than the actual sample duration of the packets. wmav2 has much
larger frames than what is sent into the aac encoder. When FFmpeg uses
ff_filter_frame_needs_framing to divide the big audio frames into
smaller frames for the aac encoder, the smaller frames at the end of a
large frame get timestamps larger than that of the next big frame from
the asf demuxer.
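To make the arithmetic concrete (the numbers here are invented, not
taken from the actual file): say a demuxed wmav2 packet covers 8192
samples but its pts gap to the next packet is only 7000 samples, and it
gets split into 1024-sample frames for the aac encoder:

    /* pts, in samples, of sub-frame i carved out of one big packet
       (illustration only, invented numbers) */
    static int64_t sub_pts(int64_t big_pts, int i)
    {
        return big_pts + i * 1024;
    }

Then sub_pts(big_pts, 7) == big_pts + 7168, which is already past the
next big packet's pts of big_pts + 7000, hence the "Queue input is
backward in time" warning from the aac encoder.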
The ffmpeg executable works around this in write_frame (ffmpeg.c:545) by
bumping the offending packet's dts to one tick past the previous
packet's. This is when the "Non-monotonous DTS" warning shows up. It
works pretty well, and I can play the file afterwards. The problem comes
when I afterwards feed the file into my own software that uses
libavformat: some of the packets that I then read from the file have dts
that are not monotonically increasing.
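The fix-up amounts to something like this (a simplified paraphrase with
invented names, not the verbatim code in ffmpeg.c):

    #include <stdint.h>

    /* Bump a packet's dts so it stays strictly after the previously
       muxed one; this is what prints the "changing to ..." warning. */
    static int64_t clamp_mux_dts(int64_t dts, int64_t *last_mux_dts)
    {
        if (dts <= *last_mux_dts)
            dts = *last_mux_dts + 1;   /* previous dts + 1 */
        *last_mux_dts = dts;
        return dts;
    }

Note that when this triggers repeatedly, consecutive audio packets end
up stored with dts gaps of 1 (1089819, 1089820, ... in the log above),
far smaller than the real frame duration.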
Putting a log line in the mpegts demuxer shows that the actual dts in
the file are correct (strictly increasing); however, somewhere in
parse_packet (libavformat/utils.c:1201) the timestamps are corrupted by
compute_pkt_fields.
I don't fully understand what is going on in parse_packet, but it seems
that, with the help of the ac3 parser, each packet is split into several
smaller packets, and except for the first sub-packet the timestamps are
calculated from the duration. This calculation ends up giving dts that
are not monotonically increasing in the packets returned to the public
API.
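If I read the code right, the effect is equivalent to this hypothetical
simplification (invented names, not the actual compute_pkt_fields code):

    #include <stdint.h>

    /* The first sub-packet inherits the dts stored in the ts file; the
       rest are extrapolated by adding the nominal frame duration. */
    static void extrapolate_dts(int64_t stored_dts, int64_t duration,
                                int64_t *out_dts, int n)
    {
        out_dts[0] = stored_dts;
        for (int i = 1; i < n; i++)
            out_dts[i] = out_dts[i - 1] + duration;
    }

Since the muxer squeezed the stored dts gaps down to 1, the extrapolated
dts of the last sub-packet overshoots the stored dts of the next packet,
so the first sub-packet of that next packet appears to go backwards in
time. Can anyone help shed some light on what is going on here?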