[FFmpeg-devel] [PATCH][RFC] Add example seeking_while_remuxing.c

Andrey Utkin andrey.krieger.utkin at gmail.com
Mon Jan 27 03:58:00 CET 2014


I was asked to put together a demo of the FFmpeg API showing how to seek while
serving a "client connection". It turned out not to be so easy: I found myself
recalling the many issues I ran into while working on input stream fallback,
and those memories resulted in the large comments below.

I'm still not perfectly sure about every statement in this snippet, although on
the whole it feels close to the truth. To make it correct and useful to
everyone, I decided to propose it for review and inclusion in the official
examples.

I would be glad to hear opinions about the approaches used and described here,
and about all the statements in the comments.

In this example, I filter out all packets whose dts or pts is AV_NOPTS_VALUE.
As stated in the comments, such packets mux fine when they appear at the very
beginning of the stream, but they cause an error when fed to the muxer in the
middle of the stream (e.g. right after a seek; this is exactly what happens
with Matroska input). I don't know what else to do with them, but dropping them
obviously loses data - they can contain video keyframes.

I am very interested in finding stable, general-purpose approaches to
resolving the timestamp discontinuity caused by seeking, at all levels of
accuracy. The current patch proposes the approach with the worst possible
accuracy, but one that applies to the general case and requires no decoding or
re-encoding.

OFFTOP: Would anybody be interested if I prepared a similar showcase of
accurately applying ffmpeg filters in the middle of a video stream, without
re-encoding the whole stream?

---8<---
---
 doc/examples/seeking_while_remuxing.c | 308 ++++++++++++++++++++++++++++++++++
 1 file changed, 308 insertions(+)
 create mode 100644 doc/examples/seeking_while_remuxing.c

diff --git a/doc/examples/seeking_while_remuxing.c b/doc/examples/seeking_while_remuxing.c
new file mode 100644
index 0000000..735cba7
--- /dev/null
+++ b/doc/examples/seeking_while_remuxing.c
@@ -0,0 +1,308 @@
+/*
+ * Copyright (c) 2014 Andrey Utkin
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+/**
+ * @file
+ * libavformat/libavcodec demuxing, muxing and seeking API example.
+ *
+ * Remuxes the input file to the output file up to the 'seekfrom' time
+ * position, then seeks to the 'seekto' position and continues remuxing. The
+ * seek is performed only once (it won't loop).
+ * @example doc/examples/seeking_while_remuxing.c
+ */
+
+#include <inttypes.h>
+
+#include <libavutil/timestamp.h>
+#include <libavformat/avformat.h>
+
+#define YOU_WANT_NO_ERRORS_ABOUT_NON_MONOTONIC_TIMESTAMPS
+
+static void log_packet(const AVFormatContext *fmt_ctx, const AVPacket *pkt, const char *tag)
+{
+    AVRational *time_base = &fmt_ctx->streams[pkt->stream_index]->time_base;
+
+    fprintf(stderr, "%s: pts:%s pts_time:%s dts:%s dts_time:%s duration:%s duration_time:%s stream_index:%d\n",
+            tag,
+            av_ts2str(pkt->pts), av_ts2timestr(pkt->pts, time_base),
+            av_ts2str(pkt->dts), av_ts2timestr(pkt->dts, time_base),
+            av_ts2str(pkt->duration), av_ts2timestr(pkt->duration, time_base),
+            pkt->stream_index);
+}
+
+int main(int argc, char **argv)
+{
+    AVFormatContext *ifmt_ctx = NULL, *ofmt_ctx = NULL;
+
+    int64_t shift = 0; // Output timestamp shift caused by the seek,
+                       // in microseconds (AV_TIME_BASE_Q units)
+
+    int seek_done = 0;
+    const char *in_filename, *out_filename, *out_format_name;
+    int64_t seekfrom, seekto;
+    int ret;
+    unsigned int i;
+
+    if (argc != 6) {
+        fprintf(stderr, "Usage: %s <input file> <output file> "
+                "<output format, or empty for default> "
+                "<seekfrom: time offset to activate seek, microseconds> "
+                "<seekto: time offset to seek to, microseconds>\n", argv[0]);
+        fprintf(stderr, "Remuxes input file to output file up to 'seekfrom' "
+                "time position, then seeks to 'seekto' position and continues "
+                "remuxing. Seek is performed only once (won't loop).\n");
+        return 1;
+    }
+
+    in_filename = argv[1];
+    out_filename = argv[2];
+    out_format_name = argv[3];
+
+    ret = sscanf(argv[4], "%"PRId64, &seekfrom);
+    if (ret != 1) {
+        fprintf(stderr, "Invalid seekfrom %s\n", argv[4]);
+        return 1;
+    }
+
+    ret = sscanf(argv[5], "%"PRId64, &seekto);
+    if (ret != 1) {
+        fprintf(stderr, "Invalid seekto %s\n", argv[5]);
+        return 1;
+    }
+
+    // Initialize libavformat
+    av_register_all();
+    avformat_network_init();
+
+    // Open the file, initialize the input format context and read the
+    // container header. Some information about the file and its elementary
+    // streams is available after this call.
+    if ((ret = avformat_open_input(&ifmt_ctx, in_filename, 0, 0)) < 0) {
+        fprintf(stderr, "Could not open input file '%s'", in_filename);
+        goto end;
+    }
+
+    // Read some of the file contents to gather full information about the
+    // elementary streams. This can be unnecessary in some cases, but in the
+    // general case it is a needed step.
+    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0) {
+        fprintf(stderr, "Failed to retrieve input stream information");
+        goto end;
+    }
+
+    // Dump input file and its elementary streams properties to stderr
+    av_dump_format(ifmt_ctx, 0, in_filename, 0);
+
+    // Open the output context, with the specified container format if one is given
+    ret = avformat_alloc_output_context2(&ofmt_ctx, NULL,
+            out_format_name[0] ? out_format_name : NULL, out_filename);
+    if (ret < 0) {
+        fprintf(stderr, "Failed to open output context by URL %s\n", out_filename);
+        goto end;
+    }
+
+    // Create the same elementary streams in the output file as in the input file
+    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
+        AVStream *in_stream = ifmt_ctx->streams[i];
+        AVStream *out_stream = avformat_new_stream(ofmt_ctx, in_stream->codec->codec);
+        if (!out_stream) {
+            fprintf(stderr, "Failed allocating elementary output stream\n");
+            ret = AVERROR_UNKNOWN;
+            goto end;
+        }
+
+        ret = avcodec_copy_context(out_stream->codec, in_stream->codec);
+        if (ret < 0) {
+            fprintf(stderr, "Failed to copy elementary stream properties\n");
+            goto end;
+        }
+        if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
+            out_stream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
+    }
+
+    av_dump_format(ofmt_ctx, 0, out_filename, 1);
+
+    // Initialize the actual output at the protocol, output device or file level
+    ret = avio_open(&ofmt_ctx->pb, out_filename, AVIO_FLAG_WRITE);
+    if (ret < 0) {
+        fprintf(stderr, "Could not open output to '%s'", out_filename);
+        goto end;
+    }
+
+    // Last step of output initialization: the container format "driver" is
+    // initialized, which generally writes header data to the output file.
+    ret = avformat_write_header(ofmt_ctx, NULL);
+    if (ret < 0) {
+        fprintf(stderr, "Error occurred when writing the output file header\n");
+        goto end;
+    }
+
+    // Copy input elementary streams to output at packed frames level.
+    // This process is known as remuxing (remultiplexing). It consists of
+    // demultiplexing (demuxing) streams from input and multiplexing (muxing)
+    // to output.
+    // No image/sound decoding takes place in this case.
+    while (1) {
+        AVPacket pkt;
+        AVStream *in_stream, *out_stream;
+        int64_t current_dts_mcs;
+
+        memset(&pkt, 0, sizeof(pkt));
+        ret = av_read_frame(ifmt_ctx, &pkt);
+        if (ret < 0)
+            break;
+
+        log_packet(ifmt_ctx, &pkt, "in");
+
+        if (pkt.dts == AV_NOPTS_VALUE || pkt.pts == AV_NOPTS_VALUE) {
+            // TODO Decode to figure out the timestamps? Anyway, decoding is
+            // out of scope of this example for now.
+            //
+            // Such packets happen to be keyframes in Matroska, so dropping
+            // them means losing data.
+            // When they are remuxed at the beginning of the stream, it's OK,
+            // but av_interleaved_write_frame() raises a non-monotonicity
+            // error when they are pushed after a seek (i.e. when correctly
+            // timestamped packets came before them).
+            printf("Discarding packet without timestamps\n");
+            av_free_packet(&pkt);
+            continue;
+        }
+
+        in_stream  = ifmt_ctx->streams[pkt.stream_index];
+        out_stream = ofmt_ctx->streams[pkt.stream_index];
+
+        current_dts_mcs = av_rescale_q(pkt.dts, in_stream->time_base, AV_TIME_BASE_Q);
+
+        // Check if it's time to seek
+        if (!seek_done && current_dts_mcs >= seekfrom) {
+            av_free_packet(&pkt);
+            printf("Seeking. Last read packet is discarded\n");
+            ret = av_seek_frame(ifmt_ctx, -1, seekto, 0);
+            if (ret < 0) {
+                fprintf(stderr, "Seeking failed\n");
+                break;
+            }
+            seek_done = 1;
+            shift = seekfrom - seekto;
+            continue;
+        }
+
+#ifdef YOU_WANT_NO_ERRORS_ABOUT_NON_MONOTONIC_TIMESTAMPS
+        if (seek_done && current_dts_mcs < seekto) {
+            printf("Discarding packet with timestamp lower than needed\n");
+            av_free_packet(&pkt);
+            continue;
+            // Citing the official ffmpeg docs:
+            // "Note that in most formats it is not possible to seek exactly,
+            // so ffmpeg will seek to the closest seek point before (the
+            // given) position."
+            //
+            // To seek exactly (accurately), without possibly losing keyframes
+            // or introducing desync, and while staying safe against the
+            // timestamp monotonicity problem, you must re-encode part of the
+            // video after the seeking point, to produce a keyframe where you
+            // want playback to start after seeking. You may also want to fill
+            // possible time gaps with silence (for audio) or duplicated
+            // frames (for video) to support technically poor playback clients
+            // (e.g. the Flash plugin), and this is also achievable with
+            // re-encoding. This is simpler if you are already transcoding
+            // rather than remuxing.
+            //
+            // Note: if you need to fill audio gaps (e.g. for the Flash
+            // player) and avoid even the smallest desync, and the audio
+            // output encoding does not allow variable frame length, in
+            // certain situations you may have to stay in re-encoding mode
+            // until the end of the stream, because the timestamp shift may
+            // not be a multiple of the audio frame duration.
+            //
+            // Note 2: audio packets' dts and pts do not always accurately
+            // represent reality. Ultimately accurate accounting of audio data
+            // duration and time offset can be achieved by counting the number
+            // of audio samples transmitted.
+            //
+            // The most important and practical part:
+            //
+            // In this example, for simplicity, we accept the possibility of
+            // losing a keyframe (which can in some cases lead to a broken
+            // image for some period after seeking). Desync is not introduced,
+            // because we shift all elementary streams' timestamps by the same
+            // offset, although see Note 2.
+            //
+            // Another technically similar approach is to just push packets
+            // carelessly into the muxer after seeking (with any rough shift
+            // calculation), ignoring AVERROR(EINVAL) return values from it.
+            // Well, you'd better ignore such errors anyway, because the input
+            // stream can already contain non-monotonic DTS; this indeed
+            // happens with some files. Alternatively, you may track
+            // timestamps yourself to filter out unordered packets, or maybe
+            // even reorder them.
+            //
+            // The chosen approach is generally bad, because failing to
+            // correctly transmit a video keyframe breaks the playback of up
+            // to several seconds of video. But it is simple and does not
+            // require anything beyond basic remuxing.
+        }
+#endif
+
+        // We rescale timestamps because the time units used by the input and
+        // output container formats may differ.
+        // E.g. for MPEG-TS the time unit is 1/90000 of a second, for FLV it is 1/1000, etc.
+        pkt.pts = av_rescale_q(pkt.pts, in_stream->time_base, out_stream->time_base)
+            + av_rescale_q(shift, AV_TIME_BASE_Q, out_stream->time_base);
+        pkt.dts = av_rescale_q(pkt.dts, in_stream->time_base, out_stream->time_base)
+            + av_rescale_q(shift, AV_TIME_BASE_Q, out_stream->time_base);
+
+        pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
+        pkt.pos = -1;
+        log_packet(ofmt_ctx, &pkt, "out");
+
+        ret = av_interleaved_write_frame(ofmt_ctx, &pkt);
+        if (ret < 0) {
+            if (ret == AVERROR(EINVAL)) {
+                printf("Muxing error, presumably due to non-monotonic DTS; can be ignored\n");
+            } else {
+                fprintf(stderr, "Error muxing packet\n");
+                break;
+            }
+        }
+        av_free_packet(&pkt);
+    }
+
+    // Deinitialize the format driver; this finalizes the output file/stream appropriately.
+    av_write_trailer(ofmt_ctx);
+
+end:
+    // Close the input format context and release related memory
+    avformat_close_input(&ifmt_ctx);
+
+    // Close output file/connection context
+    if (ofmt_ctx)
+        avio_close(ofmt_ctx->pb);
+
+    // Free the format context of the output file
+    avformat_free_context(ofmt_ctx);
+
+    // Check whether we got here because of an error; if so, decode its meaning and report it
+    if (ret < 0 && ret != AVERROR_EOF) {
+        fprintf(stderr, "Error occurred: %s\n", av_err2str(ret));
+        return 1;
+    }
+    return 0;
+}
-- 
1.8.1.5
