[FFmpeg-devel] [PATCH 0/2] Implement SMPTE 2038 output support over Decklink SDI

Devin Heitmueller devin.heitmueller at ltnglobal.com
Mon Apr 24 17:11:25 EEST 2023


Hello Marton,

Thanks for reviewing.  Comments inline:

On Sun, Apr 23, 2023 at 2:43 PM Marton Balint <cus at passwd.hu> wrote:
> In general, queueing packets in specific components should be avoided if
> possible. Muxed packets are normally ordered by DTS and stream id, generic
> code ensures that. If you want something other than that, then I think
> the preferred way of doing it is by providing a custom interleave
> function. (e.g. to ensure you get data packets before video even if data
> stream has a higher stream ID.)

To be clear, using a queue was not my first choice.  It's the result of
trying different approaches, and I'm open to constructive suggestions
on alternatives.

While what you're saying is correct "in general", there are some
really important reasons why it doesn't work in this case.  Permit me
to explain...

By default, the mux interleaver waits until there is at least one
packet available for each stream before writing anything to the
output module (in this case decklink).  However, data formats such
as SMPTE ST 2038 are "sparse": there isn't necessarily a continuous
stream of packets like there is with video and audio (there may be
many seconds between packets, or no packets at all).  As a result
you can't wait for a packet to be available on every stream; for a
sparse stream the interleaver will simply keep waiting until it
hits max_interleave_delta, at which point it bursts out everything
in the queue.  This would cause stalls and/or stuttering playback
on the decklink output.
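
To make that concrete, here is a deliberately simplified sketch of
the rule being described; this is not the actual mux.c code, and
stream_has_queued_packet() is a hypothetical helper:

    static int can_flush(AVFormatContext *s, int64_t buffered_delta)
    {
        for (unsigned i = 0; i < s->nb_streams; i++)
            if (!stream_has_queued_packet(s, i))  /* hypothetical helper */
                return buffered_delta > s->max_interleave_delta;
        return 1;  /* every stream has data queued, safe to interleave */
    }

With a sparse ST 2038 stream the loop almost always finds an empty
stream, so output is held back until the delta check finally trips.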

To accommodate these sparse streams we added code to mux.c to not
wait for ST 2038 packets.  A side effect of that, though, is that
data packets are sent through as soon as they hit the mux, which in
most cases is significantly ahead of the video (potentially
hundreds of milliseconds).  This is easy to see experimentally by
adding an av_log() line to ff_decklink_write_packet(); in many
cases it shows the data frames arriving with PTS values 20+ frames
ahead of the corresponding video.
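
For reference, that instrumentation is nothing more than a one-liner
of roughly this shape inside ff_decklink_write_packet() (avctx and
pkt are the function's existing parameters; the exact wording is
illustrative, not part of the patch):

    av_log(avctx, AV_LOG_INFO, "write_packet: stream %d (%s) pts %"PRId64"\n",
           pkt->stream_index,
           av_get_media_type_string(avctx->streams[pkt->stream_index]->codecpar->codec_type),
           pkt->pts);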

The queue is there because the data packets and video frames arrive
in separate calls to write_packet(), and they need to be combined
so that each data packet is inserted into its corresponding video
frame.  Stashing the data packets seemed like a reasonable
approach, and a queue seemed like a good choice of data structure,
since there can be multiple data packets for a single video frame
and we might receive data packets for multiple video frames before
the corresponding video frames arrive.
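
To sketch that matching step (with made-up names: vanc_queue_peek_pts(),
vanc_queue_get() and embed_st2038_in_vanc() are placeholders, not the
functions in the patch), when a video frame is about to go out, every
queued data packet whose PTS maps to that frame is drained and embedded
into the frame's VANC region, roughly:

    int64_t next_pts;

    /* peek at the PTS of the head packet without removing it */
    while (vanc_queue_peek_pts(ctx, &next_pts) >= 0 &&
           next_pts <= video_pkt->pts) {
        AVPacket data_pkt;
        vanc_queue_get(ctx, &data_pkt);         /* dequeue the matching packet */
        embed_st2038_in_vanc(frame, &data_pkt); /* convert via libklvanc and
                                                   write into the VANC lines */
        av_packet_unref(&data_pkt);
    }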

The possibility you mention of data packets arriving after the
video frames is a valid concern in theory.  In practice it hasn't
been an issue, as the data packets tend to arrive long before the
video, and it was not the motivation for using a queue.  If a data
packet did arrive after the video (due to the DTS and stream ID
ordering you mentioned), the implementation would insert it into
the next video frame and it would effectively be one frame late.  I
was willing to accept this edge case given that it doesn't actually
happen in practice.

> If you are only using the queue to store multiple data packets for a
> single frame then one way to avoid it is to parse them as soon as they
> arrive via the KLV library. If you insist on queueing them (maybe because
> not every packet will be parsed by the KLV lib), then I'd rather see you
> use avpriv_packet_list_*() functions, and not a custom decklink
> implementation.

Passing them off to libklvanc doesn't actually change the queueing
problem.  The libklvanc library doesn't output the VANC packets
itself; it just converts them into the byte sequences that then
need to be embedded into the video frames.  I could queue the
output of libklvanc rather than the original AVPackets, but that
doesn't solve any of the problems described above and in fact makes
things more complicated: the AVPackets carry all of the timing
data, so for each VANC byte blob I would need to queue not just the
payload but also the output timing, the VANC line number, and the
horizontal position within the VANC region.
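
Purely to illustrate that point (these names are mine, not
libklvanc's), queueing the converted output instead of the
AVPackets would mean carrying around something like:

    typedef struct QueuedVancBlob {
        int64_t   pts;        /* output timing, taken from the AVPacket   */
        int       line;       /* VANC line number                         */
        int       horiz;      /* horizontal offset within the VANC region */
        uint16_t *words;      /* converted VANC words from libklvanc      */
        size_t    nb_words;
    } QueuedVancBlob;

i.e. re-inventing most of what the AVPacket already provides.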

Regarding the use of avpriv_packet_list_*() as opposed to
avpacket_queue_*(): I used the avpacket_queue functions for
consistency with the decklink capture module, where they are used
today.  Also, avpacket_queue is thread-safe while
avpriv_packet_list_*() is not.  While thread safety is not critical
for the VANC case, I have subsequent patches for audio where it is
important, and I figured it would be more consistent to use the
same queue mechanism within decklink for all three (capture, audio
output, and VANC output).

That said, I wouldn't specifically object to converting to the
avpriv_packet_list functions, since thread safety isn't really a
requirement for this particular case.  It's probably worth noting,
though, that I extended the avpacket_queue code to allow peeking at
the first packet in the queue (which avpriv_packet_list doesn't
support today), so converting to avpriv_packet_list would require
an equivalent addition to be accepted upstream.
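
For what it's worth, that addition would be small.  A rough sketch,
assuming the PacketList layout in libavcodec/packet_internal.h
(illustrative only, not a proposed patch):

    static int packet_list_peek(const PacketList *list, AVPacket *pkt)
    {
        if (!list->head)
            return AVERROR(EAGAIN);
        /* hand back a reference to the head packet without removing it */
        return av_packet_ref(pkt, &list->head->pkt);
    }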

Devin

--
Devin Heitmueller, Senior Software Engineer
LTN Global Communications
o: +1 (301) 363-1001
w: https://ltnglobal.com  e: devin.heitmueller at ltnglobal.com

