[FFmpeg-devel] Enhancement layers in FFmpeg

Mon Aug 1 14:24:52 EEST 2022

Hey,

We need to think about possible ways to implement reasonably-transparent
support for enhancement layers in FFmpeg (SVC, Dolby Vision, ...).
There are more open questions than answers here.

From what I can tell, these are basically separate bitstreams that carry
some amount of auxiliary information needed to reconstruct the
high-quality bitstream. That is, they are not independent, but need to
be merged with the original bitstream somehow.

How do we architecturally fit this into FFmpeg? Do we define a new codec
ID for each (common/relevant) combination of base codec and enhancement
layer, e.g. HEVC+DoVi, H.264+SVC, ..., or do we transparently handle it
for the base codec ID and control it via a flag? Do the enhancement
layer packets already make their way to the codec, and if not, how do we
ensure that this is the case?
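
To make the two signalling options concrete, here is a rough sketch of how
each would look from the caller's side. Everything in it is hypothetical:
the SketchCodecID values, SKETCH_FLAG_DECODE_EL and the struct are made up
for illustration and are not existing FFmpeg identifiers or API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical codec IDs; these are NOT FFmpeg's real AVCodecID values. */
enum SketchCodecID {
    SKETCH_CODEC_H264,
    SKETCH_CODEC_HEVC,
    /* Option A: one extra ID per base codec + enhancement layer combination. */
    SKETCH_CODEC_H264_SVC,
    SKETCH_CODEC_HEVC_DOVI,
};

/* Option B: keep the base codec ID and opt in via a (hypothetical) flag. */
#define SKETCH_FLAG_DECODE_EL (1u << 0)

struct SketchDecoderConfig {
    enum SketchCodecID codec_id;
    unsigned           flags;
};

/* Option A: the codec ID alone says whether an EL is expected. */
static bool wants_el_by_id(const struct SketchDecoderConfig *cfg)
{
    assert(cfg);
    return cfg->codec_id == SKETCH_CODEC_H264_SVC ||
           cfg->codec_id == SKETCH_CODEC_HEVC_DOVI;
}

/* Option B: the codec ID stays the base codec, the flag requests the EL. */
static bool wants_el_by_flag(const struct SketchDecoderConfig *cfg)
{
    assert(cfg);
    return (cfg->flags & SKETCH_FLAG_DECODE_EL) != 0;
}
```

With option A the list of IDs grows with every new combination; with
option B existing callers that only know the base codec ID keep working
and simply never set the flag.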

Can the decoder itself recursively initialize a sub-decoder for the
second bitstream? And if so, does the decoder apply the actual
transformation, or does it merely attach the EL data to the AVFrame
somehow in a way that can be used by further filters or end users?

(What about the case of Dolby Vision, which iirc requires handling the
DoVi RPU metadata before the EL can be applied? What about instances
where the user wants the DoVi/EL application to happen on GPU, e.g. via
libplacebo in mpv/vlc?)

How does this metadata need to be attached? A second AVFrame reference
inside the AVFrame? Raw data in a big side data struct?
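
For comparison, a minimal sketch of the two attachment options. SketchFrame
and its helpers are stand-ins invented for this mail, not AVFrame or any
existing libavutil API, and the ownership rules are just one possible
choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a decoded frame; not the real AVFrame. */
struct SketchFrame {
    int width, height;
    struct SketchFrame *enhancement;  /* option A: nested decoded EL frame (owned) */
    uint8_t *el_side_data;            /* option B: opaque EL payload (owned copy)  */
    size_t   el_side_data_size;
};

static struct SketchFrame *sketch_frame_alloc(int width, int height)
{
    struct SketchFrame *f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;
    f->width  = width;
    f->height = height;
    return f;
}

/* Frees the frame together with any attached EL frame or EL payload. */
static void sketch_frame_free(struct SketchFrame *f)
{
    if (!f)
        return;
    sketch_frame_free(f->enhancement);
    free(f->el_side_data);
    free(f);
}

/* Option A: the base frame takes ownership of a second, decoded frame.
 * Returns 0 on success, -1 if an EL frame is already attached. */
static int sketch_frame_attach_el_frame(struct SketchFrame *base,
                                        struct SketchFrame *el)
{
    if (!base || !el || base->enhancement)
        return -1;
    base->enhancement = el;
    return 0;
}

/* Option B: the base frame stores a copy of the raw EL data for a later
 * filter or the end user to interpret. Returns 0 on success, -1 on error. */
static int sketch_frame_attach_el_data(struct SketchFrame *base,
                                       const uint8_t *data, size_t size)
{
    uint8_t *copy;

    if (!base || !data || size == 0 || base->el_side_data)
        return -1;
    copy = malloc(size);
    if (!copy)
        return -1;
    memcpy(copy, data, size);
    base->el_side_data      = copy;
    base->el_side_data_size = size;
    return 0;
}
```

Option A assumes the EL has already been decoded into pixels; option B
defers all interpretation to whoever consumes the frame, which is closer to
what a GPU-side consumer would want.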