[FFmpeg-devel] Enhancement layers in FFmpeg

Soft Works softworkz at hotmail.com
Mon Aug 1 17:26:26 EEST 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Niklas Haas
> Sent: Monday, August 1, 2022 3:59 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] Enhancement layers in FFmpeg
> 
> On Mon, 01 Aug 2022 13:17:12 +0000 Soft Works <softworkz at hotmail.com>
> wrote:
> > From my (rather limited) angle of view, my thoughts are these:
> >
> > When decoding these kinds of sources, a user would typically not
> only
> > want to do the processing in hardware but the decoding as well.
> >
> > I think we cannot realistically expect that any of the hw decoders
> > will add support for this in the near future. As we cannot modify
> > those ourselves, the only way to do such processing would be a
> > hardware filter. I think, the EL data would need to be attached
> > to frames as some kind of side data (or similar) and get uploaded
> > by the hw filter (internally) which will apply the EL data.
> 
> If both the BL and the EL are separate fully coded bitstreams, then
> could we instantiate two independent HW decoder instances to decode
> the
> respective planes?

Sure. TBH, I didn't know that the EL data is encoded in the same
way. I wonder what those frames would look like when viewed standalone..


> > IMO it would be desirable when both of these things would/could be
> > done in a single operation.
> 
> For Dolby Vision we have little choice in the matter. The EL
> application
> needs to happen *after* chroma interpolation, PQ linearization, IPT
> matrix application, and poly/MMR reshaping. These are currently all
> on-GPU processes in the relevant video output codebases.
> 
> So for Dolby Vision that locks us into the design where we merely
> expose
> the EL planes as part of the AVFrame and leave it to be the user's
> problem 

If ffmpeg cannot apply it, then I don't think there will be many users
able to make use of it :-)


> (or the problem of filters like `vf_libplacebo`).

Something I always wanted to ask you: is it even feasible to port
this to a CPU implementation (with reasonable performance)?


> An open question (for me) is whether or not this is required for
> SVC-H264, SHVC, AV1-SVC etc.
> 
> > As long as it doesn't have its own format, its own start time,
> > resolution, duration, color space/transfer/primaries, etc..
> > I wouldn’t say that it's a frame.
> 
> Indeed, it seems like the EL data is tied directly to the BL data for
> the formats I have seen so far. So they are just like extra planes on
> the AVFrame - and indeed, we could simply use extra data pointers
> here
> (we already have room for 8).

Hendrik's idea makes sense to me when the EL is not just some side
data but real frames, decoded by a regular decoder.
That said, I don't know anything about the other enhancement-layer
cases either.

Best regards,
softworkz





