[FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode

Wed Nov 23 04:43:16 EET 2022

[AMD Official Use Only - General]

Hi Mark,

Just got the ffmpeg email, please see my answer below in [rdong].

Thanks,
Ruijing

-----Original Message-----
From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Mark Thompson
Sent: Tuesday, November 22, 2022 6:34 PM
To: ffmpeg-devel at ffmpeg.org
Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode

On 22/11/2022 20:59, Mark Thompson wrote:
> On 22/11/2022 20:26, Mark Thompson wrote:
>> On 22/11/2022 19:18, Dong, Ruijing wrote:
>>> [AMD Official Use Only - General]
>>>
>>> Hi Mark,
>>>
>>> Sorry for being late to reply to you.
>>>
>>> Your understanding is correct, and I have sent a new patch [v4] for
>>> addressing the current issue and to use driver quirk mechanism to specify only AMD VAAPI driver has this behavior, then this could be more specific.
>>>
>>> For AMD hardware, it allocates GPU memory internally for the DPB management, the output is always the final one with or without applied film-grain.
>>
>> I don't see why this requires you to write the output to the wrong surface.  Why not write it to the correct one instead?
>
> Indeed, this seems to be a trivial fix in Mesa: <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fmerge_requests%2F19938&data=05%7C01%7Cruijing.dong%40amd.com%7C0896eeffcf674092aa8408dacce228f3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047570363839738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3Be3k9TQdRjGJv2KYNsssFjl3ORaNXzdSBqei%2BgYSBg%3D&reserved=0>.
>
> It would be helpful if someone with suitable hardware could test that.

This was too naive, the Mesa driver doesn't make this easy.

It is only set up to write to a single surface, which is the one provided to vaBeginPicture().  However, VAAPI does not work that way - it wants you to write to both the pre-grain and the post-grain surfaces, where the pre-grain surface is the primary target and gets passed the vaBeginPicture() and the post-grain surface is supplied in the parameters.

So that's the first problem: the render target which is given as the pre-grain surface needs to be replaced by post-grain surface if we want to only write a single surface.

[rdong] Well, the render surface targets to output the displayable surface, and film grain has many ways to be carried on,  one way is to let application doing next step, the other way is the
decoder who can directly output the final applied grain result.  The VAAPI interface should accommodate as much as possible hardware instead of limiting the only way
for the implementation. AMD decoder for the performance optimization (tiled formats and swizzle modes) and other considerations, it will need to manage the reference frames internally, and there is no point to implement a new logic to output both pre-grain and post-grain surfaces. If grain_apply is set, then the decoder will only output the applied grain result.

Is that enough?  Well, no.  The Mesa driver is also messing with the reference frames.

The VAAPI model is that the pre-grain surfaces are passed back into the driver on subsequent frames when they are used as reference frames.  However, the Mesa driver has hidden the pre-grain surface internally and only written the post-grain surface.

Therefore, when writing a post-grain output, it magically associates with the target surface information about the pre-grain surface which was written internally at the same time.  Then, when you later give it that surface as a reference frame it ignores the actual content of the frame and looks at the associated data to find what to use internally as the reference.

That's the second problem: if the post-grain surface were actually the render target then the magic internal reference gets associated with that, and when we pass the real reference frame (the pre-grain surface) in later then it won't recognise it because it never wrote to that surface.

[rdong] In my understanding the target surface should be the final output, and the reason AMD decoder cannot output the reference frames I have already described above, in fact, the reference frame buffer pointers are used as reference to indicate how the reference picture buffer will managed internally, please understand there are some concept differences there.

How should it be fixed, then?

The best way would be to stop hiding the internal information about reference frames: if the real reference frames were visible in VAAPI then everything would just work and none of the magic internal references would be needed.

[rdong] Well, it has many difficulties to map the well designed AMD decoder to the VAAPI interface, however we could not expect everything is perfect.

If we suppose that this can't be done (maybe it is hidden behind opaque firmware which the naughty users buying the products are not allowed to see), then Mesa needs two changes:

1.  Write the output to the post-grain surface rather than the pre-grain surface.  This is nontrivial because it isn't the surface passed to vaBeginPicture(), but given the API there isn't really any way around it.

2.  Attach the magic internal reference to the pre-grain surface, /even though it wasn't the one written to/.  This makes the reference frames work, since the pre-grain surfaces will be the ones passed back in later frames.

[rdong] What is the point of passing back and forth the so called pre-grain surfaces as reference frames?

Alternatively: make new API in libva somehow.  Probably wants an attribute which indicates that vaBeginPicture() would want the post-grain surface and then ignore the surfaces in the picture parameters?  Unclear exactly how this should be specified, but whatever it is it needs to be very clear about how the references would work.

Hacking FFmpeg to use the API differently based on matching substrings in the vendor name does not seem like a good approach here, given that future Mesa decode implementations (which could be AMD or could be other hardware) may well be more sensible.  It would also doom other VAAPI users, since you can only really send this sort of hack to a few big projects like FFmpeg.

[rdong] I agree with you on this idea, however as I checked the current substring is in fact only used by AMD driver, if there were some better ideas, we would like to consider them as well.

Thanks,

- Mark
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel at ffmpeg.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpeg-devel&data=05%7C01%7Cruijing.dong%40amd.com%7C0896eeffcf674092aa8408dacce228f3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047570363839738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OSXDuJzGJCUpgkzQlV8duiClz8Eb8nVLc2MG01o%2Fs3Q%3D&reserved=0

To unsubscribe, visit link above, or email ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 19860 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20221123/5a9979fa/attachment.bin>