[FFmpeg-devel] [PATCH 3/3] nlmeans_vulkan: parallelize workgroup invocations

Lynne dev at lynne.ee
Sat Oct 7 18:07:52 EEST 2023


Removes the clever subgroup-based parallel prefix computation,
and instead just computes the prefix sum inline.
This greatly cuts down the number of dispatches.
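
For readers unfamiliar with the idea, here is a rough CPU-side sketch in C of
what "computing the prefix inline" can look like. It is not taken from the
patch: the function names are hypothetical, and it assumes the prefix sums are
used to build an integral image, with each call to the helper standing in for
one invocation that walks a whole row or column by itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the patch: one "invocation" walks a single
 * row or column and accumulates the running (inclusive) sum in place.
 * stride is the distance in elements between consecutive samples. */
static void prefix_sum_inline(float *data, size_t count, size_t stride)
{
    assert(data != NULL);

    float acc = 0.0f;
    for (size_t i = 0; i < count; i++) {
        acc += data[i * stride];
        data[i * stride] = acc;
    }
}

/* Build an integral image in place: a horizontal pass over every row,
 * followed by a vertical pass over every column. On a GPU the iterations
 * of each outer loop would be independent invocations. */
static void integral_image(float *img, size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++)
        prefix_sum_inline(img + y * width, width, 1);

    for (size_t x = 0; x < width; x++)
        prefix_sum_inline(img + x, height, width);
}
```

Each row or column is handled serially by one worker, so no cross-invocation
communication is needed within a pass; the parallelism comes from processing
many rows or columns at once.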

Provides a ~12x speedup (2.5 fps to 30 fps on a 7900XTX,
2.1 fps to 24 fps on an Ada GPU).

Patch attached.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-nlmeans_vulkan-parallelize-workgroup-invocations.patch
Type: text/x-diff
Size: 38467 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20231007/12dc0439/attachment.patch>