[FFmpeg-devel] [PATCH 3/3] nlmeans_vulkan: parallelize workgroup invocations

Lynne dev at lynne.ee
Wed Oct 11 06:34:57 EEST 2023


Oct 7, 2023, 17:08 by dev at lynne.ee:

> Removes the clever subgroup parallel prefix computation,
> and instead just computes the prefix inline.
> Cuts down the number of dispatches by a huge amount.
>
> Provides a ~12x speedup (2.5fps to 30fps on a 7900XTX,
> 2.1fps to 24fps on an Ada).
>
> Patch attached.
>

Going to push the patchset a bit later today.
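
For readers unfamiliar with the term in the quoted description, "computing
the prefix inline" can be read as a plain serial running sum done by one
invocation, as opposed to a cooperative scan spread across a subgroup. The
sketch below is only a CPU-side illustration in C under that assumption. It
is not the shader code from the patch, and the helper name
prefix_sum_inline is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: inclusive prefix sum of
 * one row, computed serially in a single loop. dst[i] receives
 * src[0] + ... + src[i]. Unsigned arithmetic wraps on overflow. */
static void prefix_sum_inline(const uint32_t *src, uint32_t *dst, size_t n)
{
    uint32_t acc = 0;

    assert(src && dst);
    for (size_t i = 0; i < n; i++) {
        acc += src[i];   /* running total up to and including element i */
        dst[i] = acc;
    }
}
```

Each row or column handled this way is independent of the others, so many
such loops can in principle run side by side, one per invocation.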

