[FFmpeg-devel] Development of a CUDA accelerated variant of the libav vf_tonemap
Felix LeClair
felix.leclair123 at hotmail.com
Tue Jan 12 23:13:01 EET 2021
That's great! Any way for me to pull that branch or otherwise
contribute?
Have been using FFmpeg for a few years now, so hoping to be able to
give back.
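
For anyone following the thread, the Hable curve mentioned in my
original post (quoted below) is small enough to sketch per pixel. This
is a minimal plain-C sketch, not the actual vf_tonemap code: it assumes
the commonly published "Uncharted 2" constants and a white point chosen
by the caller (11.2 is a typical choice), and the names hable_curve and
hable_tonemap are hypothetical. A CUDA kernel would evaluate the same
function once per sample.

```c
/* Hable ("Uncharted 2") filmic curve.
 * The constants are the commonly published ones (an assumption here,
 * not copied from any particular filter). */
static float hable_curve(float x)
{
    const float a = 0.15f, b = 0.50f, c = 0.10f,
                d = 0.20f, e = 0.02f, f = 0.30f;
    return (x * (x * a + b * c) + d * e) / (x * (x * a + b) + d * f) - e / f;
}

/* Map a linear-light sample to [0, 1], normalised so that white_point
 * maps to 1.0. Negative inputs are clamped to 0 and results above 1.0
 * are clamped to 1.0. */
static float hable_tonemap(float x, float white_point)
{
    if (x < 0.0f)
        x = 0.0f;
    float y = hable_curve(x) / hable_curve(white_point);
    return y > 1.0f ? 1.0f : y;
}
```

Calling hable_tonemap(x, 11.2f) on linear-light samples gives values in
[0, 1]; the output transfer function and 8-bit quantisation still have
to be applied afterwards.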
On Tue, Jan 12, 2021 at 5:55 am, Lynne <dev at lynne.ee> wrote:
> Jan 11, 2021, 23:27 by felix.leclair123 at hotmail.com:
>
>> Hi guys and gals, first post on this mailing list, apologies for
>> any formatting/stylistic snafus
>>
>> TL;DR: we currently have tone mapping filters (typically used to map
>> content from a 10-bit HDR source to an 8-bit SDR output) that run on
>> the CPU with zscale (which is backed by the zimg library, not zlib),
>> or hardware implementations using VAAPI or OpenCL. Having a version
>> implemented in CUDA would round out the main hwaccel types.
>>
>> Context:
>> I'm a computer engineering student up in Canada with an interest
>> in high efficiency distributed processing. As a personal project I'm
>> trying to build a cluster of Nvidia Jetson Nanos to be able to
>> handle a few dozen streams (mix of SD, HD, FHD, UHD, 4K HDR) at once
>> while drawing south of 100W at peak. These little devices can do
>> anywhere from 1 to 9 streams of content at a time depending on
>> resolution/framerate in hardware in any mix of HEVC or H.264, so 3
>> of them should get me most of the way to where I want to go (this
>> would be a 30W package capable of ~12 2160p30 10-bit -> 1080p30
>> 8-bit streams).
>>
>> The issue is that 4 little arm64 cores are just not going to be
>> able to tonemap using zscale in real time, even with the encoder and
>> decoders sharing memory with the CPU (so no PCIe memcopy penalty).
>> On the other hand, the built-in GPU and the relative simplicity of
>> most tone mapping algorithms (say Hable) should make quick work of
>> this. Unfortunately (or fortunately for me to learn with?) there
>> isn't a CUDA version of the filter.
>>
>> Question/guidance:
>> I've read through the doc on how to write filters and looked at
>> the other CUDA filters currently in the source, and have a general
>> idea of where I'm going. What I haven't been able to fully nail down
>> is how to access frames from hwupload_cuda once they are passed to
>> vf_tonemap_cuda.c, which in turn passes each frame to
>> vf_tonemap_cuda.cu for processing. I have a repo with everything
>> I've been pulling together for my project, but the piece of interest
>> is under */cuda_filter/ in the source tree.
>> <https://github.com/Camofelix/Jetson_ffmpeg_trancode_cluster/>
>>
>> Would anyone mind helping me out with how to architect this?
>>
>
> The tonemap filter is just a (very old by now) copy of libplacebo's
> tonemapping. No one has bothered to keep it in sync.
> I'm working on a libplacebo wrapper currently, so once that's merged
> there will be up-to-date hardware tonemapping.