[FFmpeg-devel] [PATCH] avfilter: add vf_overlay_cuda
Alex
3.14pi at ukr.net
Wed Apr 1 17:00:22 EEST 2020
Hi! My GPU is a GTX 1080 Ti.
I tried your command but got the same error.
I tested with the Windows build downloaded from https://ffmpeg.zeranoe.com/builds/
Stream mapping:
Stream #0:0 (h264) -> overlay_cuda:main
Stream #1:0 (png) -> format
overlay_cuda -> Stream #0:0 (h264_nvenc)
Press [q] to stop, [?] for help
[h264 @ 00000231eee7ce40] NVDEC capabilities:
[h264 @ 00000231eee7ce40] format supported: yes, max_mb_count: 65536
[h264 @ 00000231eee7ce40] min_width: 48, max_width: 4096
[h264 @ 00000231eee7ce40] min_height: 16, max_height: 4096
[h264 @ 00000231eee7ce40] Reinit context to 1280x720, pix_fmt: cuda
[graph 0 input from stream 1:0 @ 0000023182422180] w:1894 h:302 pixfmt:rgba tb:1/25 fr:25/1 sar:11811/11811
[graph 0 input from stream 0:0 @ 000002318bbe1540] w:1280 h:720 pixfmt:cuda tb:1/24000 fr:24000/1001 sar:1/1
[auto_scaler_0 @ 000002318bbe55c0] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 00000231825e4bc0] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_format_0'
[auto_scaler_0 @ 000002318bbe55c0] w:1894 h:302 fmt:rgba sar:11811/11811 -> w:1894 h:302 fmt:nv12 sar:1/1 flags:0x2
[overlay_cuda @ 0000023182798140] cu->cuModuleLoadData(&ctx->cu_module, vf_overlay_cuda_ptx) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid
[Parsed_overlay_cuda_2 @ 0000023182431d40] Failed to configure output pad on Parsed_overlay_cuda_2
Error reinitializing filters!
Failed to inject frame into filter network: Generic error in an external library
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0000023182437840] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 00000231eee87b80] Statistics: 409657 bytes read, 2 seeks
[AVIOContext @ 000002318248e700] Statistics: 67602 bytes read, 0 seeks
Conversion failed!
--- Original message ---
From: "Dennis Mungai" <dmngaie at gmail.com>
Date: 1 April 2020, 16:51:16
On Wed, 1 Apr 2020 at 16:43, Alex <3.14pi at ukr.net> wrote:
> Hi! Is it working? I have tried everything but constantly get an error from
> overlay_cuda:
>
>
> ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid
> -c:v h264_cuvid -resize 1920x1080 -i 720p.mp4 -i watermark.png
> -filter_complex
> "[1:v]format=nv12,hwupload[img];[0:v][img]overlay_cuda=x=50:y=800[out]"
> -map [out] -c:v h264_nvenc -b:v 6M -an -preset fast -y
> out_nvenc_overlay.mp4
> ...
> ffmpeg version git-2020-04-01-afa5e38
> ...
> [h264_cuvid @ 000001dd1b356d00] CUVID capabilities for h264_cuvid:
> [h264_cuvid @ 000001dd1b356d00] 8 bit: supported: 1, min_width: 48,
> max_width: 4096, min_height: 16, max_height: 4096
> [h264_cuvid @ 000001dd1b356d00] 10 bit: supported: 0, min_width: 0,
> max_width: 0, min_height: 0, max_height: 0
> [h264_cuvid @ 000001dd1b356d00] 12 bit: supported: 0, min_width: 0,
> max_width: 0, min_height: 0, max_height: 0
> Stream mapping:
> Stream #0:0 (h264_cuvid) -> overlay_cuda:main
> Stream #1:0 (png) -> format
> overlay_cuda -> Stream #0:0 (h264_nvenc)
> Press [q] to stop, [?] for help
> [h264_cuvid @ 000001dd1b356d00] Formats: Original: cuda | HW: cuda | SW:
> nv12
> [graph 0 input from stream 1:0 @ 000001dd2e84a100] w:1894 h:302
> pixfmt:rgba tb:1/25 fr:25/1 sar:11811/11811
> [graph 0 input from stream 0:0 @ 000001dd2e84ae00] w:1920 h:1080
> pixfmt:cuda tb:1/24000 fr:24000/1001 sar:1/1
> [auto_scaler_0 @ 000001dd2ebf4cc0] w:iw h:ih flags:'bilinear' interl:0
> [Parsed_format_0 @ 000001dd2e849780] auto-inserting filter 'auto_scaler_0'
> between the filter 'graph 0 input from stream 1:0' and the filter
> 'Parsed_format_0'
> [auto_scaler_0 @ 000001dd2ebf4cc0] w:1894 h:302 fmt:rgba sar:11811/11811
> -> w:1894 h:302 fmt:nv12 sar:1/1 flags:0x2
> [overlay_cuda @ 000001dd2ebc87c0] cu->cuModuleLoadData(&ctx->cu_module,
> vf_overlay_cuda_ptx) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel
> image is invalid
> [Parsed_overlay_cuda_2 @ 000001dd2e84b6c0] Failed to configure output pad
> on Parsed_overlay_cuda_2
> Error reinitializing filters!
> Failed to inject frame into filter network: Generic error in an external
> library
> Error while processing the decoded data for stream #0:0
> ...
>
>
>
> --- Original message ---
> From: "Yaroslav Pogrebnyak" <yyyaroslav at gmail.com>
> Date: 18 March 2020, 09:29:15
>
> This patch adds the 'vf_overlay_cuda' filter.
> It draws one picture on top of another on a CUDA GPU.
> For the end-user, it's similar to 'vf_overlay_opencl' and other overlay
> filters.
>
> This filter would be especially useful for building video processing
> pipelines that execute fully on the CUDA GPU. For example, the following
> pipeline would be possible: decode -> scale -> overlay -> encode, without
> copying frames between CPU and GPU in between.
>
> Technical details.
>
> Supported sw input formats are NV12 and YUV420P for main input, and NV12,
> YUV420P and YUVA420P for overlay input.
> Main and overlay sw formats should match (i.e., overlaying YUVA420P on NV12
> is not implemented).
> All pixel format conversions must be done with the 'format' or
> 'scale_npp' filters before 'overlay_cuda'.
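
The format rule quoted above can be sketched as a small check in C. This is only an illustration, not the code from the patch: the enum and the function name are hypothetical stand-ins for the AV_PIX_FMT_* values, and it assumes that YUVA420P counts as a match for YUV420P, as the alpha example further down in the message suggests.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the AV_PIX_FMT_* values named in the message. */
enum sw_fmt { FMT_NV12, FMT_YUV420P, FMT_YUVA420P, FMT_OTHER };

/* Returns true when the (main, overlay) software formats form a pair that
 * the message describes as supported. */
static bool overlay_pair_supported(enum sw_fmt main_fmt, enum sw_fmt overlay_fmt)
{
    switch (main_fmt) {
    case FMT_NV12:
        /* NV12 main input only accepts an NV12 overlay. */
        return overlay_fmt == FMT_NV12;
    case FMT_YUV420P:
        /* Assumption: YUVA420P is treated as matching YUV420P. */
        return overlay_fmt == FMT_YUV420P || overlay_fmt == FMT_YUVA420P;
    default:
        /* Any other main format is not listed as supported. */
        return false;
    }
}
```

Under these assumptions, overlay_pair_supported(FMT_NV12, FMT_YUVA420P) is false, which corresponds to the "not implemented" case mentioned above.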
>
> It was necessary to slightly modify 'hwcontext_cuda.c' to allow overlays with
> an alpha channel:
> - Allow AV_PIX_FMT_YUVA420P to enable hwuploading frames with an alpha
> channel to the GPU.
> - Do not shift the height of the 4th plane (alpha) when uploading to the GPU.
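
The second point above can be illustrated with a short C sketch of the per-plane height computation. The function name and signature are hypothetical and are not taken from the patch; the sketch assumes planes are numbered 0 (luma), 1 and 2 (chroma), 3 (alpha), and that the frame height is even.

```c
/* Height in rows of plane `plane` for a frame that is `height` rows tall.
 * `log2_chroma_h` is the vertical chroma subsampling shift (1 for 4:2:0).
 * Hypothetical helper, shown only to illustrate the rule described above. */
static int plane_height(int plane, int height, int log2_chroma_h)
{
    /* Luma (plane 0) and alpha (plane 3) are stored at full resolution,
     * so their height must not be shifted. */
    if (plane == 0 || plane == 3)
        return height;
    /* Chroma planes (1 and 2) are vertically subsampled. */
    return height >> log2_chroma_h;
}
```

For a 1894x302 YUVA420P overlay, this gives 302 rows for planes 0 and 3 and 151 rows for planes 1 and 2. Shifting the alpha plane as well would copy only the top half of the alpha data.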
>
> Examples.
>
> - Overlay picture on top of video (main: YUVJ420P->NV12, overlay: NV12)
> $ ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel
> cuvid \
> -c:v h264_cuvid -i main.mp4 \
> -i ~/overlay.jpg \
> -filter_complex "[1:v]format=nv12, hwupload[overlay],
> [0:v][overlay]overlay_cuda=x=0:y=0:shortest=false" \
> -an -c:v h264_nvenc -b:v 5M output.mp4
>
> - Overlay one video on top of another (main: NV12, overlay: NV12)
> $ ffmpeg -y \
> -hwaccel cuvid -c:v h264_cuvid -i main.mp4 \
> -hwaccel cuvid -c:v h264_cuvid -i overlay.mp4 \
> -filter_complex "[1:v]scale_npp=512:-1[o],
> [v:0][o]overlay_cuda=x=100:y=100:shortest=true" \
> -an -c:v h264_nvenc -b:v 5M output.mp4
>
> - Overlay picture with alpha channel on top of video (main: NV12->YUV420P,
> overlay: RGBA->YUVA420P)
> $ ffmpeg -y \
> -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid \
> -c:v h264_cuvid -i ~/main.mp4 \
> -i ~/overlay.png \
> -filter_complex "[1:v]format=yuva420p, hwupload[o],
> [v:0]scale_npp=format=yuv420p[m],
> [m][o]overlay_cuda=x=0:y=0:shortest=false" \
> -an -c:v h264_nvenc -b:v 5M output.mp4
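
The message does not spell out the blend arithmetic used for the alpha case. As a rough CPU-side reference, the conventional "over" operation for one 8-bit sample could look like the C sketch below. The function name is hypothetical, and the actual CUDA kernel in the patch may round or scale differently.

```c
#include <stdint.h>

/* Conventional "over" compositing for a single 8-bit sample.
 * alpha = 255 keeps only the overlay sample, alpha = 0 keeps only the
 * main sample. Hypothetical reference helper, not code from the patch. */
static uint8_t blend_over(uint8_t main_px, uint8_t overlay_px, uint8_t alpha)
{
    /* Weighted sum of both samples; +127 rounds to nearest before the
     * integer division by 255. */
    unsigned sum = (unsigned)overlay_px * alpha
                 + (unsigned)main_px * (255u - alpha)
                 + 127u;
    return (uint8_t)(sum / 255u);
}
```

A GPU kernel would apply the same per-sample operation to each plane, with one thread per pixel, reading the alpha value from the fourth plane of the overlay.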
>
> Patch attached.
>
> P.S. This is my first patch, I would be grateful for any feedback to know
> if I'm doing things correctly or not.
> Thanks!
>
>
> Signed-off-by: Yaroslav Pogrebnyak <yyyaroslav at gmail.com>
> ---
> configure | 2 +
> libavfilter/Makefile | 1 +
> libavfilter/allfilters.c | 1 +
> libavfilter/vf_overlay_cuda.c | 451 +++++++++++++++++++++++++++++++++
> libavfilter/vf_overlay_cuda.cu | 54 ++++
> libavutil/hwcontext_cuda.c | 3 +-
> 6 files changed, 511 insertions(+), 1 deletion(-)
> create mode 100644 libavfilter/vf_overlay_cuda.c
> create mode 100644 libavfilter/vf_overlay_cuda.cu
>
>
>
>
How does the NVDEC path work out?
Try this:
ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuda
-hwaccel_output_format cuda -i 720p.mp4 -i watermark.png -filter_complex
"[1:v]format=nv12,hwupload[img];[0:v][img]overlay_cuda=x=50:y=800[out]"
-map [out] -c:v h264_nvenc -b:v 6M -an -preset fast -y
out_nvenc_overlay.mp4