[FFmpeg-devel] [RFC] swscale modernization proposal

Tue Jul 2 16:27:00 EEST 2024

On Sat, 22 Jun 2024 15:13:34 +0200 Niklas Haas <ffmpeg at haasn.xyz> wrote:
> Finally, avscale_* should ultimately also support hardware frames
> directly, in which case it will dispatch to some equivalent of
> scale_vulkan/vaapi/cuda or possibly even libplacebo. (But I will defer
> this to a future milestone)

How do people feel about being able to control the "backend" directly
from the <avscale.h> interface?

For example,

struct AVScaleContext {
    int backend;
    ...
};

enum AVScaleBackend {
    AV_SCALE_AUTO = 0, // select first backend that passes avscale_test_*
    AV_SCALE_NATIVE,   // use new scaling pipeline (when implemented)
    AV_SCALE_SWSCALE,  // use existing SwsContext code 1:1, to-be-deprecated
    AV_SCALE_ZIMG,
    AV_SCALE_LIBPLACEBO,
    AV_SCALE_CUDA,
    AV_SCALE_VAAPI,
    ...
};

int avscale_test_format(enum AVScaleBackend backend,
                        enum AVPixelFormat format,
                        int output);

int avscale_test_colorspace(...);
...
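
To make the AV_SCALE_AUTO behavior concrete, here is a rough sketch of how "select first backend that passes avscale_test_*" could work. Nothing here is final: the capability table is made up, DemoPixelFormat is a stand-in for enum AVPixelFormat so the snippet compiles without libavutil, and AV_SCALE_NB and demo_pick_backend() are hypothetical helpers that are not part of the proposal.

```c
#include <assert.h>

/* Stand-in for enum AVPixelFormat (assumption, keeps the sketch standalone). */
enum DemoPixelFormat { DEMO_FMT_YUV420P, DEMO_FMT_RGB24, DEMO_FMT_CUDA };

enum AVScaleBackend {
    AV_SCALE_AUTO = 0,
    AV_SCALE_NATIVE,
    AV_SCALE_SWSCALE,
    AV_SCALE_ZIMG,
    AV_SCALE_LIBPLACEBO,
    AV_SCALE_CUDA,
    AV_SCALE_VAAPI,
    AV_SCALE_NB /* hypothetical sentinel, not in the proposal */
};

/* Toy capability table: returns nonzero if the backend accepts the format.
 * The answers are invented for illustration only. */
int avscale_test_format(enum AVScaleBackend backend,
                        enum DemoPixelFormat format, int output)
{
    /* AUTO is a selection policy, not a concrete backend. */
    assert(backend != AV_SCALE_AUTO);
    (void)output; /* the toy table treats input and output alike */

    switch (backend) {
    case AV_SCALE_SWSCALE: return format != DEMO_FMT_CUDA;
    case AV_SCALE_CUDA:    return format == DEMO_FMT_CUDA;
    default:               return 0; /* everything else: not modelled here */
    }
}

/* AV_SCALE_AUTO: walk the backends in enum order and return the first one
 * that accepts both the source and the destination format, or -1. */
int demo_pick_backend(enum DemoPixelFormat src, enum DemoPixelFormat dst)
{
    for (int b = AV_SCALE_AUTO + 1; b < AV_SCALE_NB; b++) {
        if (avscale_test_format((enum AVScaleBackend)b, src, 0) &&
            avscale_test_format((enum AVScaleBackend)b, dst, 1))
            return b;
    }
    return -1;
}
```

With this toy table, a software-to-software conversion resolves to AV_SCALE_SWSCALE, a conversion between hardware frames resolves to AV_SCALE_CUDA, and a mixed hardware/software pair finds no backend at all. Whether AUTO should instead insert an upload/download step in that last case is left open.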

Or alternatively, a perhaps more extensible design:

struct AVScaleBackend {
    int (*test_format)(enum AVPixelFormat format, int output);
    int (*test_colorspace)(...);
    ...
    /* somehow expose per-backend options here? */
};
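
A minimal sketch of how this struct-based variant might be used, again under stated assumptions: the `name` field, the demo_backends[] registry, and the demo_* functions are all hypothetical, DemoPixelFormat stands in for enum AVPixelFormat, and the per-backend answers are invented. The point is only that adding a backend becomes "add one table entry" rather than "extend a public enum".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for enum AVPixelFormat (assumption, keeps the sketch standalone). */
enum DemoPixelFormat { DEMO_FMT_YUV420P, DEMO_FMT_RGB24, DEMO_FMT_CUDA };

struct AVScaleBackend {
    const char *name; /* hypothetical field, not in the proposal */
    int (*test_format)(enum DemoPixelFormat format, int output);
};

static int sw_test_format(enum DemoPixelFormat format, int output)
{
    (void)output;
    return format != DEMO_FMT_CUDA; /* toy rule: software formats only */
}

static int cuda_test_format(enum DemoPixelFormat format, int output)
{
    (void)output;
    return format == DEMO_FMT_CUDA; /* toy rule: hardware frames only */
}

/* Hypothetical registry; the order doubles as the auto-selection priority. */
static const struct AVScaleBackend demo_backends[] = {
    { "swscale", sw_test_format   },
    { "cuda",    cuda_test_format },
};
#define DEMO_NB_BACKENDS (sizeof(demo_backends) / sizeof(demo_backends[0]))

/* Explicit selection: look a backend up by name, NULL if unknown. */
const struct AVScaleBackend *demo_backend_by_name(const char *name)
{
    for (size_t i = 0; i < DEMO_NB_BACKENDS; i++)
        if (!strcmp(demo_backends[i].name, name))
            return &demo_backends[i];
    return NULL;
}

/* Automatic selection: first backend accepting both formats, else NULL. */
const struct AVScaleBackend *demo_find_backend(enum DemoPixelFormat src,
                                               enum DemoPixelFormat dst)
{
    for (size_t i = 0; i < DEMO_NB_BACKENDS; i++) {
        const struct AVScaleBackend *b = &demo_backends[i];
        assert(b->test_format != NULL);
        if (b->test_format(src, 0) && b->test_format(dst, 1))
            return b;
    }
    return NULL;
}
```

A caller who wants a specific backend would use something like demo_backend_by_name("cuda"), and everyone else would go through demo_find_backend().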

My overarching goal here is to be able to unify vf_scale_*,
vf_libplacebo, vf_zscale etc. into a single common API, and in doing so,
effectively also have a suitable vf_scale_* variant auto-inserted into
filter chains.

I'm also thinking about supporting per-backend options in this case,
since some backends (notably libplacebo) expose about five billion
tunable parameters that go beyond the basic "quality preset, scaling
mode, dither mode" that I plan on exposing at the top level.

Open questions:

1. Is this a good idea, or too confusing / complex to be worth the gain?
   Specifically, I am worried about confusion arising due to differences
   in behavior, and implemented options, between all of the above.

   That said, I think there is a big win to be had from unifying all of
   the different scaling and/or conversion filters we have in e.g.
   libavfilter, as well as making it trivial for users of this API to
   try using e.g. GPU scaling instead of CPU scaling.

2. How should we expose per-backend options? Probably the safest bet is
   to wrap the relevant options inside AVOptions, as vf_zscale
   / vf_libplacebo / vf_scale etc. currently do.

   Alternatively, it could be nice if we could give users direct access
   to the underlying API configuration structs (pl_render_params,
   zimg_graph_builder_params, SwsContext, etc.).
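
To illustrate the first option, here is a very rough sketch of string-keyed per-backend options stored in a table of name, offset and range. This is a standalone mock of the general idea and deliberately does not use the real AVOptions API. DemoBackendOpts, DemoOption, demo_opt_set() and the option names "dither" and "taps" are all made up and do not correspond to any real backend parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private option struct of some backend. */
struct DemoBackendOpts {
    int dither;
    int taps;
};

/* Hypothetical option descriptor: name, field offset, allowed range. */
struct DemoOption {
    const char *name;
    size_t offset;
    int min, max;
};

static const struct DemoOption demo_options[] = {
    { "dither", offsetof(struct DemoBackendOpts, dither), 0, 3  },
    { "taps",   offsetof(struct DemoBackendOpts, taps),   1, 64 },
};

/* Set an integer option by name.
 * Returns 0 on success, -1 for an unknown option, -2 for a bad value. */
int demo_opt_set(struct DemoBackendOpts *obj, const char *name, const char *val)
{
    assert(obj && name && val);

    for (size_t i = 0; i < sizeof(demo_options) / sizeof(demo_options[0]); i++) {
        const struct DemoOption *o = &demo_options[i];
        char *end;
        long v;

        if (strcmp(o->name, name))
            continue;

        v = strtol(val, &end, 10);
        if (end == val || *end != '\0' || v < o->min || v > o->max)
            return -2; /* not a number, trailing junk, or out of range */

        *(int *)((char *)obj + o->offset) = (int)v;
        return 0;
    }
    return -1;
}
```

The upside of this style is that the top-level API never has to know the backend's struct layout, so unknown or out-of-range values can be rejected uniformly. The downside, compared to exposing the native config structs, is that every tunable has to be wrapped by hand.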