[FFmpeg-devel] [PATCH] lavu: add AVVideoHint API
Stefano Sabatini
stefasab at gmail.com
Mon Jul 24 02:27:27 EEST 2023
On date Tuesday 2023-07-18 00:19:40 +0200, Stefano Sabatini wrote:
> On date Monday 2023-07-10 12:51:25 +0000, Carotti, Elias wrote:
> > On Mon, 2023-07-10 at 08:13 +0000, Carotti, Elias wrote:
> > >
> >
> > > AVVideoHint is a bad name for something like this.
> > > Could you borrow some wording from graphics and call it
> > > AVVideoDamagedHint or maybe AVVideoChangedAreaHint or a combination
> > > of both?
> > > I'd prefer the former, damage is standard language in graphics
> > > circles about what has changed since the last frame.
> > >
> > > Hi,
> > > I have no strong objections on this. Personally I also like the
> > > AVVideoDamagedHint name best, my only concern is that it is strictly
> > > related to the current use/implementation
> > > (it's true right now that's the only kind of hint) while it may turn
> > > out to be a bad naming decision should other forms of hinting for the
> > > encoder be added in the future.
> > > That said, I am fine with the change too.
>
> AVVideoDamagedHint would be too specific (what if we want to add
> "hinting" for different things?).
>
> > > Elias
> > >
> >
> > I added a type to the AVVideoRect struct. This should solve the naming
> > issue above while preserving the possibility to extend this to
> > different hinting types.
> > These are the only changes to Anton's version.
> > Best
> > Elias
> >
> >
> >
> >
> >
> >
> >
> > NICE SRL, viale Monte Grappa 3/5, 20124 Milano, Italia, Registro delle Imprese di Milano Monza Brianza Lodi REA n. 2096882, Capitale Sociale: 10.329,14 EUR i.v., Cod. Fisc. e P.IVA 01133050052, Societa con Socio Unico
> >
> >
>
> > From 8ef4f97410a6b78df048b71d9921a763da6255b3 Mon Sep 17 00:00:00 2001
> > From: Elias Carotti <eliascrt _at_ amazon _dot_ it>
> > Date: Mon, 10 Jul 2023 14:34:53 +0200
> > Subject: [PATCH] Add side data type to provide hint to the video encoders
> > about unchanged portions of each frame.
> >
> > Signed-off-by: Anton Khirnov <anton at khirnov.net>
> > ---
> > doc/APIchanges | 3 ++
> > libavutil/Makefile | 2 +
> > libavutil/frame.h | 10 ++++
> > libavutil/version.h | 2 +-
> > libavutil/video_hint.c | 82 +++++++++++++++++++++++++++++
> > libavutil/video_hint.h | 117 +++++++++++++++++++++++++++++++++++++++++
> > 6 files changed, 215 insertions(+), 1 deletion(-)
> > create mode 100644 libavutil/video_hint.c
> > create mode 100644 libavutil/video_hint.h
> >
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index 27f835cfce..0cda51fdee 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -2,6 +2,9 @@ The last version increases of all libraries were on 2023-02-09
> >
> > API changes, most recent first:
> >
> > +2023-07-xx - xxxxxxxxxx - lavu 58.15.100 - video_hint.h
> > + Add AVVideoHint API.
> > +
> > 2023-07-05 - xxxxxxxxxx - lavu 58.14.100 - random_seed.h
> > Add av_random_bytes()
> >
> > diff --git a/libavutil/Makefile b/libavutil/Makefile
> > index bd9c6f9e32..7828c94dc5 100644
> > --- a/libavutil/Makefile
> > +++ b/libavutil/Makefile
> > @@ -91,6 +91,7 @@ HEADERS = adler32.h \
> > tea.h \
> > tx.h \
> > film_grain_params.h \
> > + video_hint.h
> >
> > ARCH_HEADERS = bswap.h \
> > intmath.h \
> > @@ -181,6 +182,7 @@ OBJS = adler32.o \
> > uuid.o \
> > version.o \
> > video_enc_params.o \
> > + video_hint.o \
> > film_grain_params.o \
> >
> >
> > diff --git a/libavutil/frame.h b/libavutil/frame.h
> > index a491315f25..c0c1b23db7 100644
> > --- a/libavutil/frame.h
> > +++ b/libavutil/frame.h
> > @@ -214,6 +214,16 @@ enum AVFrameSideDataType {
> > * Ambient viewing environment metadata, as defined by H.274.
> > */
> > AV_FRAME_DATA_AMBIENT_VIEWING_ENVIRONMENT,
> > +
> > + /**
> > + * Provide encoder-specific hinting information about changed/unchanged
> > + * portions of a frame. It can be used to pass information about which
> > + * macroblocks can be skipped because they didn't change from the
> > + * corresponding ones in the previous frame. This could be useful for
> > + * applications which know this information in advance to speed up
> > + * encoding.
> > + */
> > + AV_FRAME_DATA_VIDEO_HINT,
> > };
> >
> > enum AVActiveFormatDescription {
> > diff --git a/libavutil/version.h b/libavutil/version.h
> > index 24af520e08..9e798b0e3f 100644
> > --- a/libavutil/version.h
> > +++ b/libavutil/version.h
> > @@ -79,7 +79,7 @@
> > */
> >
> > #define LIBAVUTIL_VERSION_MAJOR 58
> > -#define LIBAVUTIL_VERSION_MINOR 14
> > +#define LIBAVUTIL_VERSION_MINOR 15
> > #define LIBAVUTIL_VERSION_MICRO 100
> >
> > #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
> > diff --git a/libavutil/video_hint.c b/libavutil/video_hint.c
> > new file mode 100644
> > index 0000000000..5730ed6cfb
> > --- /dev/null
> > +++ b/libavutil/video_hint.c
> > @@ -0,0 +1,82 @@
> > +/*
> > + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#include <string.h>
> > +
> > +#include "avstring.h"
> > +#include "frame.h"
> > +#include "macros.h"
> > +#include "mem.h"
> > +#include "video_hint.h"
> > +
> > +AVVideoHint *av_video_hint_alloc(size_t nb_rects,
> > + size_t *out_size)
> > +{
> > + struct TestStruct {
> > + AVVideoHint hint;
> > + AVVideoRect rect;
> > + };
> > + const size_t rect_offset = offsetof(struct TestStruct, rect);
> > + size_t size = rect_offset;
> > + AVVideoHint *hint;
> > +
> > + *out_size = 0;
> > + if (nb_rects > (SIZE_MAX - size) / sizeof(AVVideoRect))
> > + return NULL;
> > + size += sizeof(AVVideoRect) * nb_rects;
> > +
> > + hint = av_mallocz(size);
> > + if (!hint)
> > + return NULL;
> > +
> > + hint->nb_rects = nb_rects;
> > + hint->rect_offset = rect_offset;
> > + hint->rect_size = sizeof(AVVideoRect);
> > +
> > + *out_size = size;
> > +
> > + return hint;
> > +}
> > +
> > +AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
> > + size_t nb_rects)
> > +{
> > + AVVideoHint *hint;
> > + AVBufferRef *buf;
> > + size_t size = 0;
> > +
> > + hint = av_video_hint_alloc(nb_rects, &size);
> > + if (!hint)
> > + return NULL;
> > +
> > + buf = av_buffer_create((uint8_t *)hint, size, NULL, NULL, 0);
> > + if (!buf) {
> > + av_freep(&hint);
> > + return NULL;
> > + }
> > +
> > + if (!av_frame_new_side_data_from_buf(frame, AV_FRAME_DATA_VIDEO_HINT, buf)) {
> > + av_buffer_unref(&buf);
> > + return NULL;
> > + }
> > +
> > + return hint;
> > +}
> > +
> > diff --git a/libavutil/video_hint.h b/libavutil/video_hint.h
> > new file mode 100644
> > index 0000000000..14bfe20aa6
> > --- /dev/null
> > +++ b/libavutil/video_hint.h
> > @@ -0,0 +1,117 @@
> > +/**
> > + * Copyright 2023 Elias Carotti <eliascrt at amazon dot it>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#ifndef AVUTIL_VIDEO_HINT_H
> > +#define AVUTIL_VIDEO_HINT_H
> > +
> > +#include <stddef.h>
> > +#include <stdint.h>
> > +#include "libavutil/avassert.h"
> > +#include "libavutil/frame.h"
> > +
>
> > +/**
> > + * The type of the hinting information provided by an AVVideoRect.
> > + * Currently only damage information, i.e., changed/unchanged portions of the frame.
> > + **/
>
> > +typedef enum AVVideoRectType {
> > + AV_VIDEO_RECT_DAMAGE,
>
> AV_VIDEO_RECT_TYPE_DAMAGE for consistency
>
> > +} AVVideoRectType;
>
> > +
> > +typedef struct AVVideoRect {
> > + uint32_t x, y;
> > + uint32_t width, height;
> > + AVVideoRectType type;
> > +} AVVideoRect;
> > +
> > +typedef enum AVVideoHintType {
> > + /* rectangled delimit the constant areas (unchanged), default is changed */
>
> rectangles?
>
> > + AV_VIDEO_HINT_CONSTANT,
>
> AV_VIDEO_HINT_TYPE_...
>
> > +
> > + /* rectangled delimit the constant areas (changed), default is not changed */
>
> rectangles?
>
> > + AV_VIDEO_HINT_CHANGED,
> > +} AVVideoHintType;
> >
> > +typedef struct AVVideoHint {
> > + /**
> > + * Number of AVVideoRect present.
> > + *
> > + * May be 0, in which case no per-rectangle information is present. In this
> > + * case the values of rect_offset / rect_size are unspecified and should
> > + * not be accessed.
> > + */
> > + size_t nb_rects;
> > +
> > + /**
> > + * Offset in bytes from the beginning of this structure at which the array
> > + * of AVVideoRect starts.
> > + */
> > + size_t rect_offset;
> > +
> > + /**
> > + * Size in bytes of AVVideoRect.
> > + */
> > + size_t rect_size;
> > +
>
> > + AVVideoHintType type;
>
> sorry to nitpick, but if we have this information in the single rect
> should not we use that instead?
>
> Basically I see two options:
> 1. generic VideoHint, storing single hint type and rects
>
> for example we might have type=changed, and the rects storing only the
> changed rects
>
> If we want to extend this then we need to add a new type. The problem
> might be if we want to store more than one VideoHint in the side data
> (it it possible to store more than one AVVideoHint, each one with a
> different type, as side data?). Suppose for example we want to add a
> new hint type, e.g. ROI, then we can add a new AVHideoHint with
> type=roi - but I don't know if we can have several frame hint side
> data (each one with different individual types).
>
> 2. generic VideoHint, storing rects each one containing the hint type
>
> for example we might have a list of rects, each one with
> type=changed|constant|roi. This would allow one to store different
> kind of hints in the same AVVideoHint structure, but at the same time
> would enable inconsistent data (for example we might have changed and
> unchanged rects in the same list of rects)
>
> In each case, storing the type in the generic AVVideHint and in each
> rect might lead to inconsistent states (e.g. you might have generic
> type=damage, and have type=roi and type=changed in the rects), in this
> case having the type in both the general structure and in each rects
> seems redundant and leads to possible data inconsistencies.
>
> ...
>
> I would rather go to solution 1, but I'm not sure if having more than
> one hint in the side data (with different subtypes) is possible and
> wanted. If this is the case, probably the best solution would be to
> store more than one AVVideoHint (each one with an individual type and
> list of rects) in the single side data object.
Elaborating on this.
1. Ideally we might want to extend the av_frame_get_side_data API to
be able to store more than one entry per side data, by adding a unique
label to the side data, but this requires probably more effort and
extends the scope even further.
2. Alternatively, we might extend the hint API like this:
AVVideoHint *av_video_hint_create_side_data(AVFrame *frame,
size_t nb_rects);
AVVideoHint *av_video_hint_extend_side_data(AVFrame *frame,
size_t nb_rects);
by marking the continuation on the previous hint (but this would
complicate deserialization as you need then to iterate to get all the
side data).
Or we could make a variadic function like this:
AVVideoHint *av_video_hint_create_side_data(AVFrame *frame, size_t nb_rects, ...);
Alternatively, we might store in AVVideoHint more than one hint, but
this somehow conflict with the general design.
...
As a low effort approach, I would keep the initial design of a single
hint type per side-data (dropping the possibility of having more than
one hint type in the same side data), which should be fixed
generically by implementing 1.
More information about the ffmpeg-devel
mailing list