[FFmpeg-devel] [PATCH] avfilter: add signalstats filter
Stefano Sabatini
stefasab at gmail.com
Wed Jun 4 10:36:42 CEST 2014
On date Monday 2014-06-02 23:50:27 +0200, Clément Bœsch encoded:
> Signed-off-by: Mark Heath <mjpeg0 at silicontrip.net>
> Signed-off-by: Dave Rice <dericed at yahoo.com>
> Signed-off-by: Clément Bœsch <u at pkh.me>
> ---
> TODO: bump lavfi minor
> ---
> Changelog | 1 +
> doc/filters.texi | 168 +++++++++++++++
> libavfilter/Makefile | 1 +
> libavfilter/allfilters.c | 1 +
> libavfilter/vf_signalstats.c | 479 +++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 650 insertions(+)
> create mode 100644 libavfilter/vf_signalstats.c
>
> diff --git a/Changelog b/Changelog
> index 3d416c4..9c366ff 100644
> --- a/Changelog
> +++ b/Changelog
> @@ -26,6 +26,7 @@ version <next>:
> - native Opus decoder
> - display matrix export and rotation api
> - WebVTT encoder
> +- signalstats filter
>
>
> version 2.2:
> diff --git a/doc/filters.texi b/doc/filters.texi
> index e004c44..d30827a 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -7532,6 +7532,174 @@ Swap the second and third planes of the input:
> ffmpeg -i INPUT -vf shuffleplanes=0:2:1:3 OUTPUT
> @end example
>
> +@section signalstats
> +Evaluate various visual metrics that assist in determining issues associated
> +with the digitization of analog video media.
> +
> +By default the filter will log these metadata values:
> +
> +@table @option
> +@item YMIN
> +Display the minimal Y value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YLOW
> +Display the Y value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YAVG
> +Display the average Y value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item YHIGH
> +Display the Y value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YMAX
> +Display the maximum Y value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UMIN
> +Display the minimal U value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item ULOW
> +Display the U value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UAVG
> +Display the average U value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item UHIGH
> +Display the U value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UMAX
> +Display the maximum U value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VMIN
> +Display the minimal V value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VLOW
> +Display the V value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VAVG
> +Display the average V value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item VHIGH
> +Display the V value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VMAX
> +Display the maximum V value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item SATMIN
> +Display the minimal saturation value contained within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATLOW
> +Display the saturation value at the 10% percentile within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATAVG
> +Display the average saturation value within the input frame. Expressed in range
> +of [0-~181.02].
> +
> +@item SATHIGH
> +Display the saturation value at the 90% percentile within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATMAX
> +Display the maximum saturation value contained within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item HUEMED
> +Display the median value for hue within the input frame. Expressed in range of
> +[0-360].
> +
> +@item HUEAVG
> +Display the average value for hue within the input frame. Expressed in range of
> +[0-360].
> +
> +@item YDIF
> +Display a quantification of the visual change on the Y plane between the input
> +frame and the previous input frame.
> +
> +@item UDIF
> +Display a quantification of the visual change on the U plane between the input
> +frame and the previous input frame.
> +
> +@item VDIF
> +Display a quantification of the visual change on the V plane between the input
> +frame and the previous input frame.
> +@end table
"A quantification" is a bit vague: the documentation should state what is
actually measured.
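For reference, the dify/difu/difv accumulation in filter_frame() further down
amounts to a sum of absolute differences between co-located samples of the
current and the previous plane. A minimal sketch, assuming 8-bit planes;
plane_sad() is a hypothetical helper and not part of the patch, and whatever
normalization the elided part of the patch applies is not shown here:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not in the patch): sum of absolute differences
 * between two 8-bit planes of w x h samples, each with its own linesize. */
static unsigned plane_sad(const uint8_t *cur, int cur_linesize,
                          const uint8_t *prev, int prev_linesize,
                          int w, int h)
{
    unsigned sad = 0;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sad += abs(cur[x] - prev[x]);
        /* advance both planes by one row */
        cur  += cur_linesize;
        prev += prev_linesize;
    }
    return sad;
}
```

If this is indeed what is exported, the documentation could say "sum of
absolute differences" (or the per-pixel average, if it is normalized).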
> +
> +The filter accepts the following options:
> +
> +@table @option
> +@item stat
> +Specify an additional form of image analysis. It accepts the following values:
> +
Compression nit: the two identical value tables could be merged, e.g.:
@item stat
@item out
@option{stat} specifies an additional form of image analysis.
@option{out} outputs video with the specified type of pixel highlighted.
Both options accept the following values:
> +@table @samp
> +@item tout
> +Identify @var{temporal outliers} pixels. A @var{temporal outlier} is a pixel
> +unlike the neighboring pixels of the same field. Examples of temporal outliers
> +include the results of video dropouts, head clogs, or tape tracking issues.
> +
> +@item vrep
> +Identify @var{vertical line repetition}. Vertical line repetition includes
> +similar rows of pixels within a frame. In born-digital video vertical line
> +repetition is common, but this pattern is uncommon in video digitized from an
> +analog source. When it occurs in video that results from the digitization of an
> +analog source it can indicate concealment from a dropout compensator.
> +
> +@item brng
> +Identify pixels that fall outside of legal broadcast range.
> +@end table
> +
> +@item out
> +Output video with the specified type of pixel highlighted. It accepts the
> +following values:
> +
> +@table @samp
> +@item tout
> +@item vrep
> +@item brng
> +@end table
> +
> +@item color, c
> +Set the highlight color for the @option{out} option. The default color is
> +yellow.
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Output data of various video metrics:
> +@example
> +ffprobe -f lavfi movie=example.mov,signalstats="stat=tout+vrep+rang" -show_frames
What is "rang"? It is not among the documented values (tout, vrep, brng).
> +@end example
> +
> +@item
> +Output specific data about the minimum and maximum values of the Y plane per frame:
> +@example
> +ffprobe -f lavfi movie=example.mov,signalstats -show_entries frame_tags=lavfi.values.YMAX,lavfi.values.YMIN
Do we have a namespace resolution scheme? What about lavfi.signalstats.VAL?
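To illustrate the naming I have in mind, a minimal sketch that builds a
per-filter key of the form lavfi.FILTER.NAME. make_stat_key() is a
hypothetical helper, not an existing libavfilter function, and the scheme
itself is only a proposal:

```c
#include <stdio.h>

/* Hypothetical helper: write "lavfi.<filter>.<name>" into buf.
 * Returns 0 on success, -1 if the key does not fit in size bytes. */
static int make_stat_key(char *buf, size_t size,
                         const char *filter, const char *name)
{
    int n = snprintf(buf, size, "lavfi.%s.%s", filter, name);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With filter = "signalstats" and name = "YMAX" this yields
lavfi.signalstats.YMAX, which cannot clash with tags set by other filters.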
> +@end example
> +
> +@item
> +Playback video while highlighting pixels that are outside of broadcast range in red.
> +@example
> +ffplay example.mov -vf values="out=brng:color=red"
> +@end example
> +@end itemize
> +
> @anchor{smartblur}
> @section smartblur
>
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index f981dfa..142c06e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -174,6 +174,7 @@ OBJS-$(CONFIG_SETSAR_FILTER) += vf_aspect.o
> OBJS-$(CONFIG_SETTB_FILTER) += settb.o
> OBJS-$(CONFIG_SHOWINFO_FILTER) += vf_showinfo.o
> OBJS-$(CONFIG_SHUFFLEPLANES_FILTER) += vf_shuffleplanes.o
> +OBJS-$(CONFIG_SIGNALSTATS_FILTER) += vf_signalstats.o
> OBJS-$(CONFIG_SMARTBLUR_FILTER) += vf_smartblur.o
> OBJS-$(CONFIG_SPLIT_FILTER) += split.o
> OBJS-$(CONFIG_SPP_FILTER) += vf_spp.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 22d643d..4b9db9e 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -191,6 +191,7 @@ void avfilter_register_all(void)
> REGISTER_FILTER(SETTB, settb, vf);
> REGISTER_FILTER(SHOWINFO, showinfo, vf);
> REGISTER_FILTER(SHUFFLEPLANES, shuffleplanes, vf);
> + REGISTER_FILTER(SIGNALSTATS, signalstats, vf);
> REGISTER_FILTER(SMARTBLUR, smartblur, vf);
> REGISTER_FILTER(SPLIT, split, vf);
> REGISTER_FILTER(SPP, spp, vf);
> diff --git a/libavfilter/vf_signalstats.c b/libavfilter/vf_signalstats.c
> new file mode 100644
> index 0000000..e06bfe1
> --- /dev/null
> +++ b/libavfilter/vf_signalstats.c
> @@ -0,0 +1,479 @@
> +/*
> + * Copyright (c) 2010 Mark Heath mjpeg0 @ silicontrip dot org
> + * Copyright (c) 2014 Clément Bœsch
> + * Copyright (c) 2014 Dave Rice @dericed
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/opt.h"
> +#include "libavutil/pixdesc.h"
> +#include "libavcodec/mathops.h"
> +#include "internal.h"
> +
> +enum FilterMode {
> + FILTER_NONE = -1,
> + FILTER_TOUT,
> + FILTER_VREP,
> + FILTER_BRNG,
> + FILT_NUMB
> +};
> +
> +typedef struct {
> + const AVClass *class;
> + int chromah;
> + int chromaw;
> + int hsub;
> + int vsub;
> + int fs;
> + int cfs;
nit: add comments describing these fields (fs and cfs in particular are not
self-explanatory).
> + enum FilterMode outfilter;
> + int filters;
> + AVFrame *frame_prev;
> + char *vrep_line;
> + uint8_t rgba_color[4];
> + int yuv_color[3];
> +} SignalstatsContext;
> +
> +#define OFFSET(x) offsetof(SignalstatsContext, x)
> +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
> +
> +static const AVOption signalstats_options[] = {
> + {"stat", "set statistics filters", OFFSET(filters), AV_OPT_TYPE_FLAGS, {.i64=0}, 0, INT_MAX, FLAGS, "filters"},
> + {"tout", "analyze pixels for temporal outliers", 0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_TOUT}, 0, 0, FLAGS, "filters"},
> + {"vrep", "analyze video lines for vertical line repitition", 0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_VREP}, 0, 0, FLAGS, "filters"},
> + {"brng", "analyze for pixels outside of broadcast range", 0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_BRNG}, 0, 0, FLAGS, "filters"},
> + {"out", "set video filter", OFFSET(outfilter), AV_OPT_TYPE_INT, {.i64=FILTER_NONE}, -1, FILT_NUMB-1, FLAGS, "out"},
> + {"tout", "highlight pixels that depict temporal outliers", 0, AV_OPT_TYPE_CONST, {.i64=FILTER_TOUT}, 0, 0, FLAGS, "out"},
> + {"vrep", "highlight video lines that depict vertical line repitition", 0, AV_OPT_TYPE_CONST, {.i64=FILTER_VREP}, 0, 0, FLAGS, "out"},
> + {"brng", "highlight pixels that are outside of broadcast range", 0, AV_OPT_TYPE_CONST, {.i64=FILTER_BRNG}, 0, 0, FLAGS, "out"},
> + {"c", "set highlight color", OFFSET(rgba_color), AV_OPT_TYPE_COLOR, {.str="yellow"}, .flags=FLAGS},
> + {"color", "set highlight color", OFFSET(rgba_color), AV_OPT_TYPE_COLOR, {.str="yellow"}, .flags=FLAGS},
> + {NULL}
> +};
> +
> +AVFILTER_DEFINE_CLASS(signalstats);
> +
> +static av_cold int init(AVFilterContext *ctx)
> +{
> + uint8_t r, g, b;
> + SignalstatsContext *s = ctx->priv;
> +
> + if (s->outfilter != FILTER_NONE)
> + s->filters |= 1 << s->outfilter;
> +
> + r = s->rgba_color[0];
> + g = s->rgba_color[1];
> + b = s->rgba_color[2];
> + s->yuv_color[0] = (( 66*r + 129*g + 25*b + (1<<7)) >> 8) + 16;
> + s->yuv_color[1] = ((-38*r + -74*g + 112*b + (1<<7)) >> 8) + 128;
> + s->yuv_color[2] = ((112*r + -94*g + -18*b + (1<<7)) >> 8) + 128;
Unrelated note: having an API for this transform would be useful in
several places.
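A minimal sketch of what such a helper could look like, reusing the integer
coefficients from init() above (the usual BT.601 limited-range
approximation). The name rgb_to_yuv601() and its signature are hypothetical;
no such function exists in the tree as far as I know:

```c
#include <stdint.h>

/* Hypothetical helper (not an existing FFmpeg API): convert full-range
 * 8-bit RGB to limited-range YUV with the same integer coefficients as
 * init() in the patch. Like the patch, it relies on >> being an
 * arithmetic shift for negative intermediate values. */
static void rgb_to_yuv601(const uint8_t rgb[3], int yuv[3])
{
    const int r = rgb[0], g = rgb[1], b = rgb[2];

    yuv[0] = (( 66 * r + 129 * g +  25 * b + (1 << 7)) >> 8) +  16;
    yuv[1] = ((-38 * r -  74 * g + 112 * b + (1 << 7)) >> 8) + 128;
    yuv[2] = ((112 * r -  94 * g -  18 * b + (1 << 7)) >> 8) + 128;
}
```

White maps to (235, 128, 128) and black to (16, 128, 128), i.e. the nominal
limits of the broadcast range.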
> + return 0;
> +}
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> + SignalstatsContext *s = ctx->priv;
> + av_frame_free(&s->frame_prev);
> + av_freep(&s->vrep_line);
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> + // TODO: add more
> + enum AVPixelFormat pix_fmts[] = {
> + AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV411P,
> + AV_PIX_FMT_NONE
> + };
> +
> + ff_set_common_formats(ctx, ff_make_format_list(pix_fmts));
> + return 0;
> +}
> +
> +static int config_props(AVFilterLink *outlink)
> +{
> + AVFilterContext *ctx = outlink->src;
> + SignalstatsContext *s = ctx->priv;
> + AVFilterLink *inlink = outlink->src->inputs[0];
> + const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(outlink->format);
> + s->hsub = desc->log2_chroma_w;
> + s->vsub = desc->log2_chroma_h;
> +
> + outlink->w = inlink->w;
> + outlink->h = inlink->h;
> +
> + s->chromaw = FF_CEIL_RSHIFT(inlink->w, s->hsub);
> + s->chromah = FF_CEIL_RSHIFT(inlink->h, s->vsub);
> +
> + s->fs = inlink->w * inlink->h;
> + s->cfs = s->chromaw * s->chromah;
> +
> + if (s->filters & 1<<FILTER_VREP) {
> + s->vrep_line = av_malloc(inlink->h * sizeof(*s->vrep_line));
> + if (!s->vrep_line)
> + return AVERROR(ENOMEM);
> + }
> +
> + return 0;
> +}
> +
> +static void burn_frame(SignalstatsContext *s, AVFrame *f, int x, int y)
> +{
> + const int chromax = x >> s->hsub;
> + const int chromay = y >> s->vsub;
> + f->data[0][y * f->linesize[0] + x] = s->yuv_color[0];
> + f->data[1][chromay * f->linesize[1] + chromax] = s->yuv_color[1];
> + f->data[2][chromay * f->linesize[2] + chromax] = s->yuv_color[2];
> +}
> +
> +static int filter_brng(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> + int x, score = 0;
> + const int yc = y >> s->vsub;
> + const uint8_t *pluma = &in->data[0][y * in->linesize[0]];
> + const uint8_t *pchromau = &in->data[1][yc * in->linesize[1]];
> + const uint8_t *pchromav = &in->data[2][yc * in->linesize[2]];
> +
> + for (x = 0; x < w; x++) {
> + const int xc = x >> s->hsub;
> + const int luma = pluma[x];
> + const int chromau = pchromau[xc];
> + const int chromav = pchromav[xc];
> + const int filt = luma < 16 || luma > 235 ||
> + chromau < 16 || chromau > 240 ||
> + chromav < 16 || chromav > 240;
> + score += filt;
> + if (out && filt)
> + burn_frame(s, out, x, y);
> + }
> + return score;
> +}
> +
> +static int filter_tout_outlier(uint8_t x, uint8_t y, uint8_t z)
> +{
> + return ((abs(x - y) + abs (z - y)) / 2) - abs(z - x) > 4; // make 4 configurable?
> +}
> +
> +static int filter_tout(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> + const uint8_t *p = in->data[0];
> + int lw = in->linesize[0];
> + int x, score = 0, filt;
> +
> + if (y - 1 < 0 || y + 1 >= h)
> + return 0;
> +
> + // detect two pixels above and below (to eliminate interlace artefacts)
> + // should check that video format is infact interlace.
typo: "infact" -> "in fact", "interlace" -> "interlaced"?
> +#define FILTER(i, j) \
> +filter_tout_outlier(p[(y-j) * lw + x + i], \
> + p[ y * lw + x + i], \
> + p[(y+j) * lw + x + i])
> +
> +#define FILTER3(j) (FILTER(-1, j) && FILTER(0, j) && FILTER(1, j))
> +
> + if (y - 2 >= 0 && y + 2 < h) {
> + for (x = 1; x < w - 1; x++) {
> + filt = FILTER3(2) && FILTER3(1);
> + score += filt;
> + if (filt && out)
> + burn_frame(s, out, x, y);
> + }
> + } else {
> + for (x = 1; x < w - 1; x++) {
> + filt = FILTER3(1);
> + score += filt;
> + if (filt && out)
> + burn_frame(s, out, x, y);
> + }
> + }
> + return score;
> +}
> +
> +#define VREP_START 4
> +
> +static void filter_init_vrep(SignalstatsContext *s, const AVFrame *p, int w, int h)
> +{
> + int i, y;
> + int lw = p->linesize[0];
> +
> + for (y = VREP_START; y < h; y++) {
> + int totdiff = 0;
> + int y2lw = (y - VREP_START) * lw;
> + int ylw = y * lw;
> +
> + for (i = 0; i < w; i++)
> + totdiff += abs(p->data[0][y2lw + i] - p->data[0][ylw + i]);
> +
> + /* this value should be definable */
> + s->vrep_line[y] = totdiff < w;
> + }
> +}
> +
> +static int filter_vrep(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> + int x, score = 0;
> +
> + if (y < VREP_START)
> + return 0;
> +
> + for (x = 0; x < w; x++) {
> + if (s->vrep_line[y]) {
> + score++;
> + if (out)
> + burn_frame(s, out, x, y);
> + }
> + }
> + return score;
> +}
> +
> +static const struct {
> + const char *name;
> + void (*init)(SignalstatsContext *s, const AVFrame *p, int w, int h);
> + int (*process)(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h);
> +} filters_def[] = {
> + {"TOUT", NULL, filter_tout},
> + {"VREP", filter_init_vrep, filter_vrep},
> + {"BRNG", NULL, filter_brng},
> + {NULL}
> +};
> +
> +#define DEPTH 256
> +
> +static int filter_frame(AVFilterLink *link, AVFrame *in)
> +{
> + SignalstatsContext *s = link->dst->priv;
> + AVFilterLink *outlink = link->dst->outputs[0];
> + AVFrame *out = in;
> + int i, j;
> + int w = 0, cw = 0, // in
> + pw = 0, cpw = 0; // prev
> + int yuv, yuvu, yuvv;
> + int fil;
> + char metabuf[128];
> + unsigned int histy[DEPTH] = {0},
> + histu[DEPTH] = {0},
> + histv[DEPTH] = {0},
> + histhue[360] = {0},
> + histsat[DEPTH] = {0}; // limited to 8 bit data.
> + int miny = -1, minu = -1, minv = -1;
> + int maxy = -1, maxu = -1, maxv = -1;
> + int lowy = -1, lowu = -1, lowv = -1;
> + int highy = -1, highu = -1, highv = -1;
> + int minsat = -1, maxsat = -1, lowsat = -1, highsat = -1;
> + int lowp, highp, clowp, chighp;
> + int accy, accu, accv;
> + int accsat, acchue=0;
nit+++: acchue = 0;
> + int medhue, maxhue;
> + int toty = 0, totu = 0, totv = 0, totsat=0;
> + int tothue = 0;
> + int dify = 0, difu = 0, difv = 0;
> +
> + int filtot[FILT_NUMB] = {0};
> + AVFrame *prev;
> +
> + if (!s->frame_prev)
> + s->frame_prev = av_frame_clone(in);
> +
> + prev = s->frame_prev;
> +
> + if (s->outfilter != FILTER_NONE)
> + out = av_frame_clone(in);
> +
> + for (fil = 0; fil < FILT_NUMB; fil ++)
> + if ((s->filters & 1<<fil) && filters_def[fil].init)
> + filters_def[fil].init(s, in, link->w, link->h);
> +
> + // Calculate luma histogram and difference with previous frame or field.
> + for (j = 0; j < link->h; j++) {
> + for (i = 0; i < link->w; i++) {
> + yuv = in->data[0][w + i];
> + histy[yuv]++;
> + dify += abs(in->data[0][w + i] - prev->data[0][pw + i]);
> + }
> + w += in->linesize[0];
> + pw += prev->linesize[0];
> + }
> +
> + // Calculate chroma histogram and difference with previous frame or field.
> + for (j = 0; j < s->chromah; j++) {
> + for (i = 0; i < s->chromaw; i++) {
> + int sat, hue;
> +
> + yuvu = in->data[1][cw+i];
> + yuvv = in->data[2][cw+i];
> + histu[yuvu]++;
> + difu += abs(in->data[1][cw+i] - prev->data[1][cpw+i]);
> + histv[yuvv]++;
> + difv += abs(in->data[2][cw+i] - prev->data[2][cpw+i]);
> +
> + // int or round?
> + sat = ff_sqrt((yuvu-128) * (yuvu-128) + (yuvv-128) * (yuvv-128));
> + histsat[sat]++;
> + hue = floor((180 / M_PI) * atan2f(yuvu-128, yuvv-128) + 180);
> + histhue[hue]++;
> + }
> + cw += in->linesize[1];
> + cpw += prev->linesize[1];
> + }
> +
> + for (j = 0; j < link->h; j++) {
> + for (fil = 0; fil < FILT_NUMB; fil ++) {
> + if (s->filters & 1<<fil) {
> + AVFrame *dbg = out != in && s->outfilter == fil ? out : NULL;
> + filtot[fil] += filters_def[fil].process(s, in, dbg, j, link->w, link->h);
> + }
> + }
> + }
> +
> + // find low / high based on histogram percentile
> + // these only need to be calculated once.
> +
> + lowp = s->fs * 10 / 100;
> + highp = s->fs * 90 / 100;
> + clowp = s->cfs * 10 / 100;
> + chighp = s->cfs * 90 / 100;
I wonder if we should make the percentile value parametric (for
example setting the percentile margin M and then computing 1-M and M
values).
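Something along these lines, as a rough sketch only. hist_percentile() and
hist_low_high() are hypothetical names, the margin is assumed to be an
integer percentage, and the "first bin whose cumulative count exceeds the
target" rule is my assumption, not necessarily what the elided code does:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the first bin at which the cumulative
 * count exceeds target samples, or nbins - 1 if that never happens. */
static int hist_percentile(const unsigned *hist, size_t nbins, unsigned target)
{
    unsigned acc = 0;

    for (size_t i = 0; i < nbins; i++) {
        acc += hist[i];
        if (acc > target)
            return (int)i;
    }
    return (int)nbins - 1;
}

/* margin is a percentage in [0, 50]; *low and *high receive the values at
 * the margin and (100 - margin) percentiles of a histogram of total samples. */
static void hist_low_high(const unsigned *hist, size_t nbins, unsigned total,
                          unsigned margin, int *low, int *high)
{
    /* 64-bit intermediate so that total * percentage cannot overflow */
    const unsigned lo = (unsigned)((uint64_t)total * margin / 100);
    const unsigned hi = (unsigned)((uint64_t)total * (100 - margin) / 100);

    *low  = hist_percentile(hist, nbins, lo);
    *high = hist_percentile(hist, nbins, hi);
}
```

With margin = 10 this corresponds to the hardcoded 10%/90% pair above.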
[...]
--
FFmpeg = Fostering and Fabulous Multimedia Practical Elected Gorilla