[FFmpeg-devel] [PATCH] avfilter: add signalstats filter

Wed Jun 4 10:36:42 CEST 2014

On date Monday 2014-06-02 23:50:27 +0200, Clément Bœsch encoded:
> Signed-off-by: Mark Heath <mjpeg0 at silicontrip.net>
> Signed-off-by: Dave Rice <dericed at yahoo.com>
> Signed-off-by: Clément Bœsch <u at pkh.me>
> ---
> TODO: bump lavfi minor
> ---
>  Changelog                    |   1 +
>  doc/filters.texi             | 168 +++++++++++++++
>  libavfilter/Makefile         |   1 +
>  libavfilter/allfilters.c     |   1 +
>  libavfilter/vf_signalstats.c | 479 +++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 650 insertions(+)
>  create mode 100644 libavfilter/vf_signalstats.c
> 
> diff --git a/Changelog b/Changelog
> index 3d416c4..9c366ff 100644
> --- a/Changelog
> +++ b/Changelog
> @@ -26,6 +26,7 @@ version <next>:
>  - native Opus decoder
>  - display matrix export and rotation api
>  - WebVTT encoder
> +- signalstats filter
>  
>  
>  version 2.2:
> diff --git a/doc/filters.texi b/doc/filters.texi
> index e004c44..d30827a 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -7532,6 +7532,174 @@ Swap the second and third planes of the input:
>  ffmpeg -i INPUT -vf shuffleplanes=0:2:1:3 OUTPUT
>  @end example
>  
> +@section signalstats
> +Evaluate various visual metrics that assist in determining issues associated
> +with the digitization of analog video media.
> +
> +By default the filter will log these metadata values:
> +
> +@table @option
> +@item YMIN
> +Display the minimal Y value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YLOW
> +Display the Y value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YAVG
> +Display the average Y value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item YHIGH
> +Display the Y value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item YMAX
> +Display the maximum Y value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UMIN
> +Display the minimal U value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item ULOW
> +Display the U value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UAVG
> +Display the average U value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item UHIGH
> +Display the U value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item UMAX
> +Display the maximum U value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VMIN
> +Display the minimal V value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VLOW
> +Display the V value at the 10% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VAVG
> +Display the average V value within the input frame. Expressed in range of
> +[0-255].
> +
> +@item VHIGH
> +Display the V value at the 90% percentile within the input frame. Expressed in
> +range of [0-255].
> +
> +@item VMAX
> +Display the maximum V value contained within the input frame. Expressed in
> +range of [0-255].
> +
> +@item SATMIN
> +Display the minimal saturation value contained within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATLOW
> +Display the saturation value at the 10% percentile within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATAVG
> +Display the average saturation value within the input frame. Expressed in range
> +of [0-~181.02].
> +
> +@item SATHIGH
> +Display the saturation value at the 90% percentile within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item SATMAX
> +Display the maximum saturation value contained within the input frame.
> +Expressed in range of [0-~181.02].
> +
> +@item HUEMED
> +Display the median value for hue within the input frame. Expressed in range of
> +[0-360].
> +
> +@item HUEAVG
> +Display the average value for hue within the input frame. Expressed in range of
> +[0-360].
> +

> +@item YDIF
> +Display a quantification of the visual change on the Y plane between the input
> +frame and the previous input frame.
> +
> +@item UDIF
> +Display a quantification of the visual change on the U plane between the input
> +frame and the previous input frame.
> +
> +@item VDIF
> +Display a quantification of the visual change on the V plane between the input
> +frame and the previous input frame.
> +@end table

"a quantification" is a bit vague.

> +
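
To make the YDIF/UDIF/VDIF request concrete: from filter_frame() the value
seems to be built from absolute per-sample differences against the previous
frame, accumulated over the plane. A minimal sketch of a metric the text could
describe follows. The helper name and the division by the sample count are my
assumptions; the quoted hunk only shows the accumulation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: mean absolute difference between two 8-bit planes.
 * Normalizing by the number of samples is an assumption, not something the
 * quoted patch shows. */
static double plane_mean_abs_diff(const uint8_t *cur, int cur_linesize,
                                  const uint8_t *prev, int prev_linesize,
                                  int w, int h)
{
    uint64_t sum = 0;
    int x, y;

    assert(w > 0 && h > 0);
    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++)
            sum += abs(cur[x] - prev[x]);
        /* advance by the stride, which may be larger than the width */
        cur  += cur_linesize;
        prev += prev_linesize;
    }
    return (double)sum / ((double)w * h);
}
```

If that is what is meant, the doc could say "average absolute difference per
sample, in the range [0-255]" or whatever the final normalization turns out to
be.
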
> +The filter accepts the following options:
> +

> +@table @option
> +@item stat
> +Specify an additional form of image analysis. It accepts the following values:
> +

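Compression nit: the two value lists are identical, so the entries could be
merged (noting that @option{stat} is a flag set and takes a combination of
values, while @option{out} takes a single one):

@item stat
@item out

@option{stat} specifies an additional form of image analysis.
@option{out} outputs video with the specified type of pixel highlighted.

Both options accept the following values: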

> +@table @samp
> +@item tout
> +Identify @var{temporal outliers} pixels. A @var{temporal outlier} is a pixel
> +unlike the neighboring pixels of the same field. Examples of temporal outliers
> +include the results of video dropouts, head clogs, or tape tracking issues.
> +
> +@item vrep
> +Identify @var{vertical line repetition}. Vertical line repetition includes
> +similar rows of pixels within a frame. In born-digital video vertical line
> +repetition is common, but this pattern is uncommon in video digitized from an
> +analog source. When it occurs in video that results from the digitization of an
> +analog source it can indicate concealment from a dropout compensator.
> +
> +@item brng
> +Identify pixels that fall outside of legal broadcast range.
> +@end table
> +
> +@item out
> +Output video with the specified type of pixel highlighted. It accepts the
> +following values:
> +
> +@table @samp
> +@item tout
> +@item vrep
> +@item brng
> +@end table
> +
> +

> +@item color, c
> +Set the highlight color for the @option{out} option. The default color is
> +yellow.
> +@end table
> +
> +@subsection Examples
> +
> +@itemize
> +@item
> +Output data of various video metrics:

> +@example
> +ffprobe -f lavfi movie=example.mov,signalstats="stat=tout+vrep+rang" -show_frames

what is "rang"? The option table only defines tout, vrep and brng, so this
looks like a typo for "brng".

> +@end example
> +
> +@item
> +Output specific data about the minimum and maximum values of the Y plane per frame:
> +@example
> +ffprobe -f lavfi movie=example.mov,signalstats -show_entries frame_tags=lavfi.values.YMAX,lavfi.values.YMIN

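Do we have a namespace resolution scheme? The "lavfi.values." prefix does not
match the filter name (same for "-vf values=" in the last example). What about
lavfi.signalstats.VAL?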
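
For illustration, a minimal sketch of how the keys could be built with such a
prefix. The macro and helper names are hypothetical, and the
"lavfi.signalstats." prefix is only my proposal above, not something the patch
currently does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Proposed prefix (assumption): <library>.<filter name>.<value name> */
#define SIGNALSTATS_KEY_PREFIX "lavfi.signalstats."

/* Hypothetical helper: write the namespaced metadata key for `name` into
 * `buf`. Returns what snprintf() returns, i.e. the untruncated key length. */
static int signalstats_key(char *buf, size_t size, const char *name)
{
    assert(buf && name && strlen(name) > 0);
    return snprintf(buf, size, "%s%s", SIGNALSTATS_KEY_PREFIX, name);
}
```

The resulting string would then be used as the key passed to av_dict_set() on
the frame metadata, and the doc example would read
frame_tags=lavfi.signalstats.YMAX,lavfi.signalstats.YMIN.
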

> +@end example
> +
> +@item
> +Playback video while highlighting pixels that are outside of broadcast range in red.
> +@example
> +ffplay example.mov -vf values="out=brng:color=red"
> +@end example
> +@end itemize
> +
>  @anchor{smartblur}
>  @section smartblur
>  
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index f981dfa..142c06e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -174,6 +174,7 @@ OBJS-$(CONFIG_SETSAR_FILTER)                 += vf_aspect.o
>  OBJS-$(CONFIG_SETTB_FILTER)                  += settb.o
>  OBJS-$(CONFIG_SHOWINFO_FILTER)               += vf_showinfo.o
>  OBJS-$(CONFIG_SHUFFLEPLANES_FILTER)          += vf_shuffleplanes.o
> +OBJS-$(CONFIG_SIGNALSTATS_FILTER)            += vf_signalstats.o
>  OBJS-$(CONFIG_SMARTBLUR_FILTER)              += vf_smartblur.o
>  OBJS-$(CONFIG_SPLIT_FILTER)                  += split.o
>  OBJS-$(CONFIG_SPP_FILTER)                    += vf_spp.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 22d643d..4b9db9e 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -191,6 +191,7 @@ void avfilter_register_all(void)
>      REGISTER_FILTER(SETTB,          settb,          vf);
>      REGISTER_FILTER(SHOWINFO,       showinfo,       vf);
>      REGISTER_FILTER(SHUFFLEPLANES,  shuffleplanes,  vf);
> +    REGISTER_FILTER(SIGNALSTATS,    signalstats,    vf);
>      REGISTER_FILTER(SMARTBLUR,      smartblur,      vf);
>      REGISTER_FILTER(SPLIT,          split,          vf);
>      REGISTER_FILTER(SPP,            spp,            vf);
> diff --git a/libavfilter/vf_signalstats.c b/libavfilter/vf_signalstats.c
> new file mode 100644
> index 0000000..e06bfe1
> --- /dev/null
> +++ b/libavfilter/vf_signalstats.c
> @@ -0,0 +1,479 @@
> +/*
> + * Copyright (c) 2010 Mark Heath mjpeg0 @ silicontrip dot org
> + * Copyright (c) 2014 Clément Bœsch
> + * Copyright (c) 2014 Dave Rice @dericed
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/opt.h"
> +#include "libavutil/pixdesc.h"
> +#include "libavcodec/mathops.h"
> +#include "internal.h"
> +
> +enum FilterMode {
> +    FILTER_NONE = -1,
> +    FILTER_TOUT,
> +    FILTER_VREP,
> +    FILTER_BRNG,
> +    FILT_NUMB
> +};
> +
> +typedef struct {
> +    const AVClass *class;
> +    int chromah;
> +    int chromaw;
> +    int hsub;
> +    int vsub;

> +    int fs;
> +    int cfs;

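nit: add comments (judging from config_props(), fs is the number of luma
samples per frame and cfs the number of samples per chroma plane).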
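
A small standalone sketch of what I understand these two fields to hold,
mirroring the arithmetic in config_props(). The struct and function names are
made up for the example, and ceil_rshift() is a stand-in for FF_CEIL_RSHIFT
that assumes non-negative inputs.

```c
#include <assert.h>

/* Hypothetical container for the sizes computed in config_props(). */
typedef struct {
    int chromaw, chromah; /* chroma plane dimensions */
    int fs;               /* frame size: number of luma samples */
    int cfs;              /* chroma frame size: samples per chroma plane */
} PlaneSizes;

/* Right shift rounding up; stand-in for FF_CEIL_RSHIFT, a >= 0 assumed. */
static int ceil_rshift(int a, int b)
{
    return (a + (1 << b) - 1) >> b;
}

static PlaneSizes plane_sizes(int w, int h, int hsub, int vsub)
{
    PlaneSizes p;

    assert(w > 0 && h > 0 && hsub >= 0 && vsub >= 0);
    p.chromaw = ceil_rshift(w, hsub);
    p.chromah = ceil_rshift(h, vsub);
    p.fs      = w * h;
    p.cfs     = p.chromaw * p.chromah;
    return p;
}
```

For 720x576 yuv420p (hsub = vsub = 1) this gives fs = 414720 and cfs = 103680.
A one-line comment per field in the struct would be enough.
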

> +    enum FilterMode outfilter;
> +    int filters;
> +    AVFrame *frame_prev;
> +    char *vrep_line;
> +    uint8_t rgba_color[4];
> +    int yuv_color[3];
> +} SignalstatsContext;
> +
> +#define OFFSET(x) offsetof(SignalstatsContext, x)
> +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
> +
> +static const AVOption signalstats_options[] = {
> +    {"stat", "set statistics filters", OFFSET(filters), AV_OPT_TYPE_FLAGS, {.i64=0}, 0, INT_MAX, FLAGS, "filters"},
> +        {"tout", "analyze pixels for temporal outliers",                0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_TOUT}, 0, 0, FLAGS, "filters"},
> +        {"vrep", "analyze video lines for vertical line repetition",    0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_VREP}, 0, 0, FLAGS, "filters"},
> +        {"brng", "analyze for pixels outside of broadcast range",       0, AV_OPT_TYPE_CONST, {.i64=1<<FILTER_BRNG}, 0, 0, FLAGS, "filters"},
> +    {"out", "set video filter", OFFSET(outfilter), AV_OPT_TYPE_INT, {.i64=FILTER_NONE}, -1, FILT_NUMB-1, FLAGS, "out"},
> +        {"tout", "highlight pixels that depict temporal outliers",              0, AV_OPT_TYPE_CONST, {.i64=FILTER_TOUT}, 0, 0, FLAGS, "out"},
> +        {"vrep", "highlight video lines that depict vertical line repetition",  0, AV_OPT_TYPE_CONST, {.i64=FILTER_VREP}, 0, 0, FLAGS, "out"},
> +        {"brng", "highlight pixels that are outside of broadcast range",        0, AV_OPT_TYPE_CONST, {.i64=FILTER_BRNG}, 0, 0, FLAGS, "out"},
> +    {"c",     "set highlight color", OFFSET(rgba_color), AV_OPT_TYPE_COLOR, {.str="yellow"}, .flags=FLAGS},
> +    {"color", "set highlight color", OFFSET(rgba_color), AV_OPT_TYPE_COLOR, {.str="yellow"}, .flags=FLAGS},
> +    {NULL}
> +};
> +
> +AVFILTER_DEFINE_CLASS(signalstats);
> +
> +static av_cold int init(AVFilterContext *ctx)
> +{
> +    uint8_t r, g, b;
> +    SignalstatsContext *s = ctx->priv;
> +
> +    if (s->outfilter != FILTER_NONE)
> +        s->filters |= 1 << s->outfilter;
> +
> +    r = s->rgba_color[0];
> +    g = s->rgba_color[1];
> +    b = s->rgba_color[2];

> +    s->yuv_color[0] = (( 66*r + 129*g +  25*b + (1<<7)) >> 8) +  16;
> +    s->yuv_color[1] = ((-38*r + -74*g + 112*b + (1<<7)) >> 8) + 128;
> +    s->yuv_color[2] = ((112*r + -94*g + -18*b + (1<<7)) >> 8) + 128;

Unrelated note: having an API for this transform would be useful in
several places.
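
Something along these lines is what I have in mind; a rough sketch only, with
a made-up name and signature. It uses the same integer coefficients as the
quoted lines (8-bit limited-range BT.601) and, like them, relies on >> being
an arithmetic shift for negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared helper: convert an 8-bit RGB triplet to limited-range
 * BT.601 YUV using the integer coefficients from the quoted init() code.
 * Assumes >> on negative ints is an arithmetic shift. */
static void rgb_to_yuv_bt601(const uint8_t rgb[3], int yuv[3])
{
    int r, g, b;

    assert(rgb && yuv);
    r = rgb[0];
    g = rgb[1];
    b = rgb[2];

    yuv[0] = (( 66 * r + 129 * g +  25 * b + (1 << 7)) >> 8) +  16;
    yuv[1] = ((-38 * r -  74 * g + 112 * b + (1 << 7)) >> 8) + 128;
    yuv[2] = ((112 * r -  94 * g -  18 * b + (1 << 7)) >> 8) + 128;
}
```

With the default highlight color (yellow, 255/255/0) this yields Y=210, U=16,
V=146, and init() would shrink to a single call on s->rgba_color.
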

> +    return 0;
> +}
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    SignalstatsContext *s = ctx->priv;
> +    av_frame_free(&s->frame_prev);
> +    av_freep(&s->vrep_line);
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    // TODO: add more
> +    enum AVPixelFormat pix_fmts[] = {
> +        AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV411P,
> +        AV_PIX_FMT_NONE
> +    };
> +
> +    ff_set_common_formats(ctx, ff_make_format_list(pix_fmts));
> +    return 0;
> +}
> +
> +static int config_props(AVFilterLink *outlink)
> +{
> +    AVFilterContext *ctx = outlink->src;
> +    SignalstatsContext *s = ctx->priv;
> +    AVFilterLink *inlink = outlink->src->inputs[0];
> +    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(outlink->format);
> +    s->hsub = desc->log2_chroma_w;
> +    s->vsub = desc->log2_chroma_h;
> +
> +    outlink->w = inlink->w;
> +    outlink->h = inlink->h;
> +
> +    s->chromaw = FF_CEIL_RSHIFT(inlink->w, s->hsub);
> +    s->chromah = FF_CEIL_RSHIFT(inlink->h, s->vsub);
> +
> +    s->fs = inlink->w * inlink->h;
> +    s->cfs = s->chromaw * s->chromah;
> +
> +    if (s->filters & 1<<FILTER_VREP) {
> +        s->vrep_line = av_malloc(inlink->h * sizeof(*s->vrep_line));
> +        if (!s->vrep_line)
> +            return AVERROR(ENOMEM);
> +    }
> +
> +    return 0;
> +}
> +
> +static void burn_frame(SignalstatsContext *s, AVFrame *f, int x, int y)
> +{
> +    const int chromax = x >> s->hsub;
> +    const int chromay = y >> s->vsub;
> +    f->data[0][y       * f->linesize[0] +       x] = s->yuv_color[0];
> +    f->data[1][chromay * f->linesize[1] + chromax] = s->yuv_color[1];
> +    f->data[2][chromay * f->linesize[2] + chromax] = s->yuv_color[2];
> +}
> +
> +static int filter_brng(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> +    int x, score = 0;
> +    const int yc = y >> s->vsub;
> +    const uint8_t *pluma    = &in->data[0][y  * in->linesize[0]];
> +    const uint8_t *pchromau = &in->data[1][yc * in->linesize[1]];
> +    const uint8_t *pchromav = &in->data[2][yc * in->linesize[2]];
> +
> +    for (x = 0; x < w; x++) {
> +        const int xc = x >> s->hsub;
> +        const int luma    = pluma[x];
> +        const int chromau = pchromau[xc];
> +        const int chromav = pchromav[xc];
> +        const int filt = luma    < 16 || luma    > 235 ||
> +                         chromau < 16 || chromau > 240 ||
> +                         chromav < 16 || chromav > 240;
> +        score += filt;
> +        if (out && filt)
> +            burn_frame(s, out, x, y);
> +    }
> +    return score;
> +}
> +
> +static int filter_tout_outlier(uint8_t x, uint8_t y, uint8_t z)
> +{
> +    return ((abs(x - y) + abs (z - y)) / 2) - abs(z - x) > 4; // make 4 configurable?
> +}
> +
> +static int filter_tout(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> +    const uint8_t *p = in->data[0];
> +    int lw = in->linesize[0];
> +    int x, score = 0, filt;
> +
> +    if (y - 1 < 0 || y + 1 >= h)
> +        return 0;
> +
> +    // detect two pixels above and below (to eliminate interlace artefacts)

> +    // should check that video format is infact interlace.

typo: "in fact interlaced"?
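
While at it, regarding the "make 4 configurable?" remark in
filter_tout_outlier(): a minimal sketch of the same test with the threshold
passed in. The function name and the extra parameter are hypothetical; the
expression itself is the one from the patch, with x/y/z renamed after the rows
they come from in the FILTER macro.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variant of filter_tout_outlier() with a caller-supplied
 * threshold. `cur` is flagged when it differs from both vertical neighbours
 * by clearly more than the neighbours differ from each other. */
static int tout_outlier(uint8_t above, uint8_t cur, uint8_t below, int thresh)
{
    assert(thresh >= 0);
    return ((abs(above - cur) + abs(below - cur)) / 2)
           - abs(below - above) > thresh;
}
```

With thresh=4 a sample of 150 between two 100s is flagged, while 110 between
100 and 120 (a smooth ramp) is not.
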

> +#define FILTER(i, j) \
> +filter_tout_outlier(p[(y-j) * lw + x + i], \
> +                    p[    y * lw + x + i], \
> +                    p[(y+j) * lw + x + i])
> +
> +#define FILTER3(j) (FILTER(-1, j) && FILTER(0, j) && FILTER(1, j))
> +
> +    if (y - 2 >= 0 && y + 2 < h) {
> +        for (x = 1; x < w - 1; x++) {
> +            filt = FILTER3(2) && FILTER3(1);
> +            score += filt;
> +            if (filt && out)
> +                burn_frame(s, out, x, y);
> +        }
> +    } else {
> +        for (x = 1; x < w - 1; x++) {
> +            filt = FILTER3(1);
> +            score += filt;
> +            if (filt && out)
> +                burn_frame(s, out, x, y);
> +        }
> +    }
> +    return score;
> +}
> +
> +#define VREP_START 4
> +
> +static void filter_init_vrep(SignalstatsContext *s, const AVFrame *p, int w, int h)
> +{
> +    int i, y;
> +    int lw = p->linesize[0];
> +
> +    for (y = VREP_START; y < h; y++) {
> +        int totdiff = 0;
> +        int y2lw = (y - VREP_START) * lw;
> +        int ylw = y * lw;
> +
> +        for (i = 0; i < w; i++)
> +            totdiff += abs(p->data[0][y2lw + i] - p->data[0][ylw + i]);
> +
> +        /* this value should be definable */
> +        s->vrep_line[y] = totdiff < w;
> +    }
> +}
> +
> +static int filter_vrep(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h)
> +{
> +    int x, score = 0;
> +
> +    if (y < VREP_START)
> +        return 0;
> +
> +    for (x = 0; x < w; x++) {
> +        if (s->vrep_line[y]) {
> +            score++;
> +            if (out)
> +                burn_frame(s, out, x, y);
> +        }
> +    }
> +    return score;
> +}
> +
> +static const struct {
> +    const char *name;
> +    void (*init)(SignalstatsContext *s, const AVFrame *p, int w, int h);
> +    int (*process)(SignalstatsContext *s, const AVFrame *in, AVFrame *out, int y, int w, int h);
> +} filters_def[] = {
> +    {"TOUT", NULL,              filter_tout},
> +    {"VREP", filter_init_vrep,  filter_vrep},
> +    {"BRNG", NULL,              filter_brng},
> +    {NULL}
> +};
> +
> +#define DEPTH 256
> +
> +static int filter_frame(AVFilterLink *link, AVFrame *in)
> +{
> +    SignalstatsContext *s = link->dst->priv;
> +    AVFilterLink *outlink = link->dst->outputs[0];
> +    AVFrame *out = in;
> +    int i, j;
> +    int  w = 0,  cw = 0, // in
> +        pw = 0, cpw = 0; // prev
> +    int yuv, yuvu, yuvv;
> +    int fil;
> +    char metabuf[128];
> +    unsigned int histy[DEPTH] = {0},
> +                 histu[DEPTH] = {0},
> +                 histv[DEPTH] = {0},
> +                 histhue[360] = {0},
> +                 histsat[DEPTH] = {0}; // limited to 8 bit data.
> +    int miny  = -1, minu  = -1, minv  = -1;
> +    int maxy  = -1, maxu  = -1, maxv  = -1;
> +    int lowy  = -1, lowu  = -1, lowv  = -1;
> +    int highy = -1, highu = -1, highv = -1;
> +    int minsat = -1, maxsat = -1, lowsat = -1, highsat = -1;
> +    int lowp, highp, clowp, chighp;
> +    int accy, accu, accv;

> +    int accsat, acchue=0;

nit+++: acchue = 0; (totsat=0 a few lines below has the same spacing issue)

> +    int medhue, maxhue;
> +    int toty = 0, totu = 0, totv = 0, totsat=0;
> +    int tothue = 0;
> +    int dify = 0, difu = 0, difv = 0;
> +
> +    int filtot[FILT_NUMB] = {0};
> +    AVFrame *prev;
> +
> +    if (!s->frame_prev)
> +        s->frame_prev = av_frame_clone(in);
> +
> +    prev = s->frame_prev;
> +
> +    if (s->outfilter != FILTER_NONE)
> +        out = av_frame_clone(in);
> +
> +    for (fil = 0; fil < FILT_NUMB; fil ++)
> +        if ((s->filters & 1<<fil) && filters_def[fil].init)
> +            filters_def[fil].init(s, in, link->w, link->h);
> +
> +    // Calculate luma histogram and difference with previous frame or field.
> +    for (j = 0; j < link->h; j++) {
> +        for (i = 0; i < link->w; i++) {
> +            yuv = in->data[0][w + i];
> +            histy[yuv]++;
> +            dify += abs(in->data[0][w + i] - prev->data[0][pw + i]);
> +        }
> +        w  += in->linesize[0];
> +        pw += prev->linesize[0];
> +    }
> +
> +    // Calculate chroma histogram and difference with previous frame or field.
> +    for (j = 0; j < s->chromah; j++) {
> +        for (i = 0; i < s->chromaw; i++) {
> +            int sat, hue;
> +
> +            yuvu = in->data[1][cw+i];
> +            yuvv = in->data[2][cw+i];
> +            histu[yuvu]++;
> +            difu += abs(in->data[1][cw+i] - prev->data[1][cpw+i]);
> +            histv[yuvv]++;
> +            difv += abs(in->data[2][cw+i] - prev->data[2][cpw+i]);
> +
> +            // int or round?
> +            sat = ff_sqrt((yuvu-128) * (yuvu-128) + (yuvv-128) * (yuvv-128));
> +            histsat[sat]++;
> +            hue = floor((180 / M_PI) * atan2f(yuvu-128, yuvv-128) + 180);
> +            histhue[hue]++;
> +        }
> +        cw  += in->linesize[1];
> +        cpw += prev->linesize[1];
> +    }
> +
> +    for (j = 0; j < link->h; j++) {
> +        for (fil = 0; fil < FILT_NUMB; fil ++) {
> +            if (s->filters & 1<<fil) {
> +                AVFrame *dbg = out != in && s->outfilter == fil ? out : NULL;
> +                filtot[fil] += filters_def[fil].process(s, in, dbg, j, link->w, link->h);
> +            }
> +        }
> +    }
> +
> +    // find low / high based on histogram percentile
> +    // these only need to be calculated once.
> +
> +    lowp   = s->fs  * 10 / 100;
> +    highp  = s->fs  * 90 / 100;
> +    clowp  = s->cfs * 10 / 100;
> +    chighp = s->cfs * 90 / 100;

I wonder if we should make the percentile value parametric (for
example setting the percentile margin M and then computing 1-M and M
values).
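
Roughly what I mean, as a standalone sketch. The function names, the
integer-percent margin and the "first bin whose cumulative count reaches the
threshold" rule are all assumptions of mine; the part of the patch that walks
the histograms is not quoted here, so the exact rule may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: first histogram bin at which the cumulative count is
 * non-zero and has reached `thresh`. Returns -1 if that never happens. */
static int hist_level(const unsigned *hist, int depth, uint64_t thresh)
{
    uint64_t acc = 0;
    int v;

    assert(hist && depth > 0);
    for (v = 0; v < depth; v++) {
        acc += hist[v];
        if (acc > 0 && acc >= thresh)
            return v;
    }
    return -1;
}

/* Compute the low/high levels for a symmetric margin given in percent:
 * margin_pct=10 reproduces the current 10%/90% pair, margin_pct=0
 * degenerates to the minimum and maximum populated bins. */
static void hist_margins(const unsigned *hist, int depth, unsigned total,
                         int margin_pct, int *low, int *high)
{
    assert(low && high && margin_pct >= 0 && margin_pct <= 50);
    *low  = hist_level(hist, depth, (uint64_t)total * margin_pct / 100);
    *high = hist_level(hist, depth, (uint64_t)total * (100 - margin_pct) / 100);
}
```

The margin could then be exposed as a filter option with 10 as the default, so
the current behaviour is kept.
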

[...]
-- 
FFmpeg = Fostering and Fabulous Multimedia Practical Elected Gorilla