[FFmpeg-devel] [PATCH] libavfilter: created a new filter that obtains the average peak signal-to-noise ratio (PSNR) of two input video files in YUV format.
Stefano Sabatini
stefano.sabatini-lala at poste.it
Fri Jun 10 01:18:38 CEST 2011
On date Tuesday 2011-06-07 14:03:36 +0200, Roger Pau Monné encoded:
> 2011/6/7 Stefano Sabatini <stefano.sabatini-lala at poste.it>:
[...]
> >> I'm not sure about how to obtain the max value of pixel formats, if
> >> someone knows it, I will gladly expand the filter to compute the PSNR
> >> for other formats.
> >
> > We don't have this information, but we need to get it from each filter
> > (check for example my recently posted lut filter).
>
> Done! Added support for RGB formats.
[...]
> From dee17a0204fae30eac403f1fe4097a7201376e72 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Roger=20Pau=20Monn=E9?= <roger.pau at entel.upc.edu>
> Date: Tue, 7 Jun 2011 14:00:51 +0200
> Subject: [PATCH] libavfilter: created a new filter that obtains the average peak signal-to-noise ratio (PSNR) of two input video files.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
>
> Signed-off-by: Roger Pau Monné <roger.pau at entel.upc.edu>
> ---
> doc/filters.texi | 38 ++++++
> libavfilter/Makefile | 1 +
> libavfilter/allfilters.c | 1 +
> libavfilter/vf_psnr.c | 332 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 372 insertions(+), 0 deletions(-)
> create mode 100644 libavfilter/vf_psnr.c
>
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 719d94f..4976b94 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -1088,6 +1088,44 @@ format=monow, pixdesctest
>
> can be used to test the monowhite pixel format descriptor definition.
>
> +@section psnr
> +
> +Obtain the average, maximum and minimum PSNR between two input videos.
> +Both video files must have the same resolution and pixel format for
> +this filter to work correctly. The obtained average PSNR is printed
> +through the logging system.
> +
> +The filter stores the accumulated MSE (mean squared error) of each
> +frame, and at the end of the processing it is averaged across all frames
> +equally, and the following formula is applied to obtain the PSNR:
> +
> +@example
> +PSNR = 10log10(MAX^2/MSE)
Nit: 10*log10(MAX^2/MSE)
> +@end example
> +
> +Where MAX is the average of the maximum values of each component of the
> +image.
> +
> +This filter accepts the following parameters:
> +
> +@table @option
> +@item vstats
> +Parameter which specifies the file used to save the PSNR of each
> +individual frame. If not specified the filter will not print the PSNR
> +of each individual frame.
> +@end table
Nit: since it only accepts one parameter (for the moment), this could simply read:
This filter accepts as input the name of the file used to save the PSNR of each...
> +
> +For example:
> +@example
> +movie=ref_movie.mpg, setpts=PTS-STARTPTS [ref]; [in] setpts=PTS-STARTPTS,
> +[ref] psnr=stats.log [out]
> +@end example
> +
> +On this example the input file being processed by FFmpeg is compared
Nit: "being processed by FFmpeg" is unnecessary (this documents the
filter, which can be used without ffmpeg the tool).
> +with the reference file ref_movie.mpg. The PSNR of each individual
> +frame is stored in stats.log. Setpts filters are used to synchronize
Nit: mark it up as @file{stats.log}.
> +both streams.
> +
> @section scale
>
> Scale the input video to @var{width}:@var{height} and/or convert the image format.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 2324fb9..2c66f7e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -45,6 +45,7 @@ OBJS-$(CONFIG_OCV_FILTER) += vf_libopencv.o
> OBJS-$(CONFIG_OVERLAY_FILTER) += vf_overlay.o
> OBJS-$(CONFIG_PAD_FILTER) += vf_pad.o
> OBJS-$(CONFIG_PIXDESCTEST_FILTER) += vf_pixdesctest.o
> +OBJS-$(CONFIG_PSNR_FILTER) += vf_psnr.o
> OBJS-$(CONFIG_SCALE_FILTER) += vf_scale.o
> OBJS-$(CONFIG_SELECT_FILTER) += vf_select.o
> OBJS-$(CONFIG_SETDAR_FILTER) += vf_aspect.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 5f1065f..69a52e1 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -61,6 +61,7 @@ void avfilter_register_all(void)
> REGISTER_FILTER (OVERLAY, overlay, vf);
> REGISTER_FILTER (PAD, pad, vf);
> REGISTER_FILTER (PIXDESCTEST, pixdesctest, vf);
> + REGISTER_FILTER (PSNR, psnr, vf);
> REGISTER_FILTER (SCALE, scale, vf);
> REGISTER_FILTER (SELECT, select, vf);
> REGISTER_FILTER (SETDAR, setdar, vf);
> diff --git a/libavfilter/vf_psnr.c b/libavfilter/vf_psnr.c
> new file mode 100644
> index 0000000..02cf30a
> --- /dev/null
> +++ b/libavfilter/vf_psnr.c
> @@ -0,0 +1,332 @@
> +/*
> + * Copyright (c) 2011 Roger Pau Monné <roger.pau at entel.upc.edu>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * Calculate the PSNR between two input videos.
> + * Based on the overlay filter.
> + */
> +
> +#include "libavutil/pixdesc.h"
> +#include "avfilter.h"
> +
> +#undef fprintf
> +
> +#define YUV_FORMATS \
> + PIX_FMT_YUV444P, PIX_FMT_YUV422P, PIX_FMT_YUV420P, \
> + PIX_FMT_YUV411P, PIX_FMT_YUV410P, PIX_FMT_YUV440P, \
> + PIX_FMT_YUVA420P, \
> + PIX_FMT_YUVJ444P, PIX_FMT_YUVJ422P, PIX_FMT_YUVJ420P, \
> + PIX_FMT_YUVJ440P
> +
> +#define RGB_FORMATS \
> + PIX_FMT_ARGB, PIX_FMT_RGBA, \
> + PIX_FMT_ABGR, PIX_FMT_BGRA, \
> + PIX_FMT_RGB24, PIX_FMT_BGR24
> +
> +static enum PixelFormat yuv_pix_fmts[] = { YUV_FORMATS, PIX_FMT_NONE };
> +static enum PixelFormat rgb_pix_fmts[] = { RGB_FORMATS, PIX_FMT_NONE };
> +static enum PixelFormat all_pix_fmts[] = { RGB_FORMATS, YUV_FORMATS, PIX_FMT_NONE };
> +
> +typedef struct {
> + AVFilterBufferRef *picref;
> + double mse, min_mse, max_mse;
> + int nb_frames;
> + FILE *vstats_file;
> + uint16_t *line1, *line2;
> + int max[4], average_max;
> + int is_yuv, is_rgb;
> +} PSNRContext;
> +
> +static int pix_fmt_is_in(enum PixelFormat pix_fmt, enum PixelFormat *pix_fmts)
> +{
> + enum PixelFormat *p;
> + for (p = pix_fmts; *p != PIX_FMT_NONE; p++) {
> + if (pix_fmt == *p)
> + return 1;
> + }
> + return 0;
> +}
> +
> +static inline int pow2(int base)
> +{
> + return base*base;
> +}
> +
> +static inline double get_psnr(double mse, int nb_frames, int max)
> +{
> + return 10.0*log((pow2(max))/(mse/nb_frames))/log(10.0);
> +}
> +
> +static inline
> +void compute_images_mse(const uint8_t *ref_data[4],
> + const uint8_t *data[4], const int linesizes[4],
> + int w, int h, const AVPixFmtDescriptor *desc,
> + double mse[4], uint16_t *line1, uint16_t *line2)
> +{
> + int i, c, j = w;
> +
> + memset(mse, 0, sizeof(*mse)*4);
> +
> + for (c = 0; c < desc->nb_components; c++) {
> + int w1 = c == 1 || c == 2 ? w>>desc->log2_chroma_w : w;
> + int h1 = c == 1 || c == 2 ? h>>desc->log2_chroma_h : h;
> +
> + for (i = 0; i < h1; i++) {
> + av_read_image_line(line1,
> + ref_data,
> + linesizes,
> + desc,
> + 0, i, c, w1, 0);
> + av_read_image_line(line2,
> + data,
> + linesizes,
> + desc,
> + 0, i, c, w1, 0);
> + for (j = 0; j < w1; j++)
> + mse[c] += pow2(line1[j] - line2[j]);
> + }
> + mse[c] /= w1*h1;
> + }
> +}
> +
> +
> +static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
> +{
> + PSNRContext *psnr = ctx->priv;
> +
> + psnr->mse = psnr->nb_frames = 0;
> + psnr->min_mse = psnr->max_mse = -1.0;
> + psnr->picref = NULL;
> + psnr->line1 = psnr->line2 = NULL;
> +
> + if (args != NULL && strlen(args) > 0) {
> + psnr->vstats_file = fopen(args, "w");
> + if (!psnr->vstats_file) {
> + av_log(ctx, AV_LOG_ERROR,
> + "Could not open stats file %s: %s\n", args, strerror(errno));
> + return AVERROR(EINVAL);
> + }
> + }
> +
> + return 0;
> +}
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> + PSNRContext *psnr = ctx->priv;
> +
> + av_log(ctx, AV_LOG_INFO, "PSNR average:%0.2fdB min:%0.2fdB max:%0.2fdB\n",
> + get_psnr(psnr->mse, psnr->nb_frames, psnr->average_max),
> + get_psnr(psnr->max_mse, 1, psnr->average_max),
> + get_psnr(psnr->min_mse, 1, psnr->average_max));
> +
> + if (psnr->picref) {
> + avfilter_unref_buffer(psnr->picref);
> + psnr->picref = NULL;
> + }
> +
> + av_freep(&psnr->line1);
> + av_freep(&psnr->line2);
> +
> + if (psnr->vstats_file)
> + fclose(psnr->vstats_file);
> +}
> +
> +static int config_input_ref(AVFilterLink *inlink)
> +{
> + AVFilterContext *ctx = inlink->dst;
> + PSNRContext *psnr = ctx->priv;
> +
> + if (ctx->inputs[0]->w != ctx->inputs[1]->w ||
> + ctx->inputs[0]->h != ctx->inputs[1]->h) {
> + av_log(ctx, AV_LOG_ERROR,
> + "Width and/or height of input videos are different, could not calculate PSNR\n");
> + return AVERROR(EINVAL);
> + }
> + if (ctx->inputs[0]->format != ctx->inputs[1]->format) {
> + av_log(ctx, AV_LOG_ERROR,
> + "Input filters have different pixel formats, could not calculate PSNR\n");
> + return AVERROR(EINVAL);
> + }
> +
> + if (!(psnr->line1 = av_malloc(sizeof(*psnr->line1) * inlink->w)) ||
> + !(psnr->line2 = av_malloc(sizeof(*psnr->line2) * inlink->w)))
> + return AVERROR(ENOMEM);
> +
> + switch (inlink->format) {
> + case PIX_FMT_YUV410P:
> + case PIX_FMT_YUV411P:
> + case PIX_FMT_YUV420P:
> + case PIX_FMT_YUV422P:
> + case PIX_FMT_YUV440P:
> + case PIX_FMT_YUV444P:
> + case PIX_FMT_YUVA420P:
> + psnr->max[0] = 235;
psnr->max[3] = 255;
At least I suppose this is the max alpha value in YUVA420P (yes, I
forgot this in the lut filter).
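To illustrate why the alpha maximum matters: the patch averages the per-component maxima into average_max, so a missing max[3] would skew the MAX used for the overall PSNR of YUVA420P. A minimal sketch follows; the helper name is hypothetical, and the "unset" case assumes max[3] would read as 0.

```c
#include <assert.h>

/* Hypothetical helper: integer average of the per-component maxima,
 * mirroring how the patch derives average_max. */
static int average_of_max(const int max[4], int nb_components)
{
    int i, sum = 0;

    assert(nb_components > 0 && nb_components <= 4);
    for (i = 0; i < nb_components; i++)
        sum += max[i];
    return sum / nb_components;   /* integer division, as in the patch */
}
```

For YUVA420P with maxima {235, 240, 240, 255} this gives 970/4 = 242, whereas with max[3] left at 0 it would give 715/4 = 178.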
> + psnr->max[1] = psnr->max[2] = 240;
> + break;
> + default:
> + psnr->max[0] = psnr->max[1] = psnr->max[2] = psnr->max[3] = 255;
> + }
> +
> + psnr->is_yuv = psnr->is_rgb = 0;
Nit++: maybe this can be initialized in init.
> + if (pix_fmt_is_in(inlink->format, yuv_pix_fmts)) psnr->is_yuv = 1;
> + else if (pix_fmt_is_in(inlink->format, rgb_pix_fmts)) psnr->is_rgb = 1;
> +
> + for(int j = 0; j < av_pix_fmt_descriptors[inlink->format].nb_components; j++)
> + psnr->average_max += psnr->max[j];
> + psnr->average_max /= av_pix_fmt_descriptors[inlink->format].nb_components;
> +
> + return 0;
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> + avfilter_set_common_formats(ctx, avfilter_make_format_list(all_pix_fmts));
> + return 0;
> +}
> +
> +static void start_frame(AVFilterLink *inlink, AVFilterBufferRef *inpicref)
> +{
> + AVFilterBufferRef *outpicref = avfilter_ref_buffer(inpicref, ~0);
> + AVFilterContext *ctx = inlink->dst;
> + PSNRContext *psnr = ctx->priv;
> +
> + inlink->dst->outputs[0]->out_buf = outpicref;
> + outpicref->pts = av_rescale_q(inpicref->pts,
> + ctx->inputs [0]->time_base,
> + ctx->outputs[0]->time_base);
> +
> + if (psnr->picref) {
> + avfilter_unref_buffer(psnr->picref);
> + psnr->picref = NULL;
> + }
> + avfilter_request_frame(ctx->inputs[1]);
> +
> + avfilter_start_frame(inlink->dst->outputs[0], outpicref);
> +}
> +
> +static void start_frame_ref(AVFilterLink *inlink, AVFilterBufferRef *inpicref)
> +{
> + AVFilterContext *ctx = inlink->dst;
> + PSNRContext *psnr = ctx->priv;
> +
> + psnr->picref = inpicref;
> + psnr->picref->pts = av_rescale_q(inpicref->pts,
> + ctx->inputs [1]->time_base,
> + ctx->outputs[0]->time_base);
> +}
> +
> +static void end_frame(AVFilterLink *inlink)
> +{
> + AVFilterContext *ctx = inlink->dst;
> + PSNRContext *psnr = ctx->priv;
> + AVFilterLink *outlink = ctx->outputs[0];
> + AVFilterBufferRef *outpic = outlink->out_buf;
> + AVFilterBufferRef *ref = psnr->picref;
> + double mse[4];
> + double mse_t = 0;
> + int j;
> +
> + if (psnr->picref) {
> + compute_images_mse((const uint8_t **)outpic->data, (const uint8_t **)ref->data,
> + outpic->linesize, outpic->video->w, outpic->video->h,
> + &av_pix_fmt_descriptors[inlink->format], mse,
> + psnr->line1, psnr->line2);
> +
> + for (j = 0; j < av_pix_fmt_descriptors[inlink->format].nb_components; j++)
> + mse_t += mse[j];
> + mse_t /= av_pix_fmt_descriptors[inlink->format].nb_components;
> +
> + if (psnr->min_mse == -1) {
> + psnr->min_mse = mse_t;
> + psnr->max_mse = mse_t;
> + }
> + if (psnr->min_mse > mse_t)
> + psnr->min_mse = mse_t;
> + if (psnr->max_mse < mse_t)
> + psnr->max_mse = mse_t;
> +
> + psnr->mse += mse_t;
> + psnr->nb_frames++;
> +
> + if (psnr->vstats_file) {
> + if(psnr->is_yuv)
> + fprintf(psnr->vstats_file,
> + "Frame:%d Y:%0.2fdB Cb:%0.2fdB Cr:%0.2fdB PSNR:%0.2fdB\n",
> + psnr->nb_frames,
> + get_psnr(mse[0], 1, psnr->max[0]),
> + get_psnr(mse[1], 1, psnr->max[1]),
> + get_psnr(mse[2], 1, psnr->max[2]),
> + get_psnr(mse_t, 1, psnr->average_max));
> + if(psnr->is_rgb) {
> + fprintf(psnr->vstats_file,
> + "Frame:%d R:%0.2fdB G:%0.2fdB B:%0.2fdB ",
> + psnr->nb_frames,
> + get_psnr(mse[0], 1, psnr->max[0]),
> + get_psnr(mse[1], 1, psnr->max[1]),
> + get_psnr(mse[2], 1, psnr->max[2]));
> + if(av_pix_fmt_descriptors[inlink->format].nb_components > 3)
> + fprintf(psnr->vstats_file,
> + "A:%0.2fdB ",
> + get_psnr(mse[3], 1, psnr->max[3]));
This can be factorized:

    const char *comps[4];
    int i, c;

    comps[0] = psnr->is_yuv ? "Y"  : "R";
    comps[1] = psnr->is_yuv ? "Cb" : "G";
    comps[2] = psnr->is_yuv ? "Cr" : "B";
    comps[3] = "A";
    fprintf(psnr->vstats_file, "Frame:%d ", psnr->nb_frames);
    for (i = 0; i < av_pix_fmt_descriptors[inlink->format].nb_components; i++) {
        c = psnr->is_rgb ? psnr->rgba_map[i] : i;
        fprintf(psnr->vstats_file, "%s:%0.2fdB ", comps[c], get_psnr(mse[c], 1, psnr->max[c]));
    }

Note that you need the rgba_map, or the RGBA components will not be
correctly mapped.
Sorry for the slow reply, and thanks for the nice work.
--
FFmpeg = Formidable Frightening Merciless Puritan Energized Gorilla