[FFmpeg-devel] [PATCH v2 2/3] avcodec/h274: add film grain synthesis routine

James Almer jamrial at gmail.com
Wed Aug 18 18:41:25 EEST 2021


On 8/17/2021 4:25 PM, Niklas Haas wrote:
> From: Niklas Haas <git at haasn.dev>
> 
> This could arguably also be a vf, but I decided to put it here since
> decoders are technically required to apply film grain during the output
> step, and I would rather avoid requiring users to insert the
> correct film grain synthesis filter on their own.
> 
> The code, while in C, is written in a way that unrolls/vectorizes fairly
> well under -O3, and is reasonably cache friendly. On my CPU, a single
> thread pushes about 400 FPS at 1080p.
> 
> Apart from hand-written assembly, one possible avenue of improvement
> would be to change the access order to compute the grain row-by-row
> rather than in 8x8 blocks. This requires some redundant PRNG calls, but
> would make the algorithm more cache-oblivious.
> 
> The implementation has been written to the wording of SMPTE RDD 5-2006
> as faithfully as I can manage. However, apart from passing a visual
> inspection, no guarantee of correctness can be made due to the lack of
> any publicly available reference implementation against which to
> compare it.
> 
> Signed-off-by: Niklas Haas <git at haasn.dev>
> ---
>   libavcodec/Makefile |   1 +
>   libavcodec/h274.c   | 811 ++++++++++++++++++++++++++++++++++++++++++++
>   libavcodec/h274.h   |  52 +++
>   3 files changed, 864 insertions(+)
>   create mode 100644 libavcodec/h274.c
>   create mode 100644 libavcodec/h274.h
> 
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index 9a6adb9903..21739b4064 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -42,6 +42,7 @@ OBJS = ac3_parser.o                                                     \
>          dirac.o                                                          \
>          dv_profile.o                                                     \
>          encode.o                                                         \
> +       h274.o                                                           \
>          imgconvert.o                                                     \
>          jni.o                                                            \
>          mathtables.o                                                     \
> diff --git a/libavcodec/h274.c b/libavcodec/h274.c
> new file mode 100644
> index 0000000000..0efc00ca1d
> --- /dev/null
> +++ b/libavcodec/h274.c
> @@ -0,0 +1,811 @@
> +/*
> + * H.274 film grain synthesis
> + * Copyright (c) 2021 Niklas Haas <ffmpeg at haasn.xyz>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * H.274 film grain synthesis.
> + * @author Niklas Haas <ffmpeg at haasn.xyz>
> + */
> +
> +#include "libavutil/avassert.h"
> +#include "libavutil/imgutils.h"
> +
> +#include "h274.h"
> +
> +// The code in this file has a lot of loops that vectorize very well, this is
> +// about a 40% speedup for no obvious downside.
> +#pragma GCC optimize("tree-vectorize")

Will this not break compilation with MSVC and other non-GCC compilers?

Also, tree vectorization is known to cause issues in old GCC versions, 
and even in recent ones. I don't know if this is worth the potential 
problems it could introduce, but I guess it can be done until someone 
writes SIMD.
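
If the pragma stays, one option might be to guard it so that only real GCC
ever sees it. The following is only a rough sketch and not something from the
patch: the H274_HAVE_TREE_VECTORIZE macro and the add_grain_row() loop are
made-up names for illustration, and the exact compiler checks are an
assumption (clang also defines __GNUC__, so it is excluded explicitly).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guard (not part of the patch): request tree vectorization
 * only when building with GCC itself. clang and ICC also define __GNUC__,
 * so they are filtered out; MSVC defines none of these and never reaches
 * the pragma.
 */
#if defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER)
#pragma GCC optimize("tree-vectorize")
#define H274_HAVE_TREE_VECTORIZE 1
#else
#define H274_HAVE_TREE_VECTORIZE 0
#endif

/*
 * Illustrative stand-in for the kind of loop the pragma targets:
 * add signed grain to 8-bit samples and clip to [0, 255].
 */
static void add_grain_row(uint8_t *dst, const uint8_t *src,
                          const int8_t *grain, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = src[i] + grain[i];
        dst[i] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
    }
}
```

This would only address the portability part. It does nothing about
miscompilations caused by the vectorizer itself, so a minimum GCC version
check might still be needed on top.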

