[FFmpeg-devel] [PATCH] avformat/webpdec: WebP demuxer implementation
Andreas Rheinhardt
andreas.rheinhardt at outlook.com
Wed Sep 8 22:10:42 EEST 2021
yakoyoku at gmail.com:
> From: Martin Reboredo <yakoyoku at gmail.com>
>
> FFmpeg has the ability to mux encoded WebP packets, but it cannot demux the format.
> The purpose of this patch is to add a way to extract pictures from a WebP stream.
> Any other side data processing (mainly ICC profiles) is left for later work.
> Although we already have a demuxer in `image2`, it doesn't support animated frames as this patch does.
>
> The WebP format is based on RIFF, and due to the characteristics of the latter, I've taken advantage of chunking for processing purposes.
> Packet reading is done by taking chunks in a specific order. It starts by splitting off the `RIFF`/`WEBP` header, then reads one of the three chunks
> `VP8 ` (lossy)/`VP8L` (lossless)/`VP8X` (extended format). In the case of a `VP8X` chunk we check the relevant flags. We then grab the
> `VP8 `/`ALPH` (alpha frame) + `VP8 `/`VP8L` chunks accordingly. If the container specifies that it is an animated image we take `ANIM` for the animation
> parameters and the `ANMF` animation frames, each of which contains an image chunk (`VP8 `/`ALPH` + `VP8 `/`VP8L`). If an unknown
> chunk is found, we simply ignore it.
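For readers following the description above, here is a minimal sketch of
walking RIFF chunks with the even-byte padding rule. It assumes an
in-memory buffer rather than the AVIOContext the patch uses; `Chunk`,
`rl32` and `next_chunk` are hypothetical names that do not appear in the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: one RIFF chunk located inside a memory buffer. */
typedef struct Chunk {
    char     tag[5];      /* FourCC plus terminating NUL */
    uint32_t size;        /* payload size as stored in the chunk header */
    size_t   payload_off; /* offset of the payload within the buffer */
    size_t   next_off;    /* offset of the following chunk header */
} Chunk;

/* Read a little-endian 32-bit value. */
static uint32_t rl32(const uint8_t *p)
{
    return p[0] | (p[1] << 8) | (p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Parse the chunk header at offset `off`.
 * Returns 0 on success, -1 if the header or the padded payload
 * does not fit in the buffer.
 */
static int next_chunk(const uint8_t *buf, size_t len, size_t off, Chunk *c)
{
    uint64_t padded;

    assert(buf && c);
    if (off > len || len - off < 8)
        return -1;

    memcpy(c->tag, buf + off, 4);
    c->tag[4] = '\0';
    c->size = rl32(buf + off + 4);

    /* RIFF payloads are padded to an even number of bytes. */
    padded = (uint64_t)c->size + (c->size & 1);
    if (padded > len - off - 8)
        return -1;

    c->payload_off = off + 8;
    c->next_off    = off + 8 + (size_t)padded;
    return 0;
}
```

In a real WebP file the first chunk starts at offset 12, after the
`RIFF` tag, the file size and the `WEBP` tag; a caller would loop on
`next_off` until `next_chunk` fails and skip tags it does not know.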
>
> Tested by remuxing WebP images (using `ffmpeg -i testa.webp -codec:v copy testb.webp`), viewed the images in my browser and compared the checksums.
>
> Mostly followed the WebP container specification [1] for the implementation; the VP8 bitstream [2] and WebP lossless [3] specs were used too.
>
> Partially fixes #4907.
>
> [1]: https://developers.google.com/speed/webp/docs/riff_container
> [2]: https://datatracker.ietf.org/doc/html/rfc6386
> [3]: https://developers.google.com/speed/webp/docs/webp_lossless_bitstream_specification
>
> Signed-off-by: Martin Reboredo <yakoyoku at gmail.com>
> ---
> .gitignore | 2 +
> MAINTAINERS | 1 +
> libavformat/Makefile | 1 +
> libavformat/allformats.c | 1 +
> libavformat/riff.c | 1 +
> libavformat/webpdec.c | 333 +++++++++++++++++++++++++++++++++++++++
> libavformat/webpenc.c | 13 +-
> 7 files changed, 349 insertions(+), 3 deletions(-)
> create mode 100644 libavformat/webpdec.c
>
> diff --git a/.gitignore b/.gitignore
> index 9ed24b542e..0e8334d227 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -23,6 +23,7 @@
> *.ptx.c
> *.ptx.gz
> *_g
> +compile_commands.json
> \#*
> .\#*
> /.config
> @@ -34,6 +35,7 @@
> /config.h
> /coverage.info
> /avversion.h
> +/.cache/
This should not be part of this patch; I don't even know whether it
should be committed at all. One can make git ignore files by adding them
to .git/info/exclude, which is not copied over by git clone. Maybe you
should add this to your exclude file?
> /lcov/
> /src
> /mapfile
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dcac46003e..f2d8f5eb17 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -505,6 +505,7 @@ Muxers/Demuxers:
> wav.c Michael Niedermayer
> wc3movie.c Mike Melanson
> webm dash (matroskaenc.c) Vignesh Venkatasubramanian
> + webp*.c Martin Reboredo
> webvtt* Matthew J Heaney
> westwood.c Mike Melanson
> wtv.c Peter Ross
> diff --git a/libavformat/Makefile b/libavformat/Makefile
> index f7e47563da..aec2833c52 100644
> --- a/libavformat/Makefile
> +++ b/libavformat/Makefile
> @@ -581,6 +581,7 @@ OBJS-$(CONFIG_WEBM_MUXER) += matroskaenc.o matroska.o \
> OBJS-$(CONFIG_WEBM_DASH_MANIFEST_MUXER) += webmdashenc.o
> OBJS-$(CONFIG_WEBM_CHUNK_MUXER) += webm_chunk.o
> OBJS-$(CONFIG_WEBP_MUXER) += webpenc.o
> +OBJS-$(CONFIG_WEBP_DEMUXER) += webpdec.o
> OBJS-$(CONFIG_WEBVTT_DEMUXER) += webvttdec.o subtitles.o
> OBJS-$(CONFIG_WEBVTT_MUXER) += webvttenc.o
> OBJS-$(CONFIG_WSAUD_DEMUXER) += westwood_aud.o
> diff --git a/libavformat/allformats.c b/libavformat/allformats.c
> index 5471f7c16f..55f3c9a956 100644
> --- a/libavformat/allformats.c
> +++ b/libavformat/allformats.c
> @@ -473,6 +473,7 @@ extern const AVOutputFormat ff_webm_muxer;
> extern const AVInputFormat ff_webm_dash_manifest_demuxer;
> extern const AVOutputFormat ff_webm_dash_manifest_muxer;
> extern const AVOutputFormat ff_webm_chunk_muxer;
> +extern const AVInputFormat ff_webp_demuxer;
> extern const AVOutputFormat ff_webp_muxer;
> extern const AVInputFormat ff_webvtt_demuxer;
> extern const AVOutputFormat ff_webvtt_muxer;
> diff --git a/libavformat/riff.c b/libavformat/riff.c
> index 27a9706510..9bd940ba52 100644
> --- a/libavformat/riff.c
> +++ b/libavformat/riff.c
> @@ -321,6 +321,7 @@ const AVCodecTag ff_codec_bmp_tags[] = {
> { AV_CODEC_ID_VP7, MKTAG('V', 'P', '7', '1') },
> { AV_CODEC_ID_VP8, MKTAG('V', 'P', '8', '0') },
> { AV_CODEC_ID_VP9, MKTAG('V', 'P', '9', '0') },
> + { AV_CODEC_ID_WEBP, MKTAG('W', 'E', 'B', 'P') },
> { AV_CODEC_ID_ASV1, MKTAG('A', 'S', 'V', '1') },
> { AV_CODEC_ID_ASV2, MKTAG('A', 'S', 'V', '2') },
> { AV_CODEC_ID_VCR1, MKTAG('V', 'C', 'R', '1') },
> diff --git a/libavformat/webpdec.c b/libavformat/webpdec.c
> new file mode 100644
> index 0000000000..d2d95aea4e
> --- /dev/null
> +++ b/libavformat/webpdec.c
> @@ -0,0 +1,333 @@
> +/*
> + * webp demuxer
> + * Copyright (c) 2021 Martin Reboredo <yakoyoku at gmail.com>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/intreadwrite.h"
> +#include "libavutil/mathematics.h"
> +#include "libavutil/opt.h"
> +#include "avformat.h"
> +#include "internal.h"
> +
> +typedef struct WebpDemuxContext {
> + AVClass *class;
> + int width;
> + int height;
> + AVPacket last_pkt;
> + int size;
> + int loop;
> + int read_webp_header;
> + int using_webp_anim_decoder;
> + int vp8x;
> + int lossless;
> + int alpha;
> + int icc;
> +} WebpDemuxContext;
> +
> +static int webpdec_read_probe(const AVProbeData * p)
> +{
> + if (AV_RL32(p->buf) != AV_RL32("RIFF"))
> + return 0;
> +
> + if (AV_RL32(&p->buf[8]) != AV_RL32("WEBP"))
> + return 0;
> +
> + return AVPROBE_SCORE_MAX;
> +}
> +
> +static int parse_animation_frame_duration(AVFormatContext * s, AVPacket * pkt)
> +{
> + pkt->duration = av_rescale_q(AV_RL24(pkt->data + 20),
> + (AVRational) { 1, 1000 },
> + s->streams[0]->time_base);
> +
> + return 0;
> +}
> +
> +static int parse_vp8x_chunk(AVFormatContext * s, AVPacket * pkt)
> +{
> + WebpDemuxContext *w = s->priv_data;
> + AVIOContext *pb = s->pb;
> + int bgcolor = 0xFFFFFFFF;
> + int cont = 1, anim_frame = 0, alpha_frame = 0;
> + int64_t ret = 0;
> +
> + s->packet_size = 0;
> +
> + while (cont && ret >= 0) {
> + int skip = 0, rewind = 1;
> + int fourcc = avio_rl32(pb);
> + int size = avio_rl32(pb);
> + int padded_size = size + (size & 1);
> + int chunk_size = padded_size + 8;
> + s->packet_size += chunk_size;
> +
> + if (padded_size == 0)
> + return AVERROR_EOF;
> +
> + switch (fourcc) {
> + case MKTAG('V', 'P', '8', 'X'):
> + return AVERROR_INVALIDDATA;
> + /* case MKTAG('I', 'C', 'C', 'P'):
> + avio_read(pb, w->iccp_data, padded_size); */
> + case MKTAG('A', 'L', 'P', 'H'):
> + if (!w->alpha || alpha_frame == 1)
> + return AVERROR_INVALIDDATA;
> + if (w->using_webp_anim_decoder && anim_frame == 0)
> + return AVERROR_INVALIDDATA;
> +
> + alpha_frame = 1;
> + break;
> + case MKTAG('V', 'P', '8', 'L'):
> + alpha_frame = 1;
> + case MKTAG('V', 'P', '8', ' '):
> + if (w->alpha && alpha_frame == 0)
> + return AVERROR_INVALIDDATA;
> + if (w->using_webp_anim_decoder && anim_frame == 0)
> + return AVERROR_INVALIDDATA;
> +
> + cont = 0;
> + break;
> + case MKTAG('A', 'N', 'I', 'M'):
> + if (w->loop == -1) {
> + bgcolor = avio_rl32(pb);
> + w->loop = avio_rl16(pb);
> +
> + ret = avio_seek(pb, -14, SEEK_CUR);
> + if (ret < 0)
> + return ret;
> +
> + ret = av_get_packet(pb, pkt, s->packet_size);
> + if (ret < 0)
> + return ret;
> +
> + return 0;
> + }
> + cont = 0;
> + break;
> + case MKTAG('A', 'N', 'M', 'F'):
> + if (!w->using_webp_anim_decoder || anim_frame == 1)
> + return AVERROR_INVALIDDATA;
> +
> + ret = avio_seek(pb, -8, SEEK_CUR);
> + if (ret < 0)
> + return ret;
> +
> + ret = av_get_packet(pb, pkt, s->packet_size);
> + if (ret < 0)
> + return ret;
> +
> + ret = parse_animation_frame_duration(s, pkt);
> + if (ret < 0)
> + return ret;
> +
> + anim_frame = 1;
> + rewind = 0;
> + return 0;
> + default:
> + s->packet_size -= chunk_size;
> + skip = 1;
> + rewind = 0;
> + break;
> + }
> +
> + if (skip) {
> + ret = avio_skip(pb, padded_size);
> + }
> + if (rewind) {
> + ret = avio_seek(pb, -8, SEEK_CUR);
> + if (ret < 0)
> + return ret;
> + ret = av_append_packet(pb, pkt, chunk_size);
> + }
> + }
> +
> + return ret;
> +}
> +
> +static int parse_header(AVFormatContext * s)
> +{
> + WebpDemuxContext *w = s->priv_data;
> + AVIOContext *pb = s->pb;
> + int size;
> + unsigned int flags = 0;
> + int ret = 0;
> +
> + if (avio_rl32(pb) != AV_RL32("RIFF"))
> + return AVERROR_INVALIDDATA;
> + w->size = avio_rl32(pb) + 8;
> + if (avio_rl32(pb) != AV_RL32("WEBP"))
> + return AVERROR_INVALIDDATA;
> +
> + if (avio_rl24(pb) != AV_RL24("VP8"))
> + return AVERROR_INVALIDDATA;
> + switch (avio_r8(pb)) {
> + case 'X':
> + w->vp8x = 1;
> + break;
> + case 'L':
> + w->lossless = 1;
> + case ' ':
> + break;
> + default:
> + return AVERROR_INVALIDDATA;
> + }
> + size = avio_rl32(pb);
> + if (w->vp8x) {
> + flags = avio_r8(pb);
> +
> + if (flags & 0x02)
> + w->using_webp_anim_decoder = 1;
> + if (flags & 0x10)
> + w->alpha = 1;
> + if (flags & 0x20)
> + w->icc = 1;
> +
> + ret = avio_skip(pb, 3);
> + if (ret < 0)
> + return ret;
> +
> + w->width = avio_rl24(pb) + 1;
> + w->height = avio_rl24(pb) + 1;
> +
> + ret = avio_seek(pb, -30, SEEK_CUR);
> + } else if (w->lossless) {
> + avio_r8(pb);
> + flags = avio_rl32(pb);
> + w->width = (flags & 0x3FFF) + 1;
> + w->height = ((flags >> 14) & 0x3FFF) + 1;
> + w->alpha = (flags >> 28) & 0x01;
> +
> + ret = avio_seek(pb, -25, SEEK_CUR);
> + } else {
> + ret = avio_skip(pb, 6);
> + if (ret < 0)
> + return ret;
> +
> + w->width = (avio_rl16(pb) & 0x3FFF);
> + w->height = (avio_rl16(pb) & 0x3FFF);
> +
> + ret = avio_seek(pb, -30, SEEK_CUR);
> + }
> +
> + return ret;
> +}
> +
> +static int webpdec_read_header(AVFormatContext * s)
> +{
> + WebpDemuxContext *w = s->priv_data;
> + AVStream *st;
> + int ret;
> +
> + w->width = -1;
> + w->height = -1;
> + w->loop = -1;
> + w->read_webp_header = 0;
> +
> + ret = parse_header(s);
> + if (ret < 0)
> + return ret;
> +
> + st = avformat_new_stream(s, NULL);
> + if (!st)
> + return AVERROR(ENOMEM);
> +
> + st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
> + st->codecpar->codec_id = AV_CODEC_ID_WEBP;
> + st->codecpar->width = w->width;
> + st->codecpar->height = w->height;
> + st->codecpar->format = w->alpha ? AV_PIX_FMT_YUVA420P : AV_PIX_FMT_YUV420P;
> +
> + st->start_time = 0;
> +
> + avpriv_set_pts_info(st, 24, 1, 1000);
> +
> + return 0;
> +}
> +
> +static int webpdec_read_packet(AVFormatContext * s, AVPacket * pkt)
> +{
> + WebpDemuxContext *w = s->priv_data;
> + AVIOContext *pb = s->pb;
> + int ret;
> +
> + ret = avio_feof(pb);
> + if (ret < 0)
> + return ret;
> + else if (ret > 0)
> + return AVERROR_EOF;
> +
> + if (!w->read_webp_header) {
> + s->packet_size = w->vp8x ? 30 : 12;
> +
> + ret = av_get_packet(pb, pkt, s->packet_size);
> + if (ret < 0)
> + return ret;
> +
> + w->read_webp_header = 1;
> +
> + return 0;
> + }
> +
> + if (!w->using_webp_anim_decoder)
> + pkt->duration =
> + av_rescale_q(33, (AVRational) { 1, 1000 },
> + s->streams[0]->time_base);
> +
> + if (w->vp8x) {
> + ret = parse_vp8x_chunk(s, pkt);
> + if (ret < 0)
> + return ret;
> + } else {
> + int fourcc = avio_rl32(pb);
> + int size = avio_rl32(pb) + 8;
> + size = size + (size & 1);
> + if (fourcc != AV_RL32("VP8 ") && fourcc != AV_RL32("VP8L"))
> + return AVERROR_INVALIDDATA;
> + ret = avio_seek(pb, -8, SEEK_CUR);
> + if (ret < 0)
> + return ret;
> + ret = av_get_packet(pb, pkt, size);
> + if (ret < 0)
> + return ret;
> + }
> +
> + pkt->stream_index = 0;
> +
> + return 0;
> +}
> +
> +static const AVClass webp_demuxer_class = {
> + .class_name = "WebP demuxer",
> + .item_name = av_default_item_name,
> + .version = LIBAVUTIL_VERSION_INT,
> +};
An AVClass is only necessary when a demuxer takes options, so it is
unnecessary in this case.
> +
> +const AVInputFormat ff_webp_demuxer = {
> + .name = "webp",
> + .long_name = NULL_IF_CONFIG_SMALL("WebP"),
> + .extensions = "webp",
> + .mime_type = "image/webp",
> + .priv_data_size = sizeof(WebpDemuxContext),
> + .read_probe = webpdec_read_probe,
> + .read_header = webpdec_read_header,
> + .read_packet = webpdec_read_packet,
> + .priv_class = &webp_demuxer_class,
> + .flags = AVFMT_VARIABLE_FPS,
Align on '='
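To illustrate the alignment meant here, a small sketch of a designated
initializer with the '=' signs lined up in one column. It uses a
stand-in struct so that it compiles without the FFmpeg headers;
`DemoInputFormat`, `DEMO_VARIABLE_FPS` and `demo_webp_demuxer` are
hypothetical names, and the function pointers from the patch are left
out.

```c
#include <stddef.h>

/* Hypothetical stand-in mirroring a few AVInputFormat fields. */
typedef struct DemoInputFormat {
    const char *name;
    const char *long_name;
    const char *extensions;
    const char *mime_type;
    int         priv_data_size;
    int         flags;
} DemoInputFormat;

/* Placeholder for this sketch; the real AVFMT_VARIABLE_FPS comes from avformat.h. */
#define DEMO_VARIABLE_FPS 0x0400

/* Every '=' sits in the same column, padded after the longest field name. */
static const DemoInputFormat demo_webp_demuxer = {
    .name           = "webp",
    .long_name      = "WebP",
    .extensions     = "webp",
    .mime_type      = "image/webp",
    .priv_data_size = 0,
    .flags          = DEMO_VARIABLE_FPS,
};
```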
> +};
> diff --git a/libavformat/webpenc.c b/libavformat/webpenc.c
> index 9599fe7b85..47c28437ff 100644
> --- a/libavformat/webpenc.c
> +++ b/libavformat/webpenc.c
> @@ -55,13 +55,18 @@ static int is_animated_webp_packet(AVPacket *pkt)
> {
> int skip = 0;
> unsigned flags = 0;
> + int fourcc = AV_RL32(pkt->data);
>
> if (pkt->size < 4)
> return AVERROR_INVALIDDATA;
> - if (AV_RL32(pkt->data) == AV_RL32("RIFF"))
> + if (fourcc == AV_RL32("RIFF"))
> skip = 12;
> + else if (fourcc == AV_RL32("ANIM"))
> + return 1;
> + else if (fourcc == AV_RL32("ANMF"))
> + return 1;
> // Safe to do this as a valid WebP bitstream is >=30 bytes.
> - if (pkt->size < skip + 4)
> + if (pkt->size < skip + 4 && pkt->size != 12)
> return AVERROR_INVALIDDATA;
> if (AV_RL32(pkt->data + skip) == AV_RL32("VP8X")) {
> flags |= pkt->data[skip + 4 + 4];
> @@ -143,6 +148,7 @@ static int flush(AVFormatContext *s, int trailer, int64_t pts)
> static int webp_write_packet(AVFormatContext *s, AVPacket *pkt)
> {
> WebpContext *w = s->priv_data;
> + int fourcc = AV_RL32(pkt->data);
> int ret;
>
> if (!pkt->size)
> @@ -161,7 +167,8 @@ static int webp_write_packet(AVFormatContext *s, AVPacket *pkt)
> return ret;
> av_packet_ref(&w->last_pkt, pkt);
> }
> - ++w->frame_count;
> + if (fourcc == AV_RL32("ANMF") || fourcc == AV_RL32("VP8 ") || fourcc == AV_RL32("VP8L"))
> + ++w->frame_count;
>
> return 0;
> }
>
The changes to webpenc should be in a separate patch (and they also need
an explanation).
- Andreas