[FFmpeg-devel] [PATCH v5 1/1] avformat: Add IPFS protocol support.

Mark Gaiser markg85 at gmail.com
Sat Feb 12 19:57:31 EET 2022


On Sat, Feb 12, 2022 at 1:19 PM Tomas Härdin <tjoppen at acc.umu.se> wrote:

> On Thu, 2022-02-10 at 02:13 +0100, Mark Gaiser wrote:
> > This patch adds support for:
> > - ffplay ipfs://<cid>
> > - ffplay ipns://<cid>
> >
> > IPFS data can be played from so called "ipfs gateways".
> > A gateway is essentially a webserver that gives access to the
> > distributed IPFS network.
> >
> > This protocol support (ipfs and ipns) therefore translates
> > ipfs:// and ipns:// to a http:// url. This resulting url is
> > then handled by the http protocol. It could also be https
> > depending on the gateway provided.
> >
> > To use this protocol, a gateway must be provided.
> > If you do nothing it will try to find it in your
> > $HOME/.ipfs/gateway file. The ways to set it manually are:
> > 1. Define a -gateway <url> to the gateway.
> > 2. Define $IPFS_GATEWAY with the full http link to the gateway.
> > 3. Define $IPFS_PATH and point it to the IPFS data path.
> > 4. Have IPFS running in your local user folder (under $HOME/.ipfs).
> >
> > Signed-off-by: Mark Gaiser <markg85 at gmail.com>
> > ---
> >  configure                 |   2 +
> >  doc/protocols.texi        |  30 ++++
> >  libavformat/Makefile      |   2 +
> >  libavformat/ipfsgateway.c | 326
> > ++++++++++++++++++++++++++++++++++++++
> >  libavformat/protocols.c   |   2 +
> >  5 files changed, 362 insertions(+)
> >  create mode 100644 libavformat/ipfsgateway.c
> >
> > diff --git a/configure b/configure
> > index 5b19a35f59..6ff09e7974 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3585,6 +3585,8 @@ udp_protocol_select="network"
> >  udplite_protocol_select="network"
> >  unix_protocol_deps="sys_un_h"
> >  unix_protocol_select="network"
> > +ipfs_protocol_select="https_protocol"
> > +ipns_protocol_select="https_protocol"
> >
> >  # external library protocols
> >  libamqp_protocol_deps="librabbitmq"
> > diff --git a/doc/protocols.texi b/doc/protocols.texi
> > index d207df0b52..7c9c0a4808 100644
> > --- a/doc/protocols.texi
> > +++ b/doc/protocols.texi
> > @@ -2025,5 +2025,35 @@ decoding errors.
> >
> >  @end table
> >
> > + at section ipfs
> > +
> > +InterPlanetary File System (IPFS) protocol support. One can access
> > files stored
> > +on the IPFS network through so called gateways. Those are http(s)
> > endpoints.
> > +This protocol wraps the IPFS native protocols (ipfs:// and ipns://)
> > to be send
> > +to such a gateway. Users can (and should) host their own node which
> > means this
> > +protocol will use your local machine gateway to access files on the
> > IPFS network.
> > +
> > +If a user doesn't have a node of their own then the public gateway
> > dweb.link is
> > +used by default.
> > +
> > +You can use this protocol in 2 ways. Using IPFS:
> > + at example
> > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> > + at end example
> > +
> > +Or the IPNS protocol (IPNS is mutable IPFS):
> > + at example
> > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> > + at end example
> > +
> > +You can also change the gateway to be used:
> > +
> > + at table @option
> > +
> > + at item gateway
> > +Defines the gateway to use. When nothing is provided the protocol
> > will first try
> > +your local gateway. If that fails dweb.link will be used.
> > +
> > + at end table
> >
> >  @c man end PROTOCOLS
> > diff --git a/libavformat/Makefile b/libavformat/Makefile
> > index 3dc6a479cc..4edce8420f 100644
> > --- a/libavformat/Makefile
> > +++ b/libavformat/Makefile
> > @@ -656,6 +656,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL)             +=
> > srtpproto.o srtp.o
> >  OBJS-$(CONFIG_SUBFILE_PROTOCOL)          += subfile.o
> >  OBJS-$(CONFIG_TEE_PROTOCOL)              += teeproto.o tee_common.o
> >  OBJS-$(CONFIG_TCP_PROTOCOL)              += tcp.o
> > +OBJS-$(CONFIG_IPFS_PROTOCOL)             += ipfsgateway.o
> > +OBJS-$(CONFIG_IPNS_PROTOCOL)             += ipfsgateway.o
> >  TLS-OBJS-$(CONFIG_GNUTLS)                += tls_gnutls.o
> >  TLS-OBJS-$(CONFIG_LIBTLS)                += tls_libtls.o
> >  TLS-OBJS-$(CONFIG_MBEDTLS)               += tls_mbedtls.o
> > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
> > new file mode 100644
> > index 0000000000..9ebced10c7
> > --- /dev/null
> > +++ b/libavformat/ipfsgateway.c
> > @@ -0,0 +1,326 @@
> > +/*
> > + * IPFS and IPNS protocol support through IPFS Gateway.
> > + * Copyright (c) 2022 Mark Gaiser
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later
> > version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> > 02110-1301 USA
> > + */
> > +
> > +#include "avformat.h"
> > +#include "libavutil/avassert.h"
> > +#include "libavutil/avstring.h"
> > +#include "libavutil/internal.h"
> > +#include "libavutil/opt.h"
> > +#include "libavutil/tree.h"
> > +#include <fcntl.h>
> > +#if HAVE_IO_H
> > +#include <io.h>
> > +#endif
> > +#if HAVE_UNISTD_H
> > +#include <unistd.h>
> > +#endif
> > +#include "os_support.h"
> > +#include "url.h"
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +typedef struct IPFSGatewayContext {
> > +    AVClass *class;
> > +    URLContext *inner;
> > +    char *gateway;
> > +} IPFSGatewayContext;
> > +
> > +// A best-effort way to find the IPFS gateway.
> > +// Only the most appropiate gateway is set. It's not actually
> > requested
> > +// (http call) to prevent a potential slowdown in startup. A
> > potential timeout
> > +// is handled by the HTTP protocol.
> > +static int populate_ipfs_gateway(URLContext *h, char *gateway)
> > +{
> > +    char ipfs_full_data_folder[PATH_MAX];
> > +    char ipfs_gateway_file[PATH_MAX];
> > +    struct stat st;
> > +    int stat_ret = 0;
> > +    int ret = AVERROR(EINVAL);
> > +    FILE *gateway_file = NULL;
> > +    char gateway_file_data[PATH_MAX];
> > +
> > +    // We could already have a gateway (it would have been passed as
> > -gateway).
> > +    if (*gateway != '\0') {
> > +        ret = 1;
> > +        goto err;
> > +    }
>
> Why even call this function if this is the case?
>
> > +
> > +    // Test $IPFS_GATEWAY.
> > +    if (getenv("IPFS_GATEWAY") != NULL) {
> > +        snprintf(gateway, PATH_MAX, "%s", getenv("IPFS_GATEWAY"));
>
> Passing buffer length rather than assuming PATH_MAX would be better.
> That way it can be changed without this breaking.
>
> Or stick the buffer in IPFSGatewayContext and use sizeof()
>

Awesome! Didn't even consider the option yet. Will do.
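
A minimal sketch of the "pass the buffer length" suggestion, for illustration only. The helper name copy_gateway() and its return convention are hypothetical and not part of the patch; the point is just that the callee no longer assumes PATH_MAX.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the patch: copy a gateway string into a
 * caller-supplied buffer whose size is passed in explicitly.
 * Returns 1 on success, 0 if src is NULL, -1 if src does not fit. */
static int copy_gateway(char *dst, size_t dst_size, const char *src)
{
    int len;

    if (!src)
        return 0;

    /* snprintf returns the length it wanted to write (excluding the
     * terminating NUL), so a value >= dst_size means truncation. */
    len = snprintf(dst, dst_size, "%s", src);
    if (len < 0 || (size_t)len >= dst_size)
        return -1;

    return 1;
}
```

A caller would then write something like copy_gateway(gateway, sizeof(gateway), getenv("IPFS_GATEWAY")), where gateway is an array in the calling scope, so the buffer size can change without touching the helper.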

>
> > +        ret = 1;
> > +        goto err;
> > +    } else
> > +        av_log(h, AV_LOG_DEBUG, "$IPFS_GATEWAY is empty.\n");
> > +
> > +    // We need to know the IPFS folder to - eventually - read the
> > contents of
> > +    // the "gateway" file which would tell us the gateway to use.
> > +    if (getenv("IPFS_PATH") == NULL) {
> > +        av_log(h, AV_LOG_DEBUG, "$IPFS_PATH is empty.\n");
> > +
> > +        // Try via the home folder.
> > +        if (getenv("HOME") == NULL) {
> > +            av_log(h, AV_LOG_ERROR, "$HOME appears to be empty.\n");
> > +            ret = AVERROR(EINVAL);
> > +            goto err;
> > +        }
> > +
> > +        // Verify the composed path fits.
> > +        if (snprintf(ipfs_full_data_folder, PATH_MAX, "%s/.ipfs/",
>
> sizeof(ipfs_full_data_folder)
> This goes for most PATH_MAX in here
>

Yeah, I kind of thought using PATH_MAX everywhere would be consistent.
I'll flip all of those to sizeof(...).
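
A rough sketch of what the sizeof() variant could look like for the $HOME/.ipfs/ path. DATA_FOLDER_SIZE and data_folder_fits() are made up for this example (the patch uses PATH_MAX for the array size). Since snprintf() returns the length it would have written without the terminating NUL, the comparison here is ">=" against the buffer size.

```c
#include <stdio.h>

/* Hypothetical size for the example; the patch sizes the array with PATH_MAX. */
#define DATA_FOLDER_SIZE 64

/* Compose "<home>/.ipfs/" into a local array and report whether it fit.
 * Returns 1 if the whole path fits, 0 if it would have been truncated. */
static int data_folder_fits(const char *home)
{
    char ipfs_full_data_folder[DATA_FOLDER_SIZE];
    int len = snprintf(ipfs_full_data_folder, sizeof(ipfs_full_data_folder),
                       "%s/.ipfs/", home);

    /* A result equal to the buffer size already means the last character
     * was dropped to make room for the terminating NUL. */
    return len >= 0 && (size_t)len < sizeof(ipfs_full_data_folder);
}
```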

>
> > +                     getenv("HOME")) > PATH_MAX) {
> > +            av_log(h, AV_LOG_ERROR, "The IPFS data path exceeds the
> > max path length (%i)\n", PATH_MAX);
> > +            ret = AVERROR(EINVAL);
> > +            goto err;
> > +        }
> > +
> > +        // Stat the folder.
> > +        // It should exist in a default IPFS setup when run as local
> > user.
> > +#ifndef _WIN32
> > +        stat_ret = stat(ipfs_full_data_folder, &st);
> > +#else
> > +        stat_ret = win32_stat(ipfs_full_data_folder, &st);
> > +#endif
> > +        if (stat_ret < 0) {
> > +            av_log(h, AV_LOG_INFO, "Unable to find IPFS folder. We
> > tried:\n");
> > +            av_log(h, AV_LOG_INFO, "- $IPFS_PATH, which was
> > empty.\n");
> > +            av_log(h, AV_LOG_INFO, "- $HOME/.ipfs (full uri: %s)
> > which doesn't exist.\n", ipfs_full_data_folder);
> > +            ret = AVERROR(ENOENT);
> > +            goto err;
> > +        }
>
> Right. This makes more sense
>
> > +    } else
> > +        snprintf(ipfs_full_data_folder, PATH_MAX, "%s",
> > getenv("IPFS_PATH"));
> > +
> > +    // Copy the fully composed gateway path into ipfs_gateway_file.
> > +    if (snprintf(ipfs_gateway_file, PATH_MAX, "%sgateway",
> > +                 ipfs_full_data_folder) > PATH_MAX) {
> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file path exceeds
> > the max path length (%i)\n", PATH_MAX);
> > +        ret = AVERROR(ENOENT);
> > +        goto err;
> > +    }
> > +
> > +    // Get the contents of the gateway file.
> > +    gateway_file = av_fopen_utf8(ipfs_gateway_file, "r");
> > +    if (!gateway_file) {
> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri:
> > %s) doesn't exist. Is the gateway enabled?\n", ipfs_gateway_file);
> > +        ret = AVERROR(ENOENT);
> > +        goto err;
> > +    }
> > +
> > +    // Read a single line (fgets stops at new line mark).
> > +    fgets(gateway_file_data, PATH_MAX - 1, gateway_file);
> > +
> > +    // Replace the last char with \0
> > +    gateway_file_data[PATH_MAX - 1] = 0;
> > +
> > +    // Replace first occurence of end of line with \0
> > +    gateway_file_data[strcspn(gateway_file_data, "\r\n")] = 0;
>
> Won't this be just \n on Unix-like systems? Could be two lines, one
> with strcspn(.., "\n") and the other with "\r".
>

The current logic works on Linux.
I'll split it up regardless; it can't hurt and might catch some nasty edge
cases.
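
For reference, a small sketch of the single-call form. strcspn() returns the length of the leading part of the string that contains none of the characters in the second argument, so with "\r\n" it stops at whichever of the two appears first. The helper name strip_line_ending() is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n',
 * whichever comes first. Strings without either are left unchanged. */
static void strip_line_ending(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
}
```

Under that reading, "\n", "\r\n" and a missing line ending are all handled by the same call.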

>
> > +
> > +    // If strlen finds anything longer then 0 characters then we
> > have a
> > +    // potential gateway url.
> > +    if (strlen(gateway_file_data) < 1) {
> > +        av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri:
> > %s) appears to be empty. Is the gateway started?\n",
> > ipfs_gateway_file);
> > +        ret = AVERROR(EILSEQ);
> > +        goto err;
> > +    }
> > +
> > +    // At this point gateway_file_data contains at least something.
> > +    // Copy it into gateway.
> > +    if (snprintf(gateway, PATH_MAX, "%s", gateway_file_data) > 0) {
> > +        ret = 1;
> > +        goto err;
> > +    } else
> > +        av_log(h, AV_LOG_DEBUG, "Unknown error in the IPFS gateway
> > file.\n");
> > +
> > +err:
> > +    if (gateway_file)
> > +        fclose(gateway_file);
> > +
> > +    return ret;
> > +}
> > +
> > +// For now just makes sure that the gateway ends in url we expect.
> > +// Like http://localhost:8080/.
> > +// Explicitly with the traling slash.
> > +static int sanitize_ipfs_gateway(URLContext *h, char *gateway)
> > +{
> > +    const char *url_without_protocol;
> > +    int ret = 1;
> > +
> > +    // Test if the gateway starts with either http:// or https://
> > +    // The remainder is stored in url_without_protocol
> > +    if (av_stristart(gateway, "http://", &url_without_protocol) == 0
> > +        && av_stristart(gateway, "https://", &url_without_protocol)
> > == 0) {
> > +        av_log(h, AV_LOG_ERROR, "The gateway URL didn't start with
> > http:// or https:// and is therefore invalid.\n");
> > +        ret = AVERROR(EILSEQ);
> > +        goto err;
> > +    }
> > +
> > +    // We now know the remainder of the url without the protocol.
> > Check it for
> > +    // some length. At least 1 character.
> > +    if (strlen(url_without_protocol) < 1) {
> > +        av_log(h, AV_LOG_ERROR, "The gateway url (without the
> > protocol part) is too short to be a valid URL.\n");
> > +        ret = AVERROR(EILSEQ);
> > +        goto err;
> > +    }
>
> I think we can just rely on the http protocol stuff to deal with this
>

Do you mind if I keep it in? It doesn't seem to hurt and again does help in
debugging scenarios.


>
> > +
> > +
> > +    if (gateway[strlen(gateway) - 1] != '/') {
> > +        // Check if we have enough room to add a '/'
> > +        // We already know that it's bigger then 0 because that's
> > handled
> > +        // in populate_ipfs_gateway
> > +
> > +        if (strlen(gateway) < (PATH_MAX - 2)) {
> > +            // We have room. Get the length and add a '/' and a '\0'
> > +            int gateway_length = strlen(gateway);
> > +            gateway[gateway_length] = '/';
> > +            gateway[gateway_length + 1] = '\0';
> > +        } else
> > +            av_log(h, AV_LOG_ERROR, "The gateway url is longer then
> > the allowed max path length (%i).\n", PATH_MAX);
> > +
> > +        ret = 1;
> > +        goto err;
> > +    }
>
> Rolling this into the snprintf() would be prettier
>

Ugh :P
Well, I did that before I changed it to this. It worked but it threw a
compile warning about the resulting size truncating the gateway variable.
This seemed simpler than checking for it. But then again, I am now checking
for that length in other places too so I might as well do that here too.
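
One possible shape for rolling the trailing slash into the snprintf() call, with the length check included. This is only a sketch; copy_with_trailing_slash() is a hypothetical name and the return values are assumptions, not what the patch does.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst and append '/' only when src does
 * not already end with one.
 * Returns 0 on success, -1 on empty input or if the result does not fit. */
static int copy_with_trailing_slash(char *dst, size_t dst_size, const char *src)
{
    size_t src_len = strlen(src);
    int len;

    if (src_len == 0)
        return -1;

    len = snprintf(dst, dst_size, "%s%s", src,
                   src[src_len - 1] == '/' ? "" : "/");

    /* Treat truncation as an error instead of silently cutting the URL. */
    if (len < 0 || (size_t)len >= dst_size)
        return -1;

    return 0;
}
```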


>
> > +
> > +err:
> > +    return ret;
> > +}
> > +
> > +static int translate_ipfs_to_http(URLContext *h, const char *uri,
> > +                                  int flags, AVDictionary **options)
> > +{
> > +    const char *ipfs_cid;
> > +    char *fulluri = NULL;
> > +    char ipfs_gateway[PATH_MAX];
> > +    int ret;
> > +    IPFSGatewayContext *c = h->priv_data;
> > +
> > +    // Test for ipfs://, ipfs:, ipns:// and ipns:. This prefix is
> > stripped from
> > +    // the string leaving just the CID in ipfs_cid.
> > +    int is_ipfs = av_stristart(uri, "ipfs://", &ipfs_cid);
> > +    int is_ipns = av_stristart(uri, "ipns://", &ipfs_cid);
> > +
> > +    // We must have either ipns or ipfs.
> > +    if (!is_ipfs && !is_ipns) {
> > +        ret = AVERROR(EINVAL);
> > +        av_log(h, AV_LOG_ERROR, "Unsupported url %s\n", uri);
> > +        goto err;
> > +    }
>
> Shouldn't this check that the URL is precisely what the protocol
> expects? ipfs for ff_ipfs_protocol and ipns for ff_ipns_protocol. Maybe
> that's validated further up however..
>

Well, there are two ways to address content through a gateway:
<gateway>/<protocol>/<ipfs cid> (existed since the start of IPFS)
and
<ipfs cid>.<protocol>.<gateway> (called subdomain gateway)

An example of both:
http://localhost:8080/ipfs/bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi
http://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi.ipfs.localhost:8080

Both are valid, but the former is easier.
The latter mostly exists for website/browser purposes [1].
I therefore prefer the former: it only requires a gateway provided by the
user, and after that it's just a matter of appending strings. The
alternative is trickier to get right.
Regardless, the way I concatenate the strings now is how I construct a
valid URL.
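
To make the path-style form concrete, here is a small sketch of the string concatenation. build_gateway_url() is a hypothetical helper written for this mail, not the code from the patch, and it assumes the gateway already ends in '/'.

```c
#include <stdio.h>

/* Hypothetical helper: build "<gateway><protocol>/<cid>".
 * Assumes gateway already ends with '/', and protocol is "ipfs" or "ipns".
 * Returns 0 on success, -1 if the result does not fit in dst. */
static int build_gateway_url(char *dst, size_t dst_size, const char *gateway,
                             const char *protocol, const char *cid)
{
    int len = snprintf(dst, dst_size, "%s%s/%s", gateway, protocol, cid);

    if (len < 0 || (size_t)len >= dst_size)
        return -1;

    return 0;
}
```

With the gateway "http://localhost:8080/", the protocol "ipfs" and the CID from the example above, this yields the first of the two URLs.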

If that URL isn't valid then there's likely a gateway permission issue. But
that's outside the scope of this protocol and likely something the http(s)
protocol informs you about anyhow.

Anyhow, thank you again for the review!
I'll send a V6 later this evening.
Are we close to merging it? ;)

[1] https://codeclimbing.com/why-i-am-excited-about-subdomain-ipfs-gateways/


> /Tomas

