[FFmpeg-devel] [PATCH v5 1/1] avformat: Add IPFS protocol support.
Mark Gaiser
markg85 at gmail.com
Sat Feb 12 19:57:31 EET 2022
On Sat, Feb 12, 2022 at 1:19 PM Tomas Härdin <tjoppen at acc.umu.se> wrote:
> Thu 2022-02-10 at 02:13 +0100, Mark Gaiser wrote:
> > This patch adds support for:
> > - ffplay ipfs://<cid>
> > - ffplay ipns://<cid>
> >
> > IPFS data can be played from so called "ipfs gateways".
> > A gateway is essentially a webserver that gives access to the
> > distributed IPFS network.
> >
> > This protocol support (ipfs and ipns) therefore translates
> > ipfs:// and ipns:// to a http:// url. This resulting url is
> > then handled by the http protocol. It could also be https
> > depending on the gateway provided.
> >
> > To use this protocol, a gateway must be provided.
> > If you do nothing it will try to find it in your
> > $HOME/.ipfs/gateway file. The ways to set it manually are:
> > 1. Define a -gateway <url> to the gateway.
> > 2. Define $IPFS_GATEWAY with the full http link to the gateway.
> > 3. Define $IPFS_PATH and point it to the IPFS data path.
> > 4. Have IPFS running in your local user folder (under $HOME/.ipfs).
> >
> > Signed-off-by: Mark Gaiser <markg85 at gmail.com>
> > ---
> > configure | 2 +
> > doc/protocols.texi | 30 ++++
> > libavformat/Makefile | 2 +
> > libavformat/ipfsgateway.c | 326
> > ++++++++++++++++++++++++++++++++++++++
> > libavformat/protocols.c | 2 +
> > 5 files changed, 362 insertions(+)
> > create mode 100644 libavformat/ipfsgateway.c
> >
> > diff --git a/configure b/configure
> > index 5b19a35f59..6ff09e7974 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3585,6 +3585,8 @@ udp_protocol_select="network"
> > udplite_protocol_select="network"
> > unix_protocol_deps="sys_un_h"
> > unix_protocol_select="network"
> > +ipfs_protocol_select="https_protocol"
> > +ipns_protocol_select="https_protocol"
> >
> > # external library protocols
> > libamqp_protocol_deps="librabbitmq"
> > diff --git a/doc/protocols.texi b/doc/protocols.texi
> > index d207df0b52..7c9c0a4808 100644
> > --- a/doc/protocols.texi
> > +++ b/doc/protocols.texi
> > @@ -2025,5 +2025,35 @@ decoding errors.
> >
> > @end table
> >
> > + at section ipfs
> > +
> > +InterPlanetary File System (IPFS) protocol support. One can access
> > files stored
> > +on the IPFS network through so called gateways. Those are http(s)
> > endpoints.
> > +This protocol wraps the IPFS native protocols (ipfs:// and ipns://)
> > to be sent
> > +to such a gateway. Users can (and should) host their own node which
> > means this
> > +protocol will use your local machine gateway to access files on the
> > IPFS network.
> > +
> > +If a user doesn't have a node of their own then the public gateway
> > dweb.link is
> > +used by default.
> > +
> > +You can use this protocol in 2 ways. Using IPFS:
> > + at example
> > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> > + at end example
> > +
> > +Or the IPNS protocol (IPNS is mutable IPFS):
> > + at example
> > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> > + at end example
> > +
> > +You can also change the gateway to be used:
> > +
> > + at table @option
> > +
> > + at item gateway
> > +Defines the gateway to use. When nothing is provided the protocol
> > will first try
> > +your local gateway. If that fails dweb.link will be used.
> > +
> > + at end table
> >
> > @c man end PROTOCOLS
> > diff --git a/libavformat/Makefile b/libavformat/Makefile
> > index 3dc6a479cc..4edce8420f 100644
> > --- a/libavformat/Makefile
> > +++ b/libavformat/Makefile
> > @@ -656,6 +656,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) +=
> > srtpproto.o srtp.o
> > OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o
> > OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o
> > OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o
> > +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o
> > +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o
> > TLS-OBJS-$(CONFIG_GNUTLS) += tls_gnutls.o
> > TLS-OBJS-$(CONFIG_LIBTLS) += tls_libtls.o
> > TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o
> > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
> > new file mode 100644
> > index 0000000000..9ebced10c7
> > --- /dev/null
> > +++ b/libavformat/ipfsgateway.c
> > @@ -0,0 +1,326 @@
> > +/*
> > + * IPFS and IPNS protocol support through IPFS Gateway.
> > + * Copyright (c) 2022 Mark Gaiser
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later
> > version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> > 02110-1301 USA
> > + */
> > +
> > +#include "avformat.h"
> > +#include "libavutil/avassert.h"
> > +#include "libavutil/avstring.h"
> > +#include "libavutil/internal.h"
> > +#include "libavutil/opt.h"
> > +#include "libavutil/tree.h"
> > +#include <fcntl.h>
> > +#if HAVE_IO_H
> > +#include <io.h>
> > +#endif
> > +#if HAVE_UNISTD_H
> > +#include <unistd.h>
> > +#endif
> > +#include "os_support.h"
> > +#include "url.h"
> > +#include <stdlib.h>
> > +#include <sys/stat.h>
> > +
> > +typedef struct IPFSGatewayContext {
> > + AVClass *class;
> > + URLContext *inner;
> > + char *gateway;
> > +} IPFSGatewayContext;
> > +
> > +// A best-effort way to find the IPFS gateway.
> > +// Only the most appropriate gateway is set. It's not actually
> > requested
> > +// (http call) to prevent a potential slowdown in startup. A
> > potential timeout
> > +// is handled by the HTTP protocol.
> > +static int populate_ipfs_gateway(URLContext *h, char *gateway)
> > +{
> > + char ipfs_full_data_folder[PATH_MAX];
> > + char ipfs_gateway_file[PATH_MAX];
> > + struct stat st;
> > + int stat_ret = 0;
> > + int ret = AVERROR(EINVAL);
> > + FILE *gateway_file = NULL;
> > + char gateway_file_data[PATH_MAX];
> > +
> > + // We could already have a gateway (it would have been passed as
> > -gateway).
> > + if (*gateway != '\0') {
> > + ret = 1;
> > + goto err;
> > + }
>
> Why even call this function if this is the case?
>
> > +
> > + // Test $IPFS_GATEWAY.
> > + if (getenv("IPFS_GATEWAY") != NULL) {
> > + snprintf(gateway, PATH_MAX, "%s", getenv("IPFS_GATEWAY"));
>
> Passing buffer length rather than assuming PATH_MAX would be better.
> That way it can be changed without this breaking.
>
> Or stick the buffer in IPFSGatewayContext and use sizeof()
>
Awesome! I hadn't considered that option yet. Will do.
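
A minimal sketch of the second suggestion (buffer inside the context, sized with sizeof()). The struct name, the gateway_buffer field and the set_gateway() helper are hypothetical and not part of the patch:

```c
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for platforms that do not define it */
#endif

/* Hypothetical context layout: the buffer lives in the struct itself,
 * so its size is always available through sizeof(). */
typedef struct GatewaySketchContext {
    char gateway_buffer[PATH_MAX];
} GatewaySketchContext;

/* Copies value into the context buffer.
 * Returns 0 on success, -1 if value is NULL or would be truncated. */
static int set_gateway(GatewaySketchContext *c, const char *value)
{
    int n;

    if (!value)
        return -1;
    n = snprintf(c->gateway_buffer, sizeof(c->gateway_buffer), "%s", value);
    if (n < 0 || (size_t)n >= sizeof(c->gateway_buffer))
        return -1;
    return 0;
}
```

With this shape the $IPFS_GATEWAY case could become set_gateway(c, getenv("IPFS_GATEWAY")), and changing the buffer size later would not require touching the copy.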
>
> > + ret = 1;
> > + goto err;
> > + } else
> > + av_log(h, AV_LOG_DEBUG, "$IPFS_GATEWAY is empty.\n");
> > +
> > + // We need to know the IPFS folder to - eventually - read the
> > contents of
> > + // the "gateway" file which would tell us the gateway to use.
> > + if (getenv("IPFS_PATH") == NULL) {
> > + av_log(h, AV_LOG_DEBUG, "$IPFS_PATH is empty.\n");
> > +
> > + // Try via the home folder.
> > + if (getenv("HOME") == NULL) {
> > + av_log(h, AV_LOG_ERROR, "$HOME appears to be empty.\n");
> > + ret = AVERROR(EINVAL);
> > + goto err;
> > + }
> > +
> > + // Verify the composed path fits.
> > + if (snprintf(ipfs_full_data_folder, PATH_MAX, "%s/.ipfs/",
>
> sizeof(ipfs_full_data_folder)
> This goes for most PATH_MAX in here
>
Yeah, I thought using PATH_MAX everywhere would be consistent.
I'll switch all of those to sizeof(...).
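
A sketch of what the composed-path check could look like once the size is passed in (or taken with sizeof() by the caller). The helper name build_ipfs_data_path() is hypothetical. Note that snprintf() returns the length it wanted to write, excluding the terminating NUL, so a return value equal to the buffer size already means truncation; the comparison is therefore ">=" rather than ">":

```c
#include <stdio.h>

/* Hypothetical helper: writes "<home>/.ipfs/" into dst.
 * Returns 0 on success, -1 on an encoding error or truncation. */
static int build_ipfs_data_path(char *dst, size_t dst_size, const char *home)
{
    int n = snprintf(dst, dst_size, "%s/.ipfs/", home);

    /* n excludes the terminating NUL, so n == dst_size is already
     * a truncated result. */
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

A caller with a local array would then write build_ipfs_data_path(buf, sizeof(buf), getenv("HOME")) after checking that $HOME is set.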
>
> > + getenv("HOME")) > PATH_MAX) {
> > + av_log(h, AV_LOG_ERROR, "The IPFS data path exceeds the
> > max path length (%i)\n", PATH_MAX);
> > + ret = AVERROR(EINVAL);
> > + goto err;
> > + }
> > +
> > + // Stat the folder.
> > + // It should exist in a default IPFS setup when run as local
> > user.
> > +#ifndef _WIN32
> > + stat_ret = stat(ipfs_full_data_folder, &st);
> > +#else
> > + stat_ret = win32_stat(ipfs_full_data_folder, &st);
> > +#endif
> > + if (stat_ret < 0) {
> > + av_log(h, AV_LOG_INFO, "Unable to find IPFS folder. We
> > tried:\n");
> > + av_log(h, AV_LOG_INFO, "- $IPFS_PATH, which was
> > empty.\n");
> > + av_log(h, AV_LOG_INFO, "- $HOME/.ipfs (full uri: %s)
> > which doesn't exist.\n", ipfs_full_data_folder);
> > + ret = AVERROR(ENOENT);
> > + goto err;
> > + }
>
> Right. This makes more sense
>
> > + } else
> > + snprintf(ipfs_full_data_folder, PATH_MAX, "%s",
> > getenv("IPFS_PATH"));
> > +
> > + // Copy the fully composed gateway path into ipfs_gateway_file.
> > + if (snprintf(ipfs_gateway_file, PATH_MAX, "%sgateway",
> > + ipfs_full_data_folder) > PATH_MAX) {
> > + av_log(h, AV_LOG_ERROR, "The IPFS gateway file path exceeds
> > the max path length (%i)\n", PATH_MAX);
> > + ret = AVERROR(ENOENT);
> > + goto err;
> > + }
> > +
> > + // Get the contents of the gateway file.
> > + gateway_file = av_fopen_utf8(ipfs_gateway_file, "r");
> > + if (!gateway_file) {
> > + av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri:
> > %s) doesn't exist. Is the gateway enabled?\n", ipfs_gateway_file);
> > + ret = AVERROR(ENOENT);
> > + goto err;
> > + }
> > +
> > + // Read a single line (fgets stops at new line mark).
> > + fgets(gateway_file_data, PATH_MAX - 1, gateway_file);
> > +
> > + // Replace the last char with \0
> > + gateway_file_data[PATH_MAX - 1] = 0;
> > +
> > + // Replace first occurrence of end of line with \0
> > + gateway_file_data[strcspn(gateway_file_data, "\r\n")] = 0;
>
> Won't this be just \n on Unix-like systems? Could be two lines, one
> with strcspn(.., "\n") and the other with "\r".
>
The current logic works on Linux.
I'll split it up regardless; it can't hurt and might catch some nasty edge
cases.
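
A sketch of the split version, with a hypothetical helper name strip_eol(). Since strcspn() stops at the first character found in the reject set, the two-call form should end the string at the same position as the single call with "\r\n":

```c
#include <string.h>

/* Cuts s at the first '\n' and then at the first '\r', so the string
 * ends at whichever of the two comes first. */
static void strip_eol(char *s)
{
    s[strcspn(s, "\n")] = '\0';
    s[strcspn(s, "\r")] = '\0';
}
```

If neither character is present, strcspn() returns the string length and the assignment just rewrites the existing terminator.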
>
> > +
> > + // If strlen finds anything longer than 0 characters then we
> > have a
> > + // potential gateway url.
> > + if (strlen(gateway_file_data) < 1) {
> > + av_log(h, AV_LOG_ERROR, "The IPFS gateway file (full uri:
> > %s) appears to be empty. Is the gateway started?\n",
> > ipfs_gateway_file);
> > + ret = AVERROR(EILSEQ);
> > + goto err;
> > + }
> > +
> > + // At this point gateway_file_data contains at least something.
> > + // Copy it into gateway.
> > + if (snprintf(gateway, PATH_MAX, "%s", gateway_file_data) > 0) {
> > + ret = 1;
> > + goto err;
> > + } else
> > + av_log(h, AV_LOG_DEBUG, "Unknown error in the IPFS gateway
> > file.\n");
> > +
> > +err:
> > + if (gateway_file)
> > + fclose(gateway_file);
> > +
> > + return ret;
> > +}
> > +
> > +// For now just makes sure that the gateway ends in url we expect.
> > +// Like http://localhost:8080/.
> > +// Explicitly with the trailing slash.
> > +static int sanitize_ipfs_gateway(URLContext *h, char *gateway)
> > +{
> > + const char *url_without_protocol;
> > + int ret = 1;
> > +
> > + // Test if the gateway starts with either http:// or https://
> > + // The remainder is stored in url_without_protocol
> > + if (av_stristart(gateway, "http://", &url_without_protocol) == 0
> > + && av_stristart(gateway, "https://", &url_without_protocol)
> > == 0) {
> > + av_log(h, AV_LOG_ERROR, "The gateway URL didn't start with
> > http:// or https:// and is therefore invalid.\n");
> > + ret = AVERROR(EILSEQ);
> > + goto err;
> > + }
> > +
> > + // We now know the remainder of the url without the protocol.
> > Check it for
> > + // some length. At least 1 character.
> > + if (strlen(url_without_protocol) < 1) {
> > + av_log(h, AV_LOG_ERROR, "The gateway url (without the
> > protocol part) is too short to be a valid URL.\n");
> > + ret = AVERROR(EILSEQ);
> > + goto err;
> > + }
>
> I think we can just rely on the http protocol stuff to deal with this
>
Do you mind if I keep it in? It doesn't seem to hurt and again does help in
debugging scenarios.
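
For reference, the check I'd like to keep boils down to the following standalone sketch. starts_with_ci() is only a stand-in for av_stristart() so the snippet compiles outside FFmpeg; both helper names are hypothetical:

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive prefix test, a stand-in for av_stristart().
 * On a match, *rest points just past the prefix. */
static int starts_with_ci(const char *str, const char *prefix,
                          const char **rest)
{
    while (*prefix &&
           tolower((unsigned char)*prefix) == tolower((unsigned char)*str)) {
        prefix++;
        str++;
    }
    if (*prefix)
        return 0;
    if (rest)
        *rest = str;
    return 1;
}

/* Returns 1 if url starts with http:// or https:// and has at least
 * one character after the scheme, 0 otherwise. */
static int gateway_looks_valid(const char *url)
{
    const char *rest = NULL;

    if (!starts_with_ci(url, "http://", &rest) &&
        !starts_with_ci(url, "https://", &rest))
        return 0;
    return *rest != '\0';
}
```

It is deliberately shallow: anything beyond the scheme and a non-empty remainder is left to the http(s) protocol.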
>
> > +
> > +
> > + if (gateway[strlen(gateway) - 1] != '/') {
> > + // Check if we have enough room to add a '/'
> > + // We already know that it's bigger than 0 because that's
> > handled
> > + // in populate_ipfs_gateway
> > +
> > + if (strlen(gateway) < (PATH_MAX - 2)) {
> > + // We have room. Get the length and add a '/' and a '\0'
> > + int gateway_length = strlen(gateway);
> > + gateway[gateway_length] = '/';
> > + gateway[gateway_length + 1] = '\0';
> > + } else
> > + av_log(h, AV_LOG_ERROR, "The gateway url is longer then
> > the allowed max path length (%i).\n", PATH_MAX);
> > +
> > + ret = 1;
> > + goto err;
> > + }
>
> Rolling this into the snprintf() would be prettier
>
Ugh :P
Well, I did that before I changed it to this. It worked but it threw a
compile warning about the resulting size truncating the gateway variable.
This seemed simpler than checking for it. But then again, I am now checking
for that length in other places too so I might as well do that here too.
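
A sketch of how the trailing slash could be rolled into a single snprintf() with a length check. It assumes a separate destination buffer, since snprintf() must not be given overlapping source and destination; normalize_gateway() is a hypothetical name:

```c
#include <stdio.h>
#include <string.h>

/* Writes gateway into dst, appending '/' if it is missing.
 * dst must not overlap gateway.
 * Returns 0 on success, -1 on empty input, error or truncation. */
static int normalize_gateway(char *dst, size_t dst_size, const char *gateway)
{
    size_t len = strlen(gateway);
    const char *slash;
    int n;

    if (len == 0)
        return -1;
    slash = gateway[len - 1] == '/' ? "" : "/";
    n = snprintf(dst, dst_size, "%s%s", gateway, slash);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

Checking the return value explicitly is also what should address the truncation warning, because the truncated case is then handled rather than ignored.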
>
> > +
> > +err:
> > + return ret;
> > +}
> > +
> > +static int translate_ipfs_to_http(URLContext *h, const char *uri,
> > + int flags, AVDictionary **options)
> > +{
> > + const char *ipfs_cid;
> > + char *fulluri = NULL;
> > + char ipfs_gateway[PATH_MAX];
> > + int ret;
> > + IPFSGatewayContext *c = h->priv_data;
> > +
> > + // Test for ipfs://, ipfs:, ipns:// and ipns:. This prefix is
> > stripped from
> > + // the string leaving just the CID in ipfs_cid.
> > + int is_ipfs = av_stristart(uri, "ipfs://", &ipfs_cid);
> > + int is_ipns = av_stristart(uri, "ipns://", &ipfs_cid);
> > +
> > + // We must have either ipns or ipfs.
> > + if (!is_ipfs && !is_ipns) {
> > + ret = AVERROR(EINVAL);
> > + av_log(h, AV_LOG_ERROR, "Unsupported url %s\n", uri);
> > + goto err;
> > + }
>
> Shouldn't this check that the URL is precisely what the protocol
> expects? ipfs for ff_ipfs_protocol and ipns for ff_ipns_protocol. Maybe
> that's validated further up however..
>
Well, a gateway URL can take two forms:
<gateway>/<protocol>/<ipfs cid> (has existed since the start of IPFS)
and
<ipfs cid>.<protocol>.<gateway> (called a subdomain gateway)
An example of each:
http://localhost:8080/ipfs/bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi
http://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi.ipfs.localhost:8080
Both are valid, but the former is easier to construct.
The latter mostly exists for website/browser purposes [1].
I therefore prefer the former: it only requires a gateway provided by the
user, and after that it's only a matter of appending strings. The subdomain
form is trickier to get right.
Regardless, the way I concatenate the strings now is how I construct a
valid URL.
If that URL doesn't work, there's likely a gateway permission issue. But
that's outside the scope of this protocol and likely something the http(s)
protocol informs you about anyhow.
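
To make the path-style form concrete, here is a minimal sketch of the concatenation. It assumes the gateway already ends in '/' and uses a hypothetical helper name, build_gateway_url(); it is not the code from the patch:

```c
#include <stdio.h>

/* Builds "<gateway><protocol>/<cid>", where gateway is expected to end
 * in '/' and protocol is "ipfs" or "ipns".
 * Returns 0 on success, -1 on error or truncation. */
static int build_gateway_url(char *dst, size_t dst_size, const char *gateway,
                             int is_ipns, const char *cid)
{
    int n = snprintf(dst, dst_size, "%s%s%s", gateway,
                     is_ipns ? "ipns/" : "ipfs/", cid);

    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

With the gateway http://localhost:8080/ and the CID from the example above, this yields the first of the two URLs shown.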
Anyhow, thank you again for the review!
I'll send a V6 later this evening.
Are we close to merging it? ;)
[1] https://codeclimbing.com/why-i-am-excited-about-subdomain-ipfs-gateways/
> /Tomas