[FFmpeg-devel] [PATCH v4 1/1] avformat: Add IPFS protocol support.

Mark Gaiser markg85 at gmail.com
Fri Feb 4 16:12:54 EET 2022


On Fri, Feb 4, 2022 at 12:10 PM Tomas Härdin <tjoppen at acc.umu.se> wrote:

> tor 2022-02-03 klockan 18:29 +0100 skrev Mark Gaiser:
> >
> > +typedef struct IPFSGatewayContext {
> > +    AVClass *class;
> > +    URLContext *inner;
> > +    char *gateway;
>
> Consider two separate variables. One for AVOption and one for the
> dynamically allocated string. Or put the latter on the stack.
>

There always needs to be a gateway so why is reusing that variable an issue?
I'm fine splitting it up but I'd like to understand the benefit of it as
currently I don't see that benefit.


> > +} IPFSGatewayContext;
> > +
> > +// A best-effort way to find the IPFS gateway.
> > +// Only the most appropiate gateway is set. It's not actually
> > requested
> > +// (http call) to prevent a potential slowdown in startup. A
> > potential timeout
> > +// is handled by the HTTP protocol.
> > +//
> > +// Return codes can be:
> > +// 1 : A potential gateway is found and set in c->gateway
> > +// -1: The IPFS data folder could not be found
> > +// -2: The gateway file could not be found
> > +// -3: The gateway file is found but empty
> > +// -4: $HOME is empty
> > +// -9: Unhandled error
>
> What Michael meant with better return codes is using AVERROR_* :)
>

Hey, first attempt at an ffmpeg contribution here. I didn't know that at
all. ;)
But yeah, will do!


> > +static int populate_ipfs_gateway(URLContext *h)
> > +{
> > +    IPFSGatewayContext *c = h->priv_data;
> > +    char *ipfs_full_data_folder = NULL;
> > +    char *ipfs_gateway_file = NULL;
>
> These can be char[PATH_MAX]
>

Oke, will do.
C code question though.
How do I use av_asprintf on stack arrays like that?

I kinda like to prevent using something different just to later figure out
that there was an av_ function that would be far better :)


> > +    struct stat st;
> > +    int stat_ret = 0;
> > +    int ret = -9;
> > +    FILE *gateway_file = NULL;
> > +    char gateway_file_data[1000];
>
> A maximum URL length of 999?
>

There doesn't seem to be a hard limit on url's..
It even seems to be browser dependent. Chrome apparently allows it up to
2MB, firefox up to 64k.
I'm just going to use PATH_MAX here too which seems plenty with 4096 bytes.
IPFS url's can be rather lengthy though so I do like to keep enough room
for it.


> > +
> > +    // First, test if there already is a path in c->gateway. If it
> > is then it
> > +    // was provided as cli arument and should be used. It takes
> > precdence.
> > +    if (c->gateway != NULL) {
> > +        ret = 1;
> > +        goto err;
> > +    }
> > +
> > +    // Test $IPFS_GATEWAY.
> > +    if (getenv("IPFS_GATEWAY") != NULL) {
> > +        av_free(c->gateway);
>
> Useless since c->gateway is NULL
>
> > +
> > +        // Stat the folder.
> > +        // It should exist in a default IPFS setup when run as local
> > user.
> > +#ifndef _WIN32
> > +        stat_ret = stat(ipfs_full_data_folder, &st);
> > +#else
> > +        stat_ret = win32_stat(ipfs_full_data_folder, &st);
> > +#endif
>
> Again, there is no reason to stat this. Just try opening the gateway
> file directly.
>

This is a folder, not a file.

The other stat that was here too was a file, I replaced that with an fopen.
It smells sketchy to me to (ab)use fopen to check if a folder exists.
There's stat for that.


>
> > +
> > +    // Read a single line (fgets stops at new line mark).
> > +    fgets(gateway_file_data, sizeof(gateway_file_data) - 1,
> > gateway_file);
>
> This can result in gateway_file_data not being NUL terminated


> > +
> > +    // Replace first occurence of end of line to \0
> > +    gateway_file_data[strcspn(gateway_file_data, "\r\n")] = 0;
>
> What if the file uses \n or no newlines at all?
>

Right.
So I guess the fix here is:
1. Initialize gateway_file_data so all bytes are zero
2. read a line
3. set the last byte of gateway_file_data to 0

Now any text in the string will be the gateway.

Is that a proper fix?


> > +err:
> > +    if (gateway_file)
> > +        fclose(gateway_file);
> > +
> > +    av_free(ipfs_full_data_folder);
> > +    av_free(ipfs_gateway_file);
>
> This is not cleaning up dynamic allocations of c->gateway
>

So I should do that in  ipfs_close, right?

>
> > +// -3: The gateway url part (without the protocol) is too short. We
> > expect 3
> > +//     characters minimal. So http://aaa would be the bare minimal.
>
> http://1 is valid I think. It means http://0.0.0.1


Right, 1 character it is.
I thought I'd go for a common sense approach as in "it can't possibly be
smaller than..."
But I suppose just forcing any character to be there is fine too.

>
>
> > +    // Test if the gateway starts with either http:// or https://
> > +    // The remainder is stored in url_without_protocol
> > +    if (av_stristart(uri, "http://", &url_without_protocol) == 0
> > +        && av_stristart(uri, "https://", &url_without_protocol) ==
> > 0) {
> > +        av_log(h, AV_LOG_ERROR, "The gateway URL didn't start with
> > http:// or https:// and is therefore invalid.\n");
> > +        ret = -2;
> > +        goto err;
> > +    }
>
> I guess restricting this to HTTP schemes is OK. Or are there non-HTTP
> gateways for this?
>

No.
At least not from the IPFS camp.
The IPFS software creates a gateway and that is specifically an http
gateway.
Users can put that behind a proxy making it (potentially) a https gateway
but that's about it.


>
> > +    if (last_gateway_char != '/') {
> > +        c->gateway = av_asprintf("%s/", c->gateway);
>
> Yet another leak
>

Please tell me how to fix this one.
As you can see, I need the c->gateway value to copy and add a "/" to it.

In C++ this would just be a dead simple append ;)



> >     // Sanitize the gateway to a format we expect.
> > +    if (sanitize_ipfs_gateway(h) < 1)
> > +        goto err;
>
> This will return unset ret, thus leaking data from the stack
>
> > +static int ipfs_close(URLContext *h)
> > +{
> > +    IPFSGatewayContext *c = h->priv_data;
>
> Here is where you'd put any deallocations
>
> The quality of this patch is making me re-affirm what I've already said
> viz parsing. bash+sed is superior.
>

That would be a superior waste of time if that were the outcome.
Let's not go there. It makes other parts much more complicated too.
But i've already explained that in length, no need to repeat myself here
again.

>
> /Tomas
>
>


More information about the ffmpeg-devel mailing list