[FFmpeg-devel] [PATCH] avformat/http: Handle IPv6 Zone ID in hostname

Marvin Scholz epirat07 at gmail.com
Wed Jun 4 16:00:51 EEST 2025



On 2 Jun 2025, at 22:29, Rémi Denis-Courmont wrote:

> On Thursday, 22 May 2025, 21:38:32 Eastern European Summer Time, Marvin Scholz
> wrote:
>> When using a literal IPv6 address as hostname, it can contain a Zone ID,
>> especially in the case of link-local addresses. Sending this to the
>> server in the Host header is not useful to the server and in some cases
>> servers refuse such requests.
>>
>> To prevent any such issues, strip the Zone ID from the address if it's
>> an IPv6 address. This also removes it for the Cookies lookup.
>>
>> Based on a patch by: Daniel N Pettersson <danielnp at axis.com>
>> ---
>>  libavformat/http.c | 60 +++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 59 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavformat/http.c b/libavformat/http.c
>> index f7b2a8a029..3bde616b43 100644
>> --- a/libavformat/http.c
>> +++ b/libavformat/http.c
>> @@ -24,6 +24,7 @@
>>  #include "config.h"
>>  #include "config_components.h"
>>
>> +#include <string.h>
>>  #include <time.h>
>>  #if CONFIG_ZLIB
>>  #include <zlib.h>
>> @@ -209,6 +210,63 @@ void ff_http_init_auth_state(URLContext *dest, const URLContext *src)
>>             sizeof(HTTPAuthState));
>>  }
>>
>> +static bool host_is_numeric_ipv6(const char *host)
>> +{
>> +    bool res = false;
>> +#if defined(AF_INET6)
>> +    struct addrinfo hints = { .ai_flags = AI_NUMERICHOST }, *ai;
>> +    if (getaddrinfo(host, NULL, &hints, &ai) == 0) {
>> +        if (ai->ai_family == AF_INET6)
>> +            res = true;
>> +        freeaddrinfo(ai);
>> +    }
>> +#else
>> +    // Just guess based on if the host contains a ':'
>> +    if (strchr(host, ':') != NULL)
>> +        res = true;
>> +#endif
>> +    return res;
>> +}

Hi, thanks for the review.

>
> At least in a URL, the distinction is done by the presence of surrounding
> brackets, not actually parsing the address. And on the flip side, to my
> knowledge, there are no guarantees that getaddrinfo() even copes well with
> scope IDs. That's platform-dependent.

Since the first thing done to the URL is splitting it into components,
which removes the [], I sadly can't just check for the brackets.

Anyway, if this does not work, it is a pre-existing issue, as this is
essentially the same thing ff_url_join did internally before. I'm not
saying we should not fix it, but apparently no one has run into this
until now.

I guess I can just check for the presence of a ':' though, as I can't
think of any valid case where a non-IPv6 host would contain a ':'?
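
As a rough sketch of that simpler check (the helper name is made up for
illustration and is not part of the patch; it assumes the brackets and
the port were already split off by the URL parsing):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not from the patch: treat any host that
 * contains ':' as an IPv6 literal. Assumes `host` is only the host
 * component, without surrounding [] and without a ":port" suffix. */
static bool host_looks_like_ipv6(const char *host)
{
    return strchr(host, ':') != NULL;
}
```

With this, "fe80::1%eth0" and "::1" count as IPv6, while "example.com"
and "192.0.2.1" do not.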

>
>> +
>> +/**
>> + * Copy the normalized host to the given buffer
>> + *
>> + * If the host is a normal hostname, this just returns
>> + * host:port. However in case of an IPv6 address, it
>> + * ensures proper escaping with [] and removes the
>> + * zone identifier, if any, making the return suitable
>> + * for example for use in the HTTP Host header.
>> + */
>> +static unsigned copy_normalized_host(char *out, unsigned size,
>> +                               const char *host, const int port)
>> +{
>> +    AVBPrint bp;
>> +    av_bprint_init_for_buffer(&bp, out, size);
>> +
>> +    if (host_is_numeric_ipv6(host)) {
>> +        // This is an IPv6 address, so we need to strip the Zone ID,
>> +        // if any.
>> +        // While technically we could have percent encoding even in
>> +        // the Zone ID, this doesn't seem to be a relevant case in
>> +        // the real world on any platform.
>> +        char *percent = strrchr(host, '%');
>
> Uh, doesn't Linux actually use % in interface names sometimes?

I never encountered that, but I just realized I can simply use strchr
here anyway, as percent-encoding is not used in the hostname in any
other case.
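
To illustrate the difference, here is a small sketch (hypothetical
helper name, not the patch code) that cuts at the first '%', so a '%'
inside the interface name would not move the cut point:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the patch: length of the address part
 * of an IPv6 literal, i.e. everything before the first '%'. If there
 * is no Zone ID, the whole string is the address. */
static size_t ipv6_addr_len(const char *host)
{
    const char *percent = strchr(host, '%');
    return percent ? (size_t)(percent - host) : strlen(host);
}
```

For a made-up input like "fe80::1%eth%0" this yields 7 ("fe80::1"),
whereas searching from the end with strrchr would yield 11 and leave
"%eth" attached to the address.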

>
>> +        if (percent) {
>> +            int len = (percent - host);
>> +            av_bprintf(&bp, "[%.*s]", len, host);
>> +        } else {
>> +            av_bprintf(&bp, "[%s]", host);
>> +        }
>> +    } else {
>> +        // Host is not an IPv6 address, so just use as-is
>> +        av_bprintf(&bp, "%s", host);
>> +    }
>
> This looks like reverse abstraction and kinda sketchy. How do you end up with
> a scope ID in the input?

The simplest example is a user providing a URL with an IPv6 literal that has a scope ID.

>
> While it's true that it shouldn't be sent to the server, it's also so that it
> shouldn't appear in the URL.

There is RFC 6874, which focuses solely on how to represent a scope ID in
a URI, so why would that be invalid? And if it were, how would the user
tell ffmpeg which interface to use for link-local IPv6?
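
For reference, RFC 6874 writes the delimiter percent-encoded, as in
http://[fe80::1%25eth0]/. A minimal sketch of splitting such a literal
(hypothetical helper, not something the patch or ffmpeg does in this
form; it does not percent-decode the zone ID itself):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not from the patch: split the text between the
 * brackets of an RFC 6874 style literal ("fe80::1%25eth0") into the
 * address and the zone ID. The zone is set to "" if there is none.
 * Returns false if either output buffer is too small. */
static bool split_rfc6874_literal(const char *lit,
                                  char *addr, size_t addr_size,
                                  char *zone, size_t zone_size)
{
    const char *delim = strstr(lit, "%25");
    size_t addr_len = delim ? (size_t)(delim - lit) : strlen(lit);
    const char *z = delim ? delim + 3 : "";

    if (addr_len >= addr_size || strlen(z) >= zone_size)
        return false;
    memcpy(addr, lit, addr_len);
    addr[addr_len] = '\0';
    strcpy(zone, z);
    return true;
}
```

Here "fe80::1%25eth0" would give the address "fe80::1" and the zone
"eth0"; the zone is what tells the connection which interface to use.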

> In other words, it should have been stripped
> earlier than the HTTP input module. The Host field should be the same as the
> host in the absolute URL.

I don't think there is an earlier point where it makes sense to strip it.
I am stripping it in the internal http context open function; I cannot
strip it before that, as the underlying connection still needs to know the
proper zone ID to correctly establish the connection using the right
interface.

Unless I am missing something or misunderstanding what you meant, I am not
sure how I can change this.
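
To make the split concrete, a sketch of what I mean (hypothetical
function using plain snprintf instead of AVBPrint, not the patch code):
the host string stays untouched for the connection, and only the value
written into the Host header is derived from it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch, not from the patch: `host` is left as-is so the
 * caller can still pass it, including the zone ID, to the connection
 * code. Only the Host header value written to `out` is normalized.
 * Returns what snprintf returns. */
static int format_host_header(char *out, size_t size,
                              const char *host, int port)
{
    if (strchr(host, ':')) {
        /* IPv6 literal: drop the zone ID, if any, and add brackets. */
        int len = (int)strcspn(host, "%");
        return snprintf(out, size, "[%.*s]:%d", len, host, port);
    }
    /* Hostname or IPv4 address: use as-is. */
    return snprintf(out, size, "%s:%d", host, port);
}
```

So for host "fe80::1%eth0" and port 8080 the connection still sees
"fe80::1%eth0", while the server only sees "[fe80::1]:8080".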



>
> -- 
> Rémi Denis-Courmont
> Villeneuve de Tapiola, ex-République finlandaise d´Uusimaa

