[FFmpeg-devel] [PATCH] avformat/hls: fail on probing non hls/m3u8 file extensions

Wed May 10 19:24:27 EEST 2023

On 5/9/23 11:41, Michael Niedermayer wrote:
> On Tue, May 09, 2023 at 08:15:34AM +0200, Anton Khirnov wrote:
>> Quoting Michael Niedermayer (2023-05-09 01:25:00)
>>> On Sun, May 07, 2023 at 10:42:56PM +0200, Anton Khirnov wrote:
>>>> Quoting Michael Niedermayer (2023-05-03 14:30:38)
>>>>> Its unexpected that a .avi or other "standard" file turns into a playlist.
>>>>> The goal of this patch is to avoid this unexpected behavior and possible
>>>>> privacy or security differences.
>>>>>
>>>>
>>>> I very much dislike this approach.
>>>
>>> What else do you suggest ?
>>>
>>> We could have a configuration option that specifies one
>>> or more trusted directories/files/urls. And everything outside would be
>>> limited to selfcontained files
>>> The average user can put * as trusted url if thats what they want
>>> While someone who cares about their privacy and security could restrict
>>> that with little effort to the place where they store their music and
>>> videos which they know is ok. While OTOH still throwing random URLs
>>> at ffmpeg without expecting overly unexpected results (not considering bugs)
>>> Thats similar to how some office software can handle macros.
>>>
>>> Or do you have some other suggestion ?
>>>
>>
>> I don't see what actual problem is this supposed to address. The commit
>> message mentions some vague "possible issues", but
>> * this seems like the wrong layer for this kind of policy decisions
>> * I think there needs to be a clearly defined thread model before we can
>>    discuss what the right solution is
> 
> Its not one threat but many. And i already mentioned some but we can pick
> one hypothetical example for sake of discussion and detail it more.
> (being more detailed this is sadly a longer mail)
> 
> Lets assume an attacker wants to infiltrate a specific company
> part of that will require gaining access to the companies internal
> network. So the attacker starts chating with some employee, the
> medium of communication is irrelevant here.
> The attacker cannot just ask the employee to run his network scanner
> the employee would realize whats going on and report it (or should at least)
> Maybe the attacker asks the emplyee to look at that really funny weblog
> of his, but the emplyee might be trained not to open random links in
> his webbrowser or the webbrowser might be so locked down that it
> cant access the companies internal net.
> So our attacker sends him a link to
> companyX_boss_and_secretary_sex_hd.avi
> So he tries watching that because he now isnt sure if the guy meant that
> was about his boss or someone else, after all he knows its safe because he watched
> so much porn on so many shady sites with his video player and never got
> hacked from it.
> in reality that avi file is not a avi, its maybe some playlist allowing
> arbitrary URLs.
> While the employee watches some boss and secretary the bits any pieces betweem
> the scenes scan the companies internal network and the external http
> accesses which go back to the attacker, allow him to get all the information of
> it.
> 
> The problem is that playing some files allows things like scaning
> the network and passing this information on to an attacker.
> I dont have a real full testcase so maybe theres something iam missing
> why this wouldnt work ...
> 
> compounding that our probing code is really good and detects such attacks
> and makes them work even if the file is named incorrectly and has the wrong
> mime type.
> This makes such an attack easier to do and harder to detect even for a smart
> user
> 
> ideal would be if the whole attack wouldnt work at all and network realms
> would be clearly seperated and nothing touching the companies private net
> or localnet could touch the public internet.
> This is not compatible with a playlist randomly mixing local, priavte and
> public content.
> If its just my own machiene, i have no such playlist and see no use in one
> for me. But some people want that i think.
> They maybe arent the employee who as part of his job has to protect the
> company and not play random playlists.
> Also they maybe arent the privacy or security concious users.
> 
> Having default same origin policy and a list of files/directories/urls that are
> excempt from it would fix this in the example above.
> 
> The companies employees would not override default security settings to
> watch a odd file. Or they could loose their job
> A user caring about security / privacy could setup the settings
> carefully so safe playlists work and other things are secure
> and A user not caring could just disable a same origin check, and
> this may be ok for her
> 

One potential issue is that if an open TLS HTTP/2 socket already exists 
with the server, then the URL isn't being probed, but rather the 
response from a specific GET request.

This issue arose in mpv, because mpv opens an HTTP socket with the 
server and keeps it alive when running its own probe mechanics, before 
falling back to lavf's probe. In the event of an already open socket the 
URL isn't a valid filename, so mpv sends the empty string. The empty 
string doesn't have the .m3u8 extension, so it fails the validation check.

In every other instance of lavf probes, it bases it on content as higher 
priority than filename. Rename a .avi to .mp4 and it will still properly 
detect it as .avi, so this seems to be inconsistent with the existing 
behavior. IMO we should check file contents and MIME type sent by the 
server rather than just the file extension. If it starts with #EXTM3U 
and the other headers etc. and has an appropriate Content-type header, 
that should be sufficient.

In either case, it's unclear to me that, if we keep the check in, we 
should permit empty string as an exception to the check because of the 
HTTP socket issue, or if we should require clients to pass "foo.m3u8" or 
something to work around the problem.

- Leo Izen (Traneptora / thebombzen)