[FFmpeg-devel] [PATCH] movenc: Add an option for hiding fragments at the end

Mon Jun 3 10:51:56 EEST 2024

On Sun, 2 Jun 2024, Dennis Sädtler wrote:

> On 2024-06-02 21:36, Martin Storsjö wrote:
>> On Sat, 1 Jun 2024, Dennis Sädtler via ffmpeg-devel wrote:
>> 
>>> Should the ftyp atom also be updated to remove brands no longer required 
>>> for non-fragmented files?
>>> I'm not sure how important that is in real-world scenarios, so it might 
>>> not be worth it to deal with some of the additional changes required e.g. 
>>> to deal with the new ftyp possibly being a different size.
>> 
>> Hmm, good point, I hadn't thought about that. I'd prefer not to do that, as 
>> it becomes a bit more of a mess to change the size of the ftyp.
>
> Indeed, though as far as I can tell only the offset and size of the mdat 
> would need to be changed, since everything past the ftyp is "disposable" 
> anyway.

Oh, right - yes, that's right, it wouldn't be all that hard to do it.

> But as I initially mentioned, I have no idea if there are real-world cases of 
> players refusing a file purely based on those brands.

Yeah, not sure. And anyway, it shouldn't be too hard to avoid using 
fragmentation modes that require a higher major brand.

>>> Since coincidentally I've implemented the exact same feature in a 
>>> different application a couple weeks ago I'll also throw in the fun fact 
>>> that files produced this way can be smaller than regular MP4s for long 
>>> and/or large files.
>>> This is due to the lack of interleaving of A/V samples resulting in the 
>>> file having far fewer but larger chunks, which means the moov atom - 
>>> mainly the stco/co64 and stsc boxes - can be much smaller.
>> 
>> Oh, indeed, that's a good point. But on the other hand, the file ends up 
>> containing all the leftover moof boxes in the mdat. But are you saying that 
>> a compact moov + leftover moof, still is smaller than one large moov, in 
>> your practical test cases?
>
> Yep, that's what I meant! The initial ("empty") moov and following moof boxes 
> aren't all that big, so once you go over a certain threshold (which I haven't 
> calculated) this method ends up having a negative overhead.

Oh, that's interesting. (I guess it could be useful to do some sort of 
chunking when writing regular mp4s as well, to achieve the same sort of 
efficiency right away.)
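
To put rough numbers on the chunk-count effect, here is a back-of-the-envelope 
sketch in Python (not FFmpeg code). The function name is made up for 
illustration, and it assumes the worst case of one stsc entry per chunk unless 
told otherwise; it only counts the stco/co64 and stsc boxes of one track, not 
the rest of the moov.

```python
def chunk_table_bytes(num_chunks, use_co64=False, stsc_entries=None):
    """Approximate size in bytes of stco/co64 + stsc for one track.

    Hypothetical helper for illustration only. Both boxes are FullBoxes:
    size(4) + type(4) + version/flags(4), followed by a 4-byte entry count.
    """
    fullbox_header = 12
    entry_count_field = 4
    offset_size = 8 if use_co64 else 4   # co64 uses 64-bit chunk offsets
    stsc_entry_size = 12                 # first_chunk, samples_per_chunk, desc index
    if stsc_entries is None:
        stsc_entries = num_chunks        # assumed worst case: every chunk differs
    stco = fullbox_header + entry_count_field + offset_size * num_chunks
    stsc = fullbox_header + entry_count_field + stsc_entry_size * stsc_entries
    return stco + stsc


# Many small interleaved chunks versus a few large ones (made-up counts).
interleaved = chunk_table_bytes(500_000, use_co64=True)
one_per_fragment = chunk_table_bytes(5_000, use_co64=True)
print(interleaved, one_per_fragment)
```

The chunk counts in the usage lines are invented, so the output only shows the 
scaling: the tables grow linearly with the number of chunks, which is why 
fewer, larger chunks give a smaller moov.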

> I have a regular MP4 of a Twitch live stream (1 video + 1 audio track) where 
> the moov alone ends up being ~40 MiB, so they can get quite large. That may 
> be part of why the newer ISO-BMFF revisions actually have a feature for 
> compressing moov/moof/sidx boxes (that nobody has implemented as far as I can 
> tell).

>> Btw, the patch in this form has a brief window during which the file can 
>> end up unrecoverable; we patch the mdat size (hiding the moof boxes) before 
>> we write the moov - if we die at that specific moment, we'd have an 
>> unreadable file. I guess it should be possible to reorder these two calls 
>> as well - but it makes for a slightly bigger patch.
>
> First writing the moov and then hiding the fragmentation is what I ended up 
> doing in my implementation. Might be overkill, but would certainly make it 
> the "safest" it can be.

Yep, that'd be my idea as well. Not really overkill IMO, it's just that 
the existing code does it in this order, and changing it makes for a 
larger patch. Possibly as a later, separate step.
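
Roughly, the reordered sequence could look like the toy sketch below. This is 
Python on an in-memory buffer, not the movenc code; the helper names, the 
8-byte "free" placeholder after the ftyp, and the assumption that the hidden 
data stays under 4 GiB (so a 32-bit mdat header is enough) are all mine, for 
illustration only.

```python
import io
import struct


def box(fourcc, payload=b""):
    """Build a plain box: 32-bit size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), fourcc) + payload


def finalize(f, placeholder_pos, moov):
    """Append the full moov first, then hide the fragments behind an mdat."""
    f.seek(0, io.SEEK_END)
    end_of_fragments = f.tell()
    # Step 1: write the complete moov at the end. If we die during or after
    # this, the placeholder is untouched and the fragments are still visible.
    f.write(moov)
    f.flush()
    # Step 2: one small overwrite that turns the placeholder into an mdat
    # header spanning the initial moov and all fragments.
    f.seek(placeholder_pos)
    f.write(struct.pack(">I4s", end_of_fragments - placeholder_pos, b"mdat"))
    f.flush()


def top_level_boxes(data):
    """List the types of the top-level boxes (32-bit sizes only)."""
    pos, out = 0, []
    while pos + 8 <= len(data):
        size, fourcc = struct.unpack_from(">I4s", data, pos)
        if size < 8:
            break
        out.append(fourcc)
        pos += size
    return out


def demo():
    f = io.BytesIO()
    f.write(box(b"ftyp", b"isom\0\0\0\0"))
    placeholder_pos = f.tell()
    f.write(box(b"free"))                 # placeholder, later becomes mdat
    f.write(box(b"moov", b"init"))        # stand-in for the empty moov
    f.write(box(b"moof", b"frag"))
    f.write(box(b"mdat", b"samples"))
    before = top_level_boxes(f.getvalue())
    finalize(f, placeholder_pos, box(b"moov", b"full"))
    after = top_level_boxes(f.getvalue())
    return before, after


print(demo())
```

The point of the ordering is that the only destructive step is the last, tiny 
overwrite; everything before it only appends.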

> In my implementation I also ended up changing the placeholder "free" atom at 
> the start to be 16 bytes so that I could write the mdat header 
> non-destructively and allow the fragmented structure to be preserved 
> entirely. This was mostly done for easier manual recovery in case something 
> goes wrong with the full moov, though again, probably overkill.

Oh, that sounds quite useful. Sounds like a good idea overall to have, 
maybe as a follow-up improvement here as well?
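
As I understand the idea, a 16-byte placeholder is exactly the size of an mdat 
header with a 64-bit largesize, so the header can be written over it without 
touching the bytes that follow. A minimal sketch of the two headers in Python; 
the function names are made up, and this is my reading of the description, not 
code from either implementation.

```python
import struct


def free_placeholder():
    """16-byte 'free' box: 8-byte header plus 8 bytes of zero padding."""
    return struct.pack(">I4s8x", 16, b"free")


def mdat_header64(total_size):
    """16-byte mdat header: size field 1 signals a 64-bit largesize.

    total_size counts from the start of this header to the end of the box.
    """
    return struct.pack(">I4sQ", 1, b"mdat", total_size)


# Both occupy the same 16 bytes, so one can replace the other in place.
print(len(free_placeholder()), len(mdat_header64(5 * 2**30)))
```

Under that assumption, going back to the fragmented view for manual recovery 
would just mean writing the 16-byte "free" box back over the mdat header.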

> Finally, I've also had a somewhat cursed thought about having a second 
> always-hidden ftyp before the initial moov, which would then allow you to use 
> the same file for progressive download and DASH/HLS streaming by using 
> range-requests (e.g. via BYTERANGE) to skip the first ftyp + mdat header for 
> the init segment and then using the fragments as normal. Though that goes 
> beyond the scope of this patch, I just had to get it out there in case 
> anybody thinks that might actually be fun to try :P

Oh, that sounds quite cursed indeed. I definitely can see the appeal of it 
:-) I guess it'd require some custom tooling to parse out the relevant 
byte ranges from it though (or maybe just listening to the avio marker 
callbacks?).
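
For the custom tooling part, a small box scanner would probably be enough to 
find the ranges. Here is a hedged sketch in Python; the layout it is tested 
against (a second ftyp hidden right after the outer mdat header) is only my 
guess at what the idea would look like, and all names are hypothetical.

```python
import struct


def box(fourcc, payload=b""):
    """Build a plain box: 32-bit size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), fourcc) + payload


def scan_boxes(data, start=0, end=None):
    """Return (offset, size, fourcc) for consecutive boxes in data[start:end]."""
    end = len(data) if end is None else end
    pos, out = start, []
    while pos + 8 <= end:
        size, fourcc = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                      # 64-bit largesize follows the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:                    # box extends to the end of the data
            size = end - pos
        if size < header or pos + size > end:
            break                          # truncated or corrupt box
        out.append((pos, size, fourcc))
        pos += size
    return out


# Assumed layout: ftyp, mdat header, hidden ftyp, init moov, fragments, moov.
hidden = (box(b"ftyp", b"iso5\0\0\0\0") + box(b"moov", b"init")
          + box(b"moof", b"frag") + box(b"mdat", b"samples"))
data = (box(b"ftyp", b"isom\0\0\0\0")
        + struct.pack(">I4s", 8 + len(hidden), b"mdat") + hidden
        + box(b"moov", b"full"))

outer = scan_boxes(data)
mdat_off, mdat_size, _ = outer[1]
# Skip the 8-byte mdat header to see the fragmented structure inside it.
inner = scan_boxes(data, start=mdat_off + 8, end=mdat_off + mdat_size)
print(outer)
print(inner)
```

The inner offsets and sizes are what a playlist generator would turn into byte 
ranges for the init segment and the fragments.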

Anyway, Timo had thoughts about the name for this option/flag - do you 
have any suggestions to follow up with on that thread?

// Martin