[FFmpeg-devel] [PATCH] movenc: Add an option for hiding fragments at the end

Sun Jun 2 23:45:40 EEST 2024

On 2024-06-02 21:36, Martin Storsjö wrote:
> On Sat, 1 Jun 2024, Dennis Sädtler via ffmpeg-devel wrote:
>
>> Should the ftyp atom also be updated to remove brands no longer 
>> required for non-fragmented files?
>> I'm not sure how important that is in real-world scenarios, so it 
>> might not be worth it to deal with some of the additional changes 
>> required e.g. to deal with the new ftyp possibly being a different size.
>
> Hmm, good point, I hadn't thought about that. I'd prefer not to do 
> that, as it becomes a bit more of a mess to change the size of the ftyp.

Indeed, though as far as I can tell only the offset and size of the mdat 
would need to be changed, since everything past the ftyp is "disposable" 
anyway.
But as I initially mentioned, I have no idea if there are real-world 
cases of players refusing a file purely based on those brands.

>> Since coincidentally I've implemented the exact same feature in a 
>> different application a couple weeks ago I'll also throw in the fun 
>> fact that files produced this way can be smaller than regular MP4s 
>> for long and/or large files.
>> This is due to the lack of interleaving of A/V samples resulting in 
>> the file having much fewer but larger chunks, which means the moov 
>> atom - mainly the stco/co64 and stsc boxes - can be much smaller.
>
> Oh, indeed, that's a good point. But on the other hand, the file ends 
> up containing all the leftover moof boxes in the mdat. But are you 
> saying that a compact moov + leftover moof, still is smaller than one 
> large moov, in your practical test cases?

Yep, that's what I meant! The initial ("empty") moov and the following 
moof boxes aren't all that big, so once you go over a certain threshold 
(which I haven't calculated) the leftover boxes cost less than the chunk 
tables save, and the file ends up smaller than a regular MP4.
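
For a rough feel of where the saving comes from, here is a 
back-of-the-envelope sketch. The per-entry sizes (4 or 8 bytes per 
stco/co64 entry, 12 bytes per stsc entry, 16 bytes of header and entry 
count per box) follow the ISO-BMFF box layouts; the chunk counts are 
made-up illustrative numbers rather than measurements, and the function 
name is just something picked for this example. It only covers the chunk 
tables of one track and ignores the leftover moof boxes.

```python
def chunk_table_bytes(num_chunks, stsc_entries, use_co64=False):
    """Approximate size of one track's stco/co64 + stsc boxes in bytes."""
    # size + type (8), version + flags (4), entry_count (4)
    full_box_header = 8 + 4 + 4
    offset_size = 8 if use_co64 else 4
    chunk_offsets = full_box_header + num_chunks * offset_size
    # Each stsc entry: first_chunk, samples_per_chunk, sample_description_index
    sample_to_chunk = full_box_header + stsc_entries * 12
    return chunk_offsets + sample_to_chunk


# Hypothetical 10 hour recording, worst case of one stsc entry per chunk.
interleaved = chunk_table_bytes(36000, 36000, use_co64=True)  # 1 chunk/second
hybrid = chunk_table_bytes(3600, 3600, use_co64=True)         # 1 chunk/10 s fragment
print(interleaved, hybrid)
```

The per-sample tables (stts, stsz, stss and so on) are the same size 
either way, so this is only the part of the moov that shrinks.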

I have a regular MP4 of a Twitch live stream (1 video + 1 audio track) 
where the moov alone ends up being ~40 MiB, so they can get quite large. 
That may be part of why the newer ISO-BMFF revisions actually have a 
feature for compressing moov/moof/sidx boxes (that nobody has 
implemented as far as I can tell).

> Btw, the patch in this form has one minimal time gap for when the file 
> can end up unrecoverable; we patch the mdat size (hiding the moof 
> boxes) before we write the moov - if we die at that specific moment, 
> we'd have an unreadable file. I guess it should be possible to reorder 
> these two calls as well - but it makes for a slightly bigger patch.

First writing the moov and then hiding the fragmentation is what I ended 
up doing in my implementation. Might be overkill, but would certainly 
make it the "safest" it can be.
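
As a minimal sketch of that ordering (not the code from the patch or 
from my implementation): it assumes a hypothetical layout with an 8-byte 
placeholder box header at a known offset and a file small enough for a 
32-bit box size. The function and parameter names are invented for 
illustration.

```python
import io
import struct


def finalize_hybrid(f, placeholder_offset, full_moov):
    """Append the full moov first, then hide the fragments in one mdat."""
    # Step 1: write the moov at the end. If we die here, the placeholder
    # is untouched and the fragmented structure is still in place.
    f.seek(0, io.SEEK_END)
    moov_offset = f.tell()
    f.write(full_moov)
    f.flush()  # a real writer would probably also want an fsync here

    # Step 2: only now turn the placeholder into an mdat header that
    # spans everything from the placeholder up to the new moov.
    mdat_size = moov_offset - placeholder_offset
    f.seek(placeholder_offset)
    f.write(struct.pack(">I4s", mdat_size, b"mdat"))
    f.flush()
    return moov_offset
```

The second step is a single 8-byte overwrite, which keeps the window in 
which the file is in an in-between state as small as it can be.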

In my implementation I also ended up changing the placeholder "free" 
atom at the start to be 16 bytes so that I could write the mdat header 
non-destructively and allow the fragmented structure to be preserved 
entirely. This was mostly done for easier manual recovery in case 
something goes wrong with the full moov, though again, probably overkill.
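
To illustrate why 16 bytes is the convenient size: an mdat header with a 
64-bit size is exactly 16 bytes (size field set to 1, the type, then the 
64-bit largesize), so it fits over a 16-byte "free" box without touching 
anything after it. A small sketch, with names chosen only for this 
example:

```python
import struct

# 16-byte "free" box: 8-byte header plus 8 bytes of padding.
FREE_PLACEHOLDER = struct.pack(">I4s", 16, b"free") + b"\x00" * 8


def mdat_large_header(total_size):
    """mdat header using the 64-bit largesize form (size field == 1)."""
    # total_size counts the whole box, including these 16 header bytes.
    return struct.pack(">I4sQ", 1, b"mdat", total_size)


header = mdat_large_header(5 * 2**30)  # e.g. a 5 GiB mdat
print(len(FREE_PLACEHOLDER), len(header))
```

Undoing the hiding by hand is then just a matter of writing the 16 
placeholder bytes back over the mdat header.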

Finally, I've also had a somewhat cursed thought about having a second 
always-hidden ftyp before the initial moov, which would then allow you 
to use the same file for progressive download and DASH/HLS streaming by 
using range-requests (e.g. via BYTERANGE) to skip the first ftyp + mdat 
header for the init segment and then using the fragments as normal. 
Though that goes beyond the scope of this patch, I just had to get it 
out there in case anybody thinks that might actually be fun to try :P
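
In case it helps to picture it, a tiny sketch of how the init segment 
range could be expressed for HLS. The layout (first ftyp, mdat header, 
hidden ftyp, initial moov) and all sizes are hypothetical, as is the 
function name; only the EXT-X-MAP tag with its BYTERANGE attribute in 
"length@offset" form comes from the HLS spec.

```python
def init_segment_map(uri, first_ftyp_size, mdat_header_size,
                     hidden_ftyp_size, init_moov_size):
    """Build an EXT-X-MAP line that skips the first ftyp + mdat header."""
    # The init segment starts right after the visible ftyp and mdat header.
    offset = first_ftyp_size + mdat_header_size
    # It covers the hidden ftyp and the initial ("empty") moov.
    length = hidden_ftyp_size + init_moov_size
    return f'#EXT-X-MAP:URI="{uri}",BYTERANGE="{length}@{offset}"'


print(init_segment_map("stream.mp4", 32, 16, 32, 700))
```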

~Dennis