[FFmpeg-devel] [PATCH 2/2] aviobuf: Avoid clearing the whole buffer in fill_buffer

Martin Storsjö martin at martin.st
Tue Mar 21 22:24:25 EET 2023


On Tue, 21 Mar 2023, Marton Balint wrote:

>
>
> On Tue, 21 Mar 2023, Martin Storsjö wrote:
>
>> Normally, fill_buffer reads in one max_packet_size/IO_BUFFER_SIZE
>> worth of data into the buffer, slowly filling the buffer until it
>> is full.
>>
>> Previously, when the buffer was full, fill_buffer would start over
>> from the start, effectively discarding all the previously buffered
>> data.
>>
>> For files that are read linearly, the previous behaviour was fine.
>>
>> For files that exhibit some amount of nonlinear read patterns,
>> especially mov files (where ff_configure_buffers_for_index
>> increases the buffer size to accommodate the nonlinear reading!)
>> we would mostly be able to seek within the buffer - but whenever
>> we've hit the maximum buffer size, we'd discard most of the buffer
>> and start over with a very small buffer, so the next seek backwards
>> would end up outside of the buffer.
>>
>> Keep one fourth of the buffered data, moving it to the start of
>> the buffer, freeing the rest to be refilled with future data.
>>
>> For mov files with nonlinear read patterns, this almost entirely
>> avoids doing seeks on the lower IO level, where we previously would
>> end up doing seeks occasionally.
>
> Maybe the demuxer should use ffio_ensure_seekback() instead if it knows
> that a seekback will happen? Unconditional memmove of even a fourth of all
> data does not seem like a good idea.

Right, it's probably not ideal to do this unconditionally.

However, the demuxer doesn't really know that a seekback _will_ happen,
unless we make it inspect the next couple of index entries. And I don't
think we should make the demuxer pre-analyze the next access locations;
I'd rather keep optimizations like this in a separate layer. That way, it
works as expected as long as the seeks are short enough to stay within the
expected tolerance, and falls back gracefully to regular seeking for the
accesses that are weirder than that.
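
As a rough illustration of that layering, here is a toy model of the
"keep the last fourth when the buffer is full" behaviour. It is only a
sketch and not the code from the patch: the ToyBuf struct and the toy_*
function names are hypothetical stand-ins, not the real AVIOContext
fields or aviobuf.c functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a read buffer; not the real AVIOContext. */
typedef struct ToyBuf {
    unsigned char *data; /* buffer storage */
    size_t size;         /* capacity in bytes */
    size_t filled;       /* number of valid bytes in data */
    long long base;      /* file offset corresponding to data[0] */
} ToyBuf;

/* If the buffer is full, keep the most recent fourth of it at the start
 * and free the rest for new data. Returns the number of bytes retained. */
static size_t toy_make_room(ToyBuf *b)
{
    size_t keep, drop;

    assert(b->filled <= b->size);
    if (b->filled < b->size)
        return b->filled;          /* still room, nothing to move */

    keep = b->size / 4;
    drop = b->filled - keep;
    memmove(b->data, b->data + drop, keep);
    b->base  += (long long)drop;   /* data[0] now maps to a later offset */
    b->filled = keep;
    return keep;
}

/* A seek can be served without touching the lower IO layer only if the
 * target offset is still covered by the buffered data. */
static int toy_seek_in_buffer(const ToyBuf *b, long long pos)
{
    return pos >= b->base && pos <= b->base + (long long)b->filled;
}
```

With something like this, a short backwards seek right after a refill
still lands inside the retained fourth, while a seek further back simply
fails the toy_seek_in_buffer() check and would go through regular
seeking.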

If we'd use ffio_ensure_seekback(), we'd make it mandatory for the aviobuf 
layer to cache the data for any insane accesses.

Some stats on the file I'm dealing with: The file is >2 GB, and is not
interleaved exactly the way the mov demuxer reads it, but roughly so - when
demuxing, the mov demuxer mostly jumps back/forward within a range of maybe
~2 MB. But at the start and end of the file, there are a couple of samples
that are way out of order, causing it to seek from one end of the file to
the other and back. So in that case, if we'd do ffio_ensure_seekback(),
we'd end up allocating a 2 GB seekback buffer.

Currently, ff_configure_buffers_for_index() correctly measures that it 
needs a large buffer to avoid seeks in this file. (The function finds a 
huge >2 GB pos_delta when inspecting all sample combinations in the file, 
but setting it to the maximum of 16 MB already helps a whole lot, see 
patch 1/2.)
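
To make the clamping concrete, a simplified sketch follows. It assumes we
just look at consecutive access positions in demux order, which is not
what ff_configure_buffers_for_index() actually does in detail;
toy_buffer_size_for() and TOY_MAX_BUF are hypothetical names for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16 MB cap, matching the maximum mentioned above. */
#define TOY_MAX_BUF (INT64_C(16) * 1024 * 1024)

/* Return the largest distance between consecutive access positions,
 * clamped to TOY_MAX_BUF, as a suggested buffer size. */
static int64_t toy_buffer_size_for(const int64_t *pos, size_t n)
{
    int64_t max_delta = 0;

    assert(pos != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int64_t d = pos[i] - pos[i - 1];
        if (d < 0)
            d = -d;
        if (d > max_delta)
            max_delta = d;
    }
    /* A few wildly out-of-order samples must not force a huge buffer. */
    return max_delta > TOY_MAX_BUF ? TOY_MAX_BUF : max_delta;
}
```

The point of the clamp is that a handful of end-to-end jumps give a
>2 GB delta, but a capped buffer still covers the common ~2 MB jumps.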

So maybe we could have ff_configure_buffers_for_index set some more flags 
to opt into behaviour like this?

// Martin

