[FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

James Almer jamrial at gmail.com
Sun Jan 21 19:47:43 EET 2024


On 1/21/2024 2:29 PM, Anton Khirnov wrote:
> Quoting James Almer (2024-01-21 13:06:28)
>> On 1/21/2024 3:27 AM, Anton Khirnov wrote:
>>> Quoting James Almer (2024-01-20 23:04:06)
>>>> This includes a struct and helpers. It will be used to support container level
>>>> cropping and tiled image formats, but should be generic enough for general
>>>> usage.
>>>>
>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>> ---
>>>> Extended to include fields used for cropping. Should make the struct reusable
>>>> even for non tiled images, e.g. setting both rows and tiles to 1, in which case
>>>> tile width and height would become analogous to coded_{width,height}.
>>>
>>> But why? What does cropping have to do with tiling? What advantage is
>>> there to handling them in one struct?
>>
>> The struct does not need to be used for non tiled image scenarios, but
>> could if we decide we don't want to add another struct that would only
>> contain a subset of the fields present here.
>>
>> As to why said fields are present here, HEIF may use a clap box to
>> define cropping for the final image, not for the tiles. This needs to be
>> propagated, and the previous version of this API, which only defined
>> cropping from right and bottom edges if output dimensions were smaller
>> than the grid (standard case for tiled heif with no clap box), was not
>> enough. Hence this change.
>>
>> I can rename this struct to Image Grid or something else, which might
>> make it feel less awkward if we decide to reuse it. We still need to
>> propagate container cropping from clap boxes and from Matroska elements
>> after all.
> 
> Honestly this whole new API strikes me as massively overthinking it. All
> you should need to describe an arbitrary partition of an image into
> sub-rectangles is an array of (x, y, width, height). Instead you're
> proposing a new public header, struct, three functions, multiple "tile
> types", and if I'm not mistaken it still cannot describe an arbitrary
> partitioning. Plus it's in libavutil for some reason, even though
> libavformat seems to be the only intended user.
> 
> Is all this complexity really warranted?

1. It needs to be usable as a Stream Group type, so a struct is 
required. Said struct needs an allocator unless we want to have its size 
be part of the ABI. I can remove the free function, but then the caller 
needs to manually free any internal data.
2. We need tile dimensions (width and height) plus row and column counts,
which give you the final size of the grid, then x and y offsets to get
the actual image within the grid meant for presentation.
3. I want to support uniform tiles as well as variable tile dimensions,
hence multiple tile types. The latter currently has no use case, but
eventually might. If you prefer, I can leave said type out at first,
but I want to keep the union in place so it and other extensions can be
added later.
4. It's in lavu because it's meant to be generic. It can also be used to
transport tiling and cropping information as stream and packet side
data, which can't depend on something defined in lavf.
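
To make the arithmetic in point 2 concrete, here is a minimal sketch for the
uniform tile case. The struct and function names (ExampleTileGrid,
example_grid_width() and so on) are hypothetical and chosen only for
illustration; they are not the fields or helpers from the patch.

```c
/* Hypothetical layout for illustration only; not the struct from the patch. */
typedef struct ExampleTileGrid {
    int tile_width, tile_height; /* dimensions shared by every uniform tile */
    int nb_rows, nb_cols;        /* number of tile rows and tile columns */
    int offset_x, offset_y;      /* top-left corner of the presented image inside the grid */
    int width, height;           /* dimensions of the image meant for presentation */
} ExampleTileGrid;

/* Full grid size: tile dimensions multiplied by the tile counts. */
static int example_grid_width(const ExampleTileGrid *g)
{
    return g->tile_width * g->nb_cols;
}

static int example_grid_height(const ExampleTileGrid *g)
{
    return g->tile_height * g->nb_rows;
}

/* Returns 1 if the presentation rectangle lies entirely inside the grid,
 * 0 otherwise. */
static int example_grid_is_valid(const ExampleTileGrid *g)
{
    if (g->offset_x < 0 || g->offset_y < 0 || g->width <= 0 || g->height <= 0)
        return 0;
    return g->offset_x + g->width  <= example_grid_width(g) &&
           g->offset_y + g->height <= example_grid_height(g);
}
```

With 512x512 tiles in 3 columns and 2 rows, the grid is 1536x1024. A
1500x1000 presentation rectangle at offset (16, 8) fits inside it, whereas a
1600 wide one at the same offset does not.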

And what do you mean by it not being able to describe arbitrary
partitioning? Isn't that what variable tile dimensions achieve?

