[FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

James Almer jamrial at gmail.com
Sun Jan 21 23:03:10 EET 2024


On 1/21/2024 4:29 PM, James Almer wrote:
> On 1/21/2024 4:02 PM, Anton Khirnov wrote:
>> Quoting James Almer (2024-01-21 19:38:50)
>>> On 1/21/2024 3:29 PM, Anton Khirnov wrote:
>>>> Quoting James Almer (2024-01-21 18:47:43)
>>>>> On 1/21/2024 2:29 PM, Anton Khirnov wrote:
>>>>>> Honestly this whole new API strikes me as massively overthinking it. All
>>>>>> you should need to describe an arbitrary partition of an image into
>>>>>> sub-rectangles is an array of (x, y, width, height). Instead you're
>>>>>> proposing a new public header, struct, three functions, multiple "tile
>>>>>> types", and if I'm not mistaken it still cannot describe an arbitrary
>>>>>> partitioning. Plus it's in libavutil for some reason, even though
>>>>>> libavformat seems to be the only intended user.
>>>>>>
>>>>>> Is all this complexity really warranted?
>>>>>
>>>>> 1. It needs to be usable as a Stream Group type, so a struct is
>>>>> required. Said struct needs an allocator unless we want to have its size
>>>>> be part of the ABI. I can remove the free function, but then the caller
>>>>> needs to manually free any internal data.
>>>>
>>>> If the struct lives in lavf and is always allocated as a part of
>>>> AVStreamGroup then you don't need a public constructor/destructor and
>>>> can still extend the struct.
>>>
>>> Yes, but that would be the case if it's only meant to be allocated by
>>> AVStreamGroup and nothing else.
>>
>> That is the case right now, no?
>>
>> If that ever changes then the constructor can be added.
>>
>>>>
>>>>> 2. We need tile dimensions (width and height) plus row and column count,
>>>>> which give you the final size of the grid, then offsets x and y to get
>>>>> the actual image within the grid meant for presentation.
>>>>> 3. I want to support uniform tiles as well as variable tile dimensions,
>>>>> hence multiple tile types. The latter currently has no use case, but
>>>>> eventually might. I can, if you prefer, not include said type at first,
>>>>> but I want to keep the union in place so it and other extensions can be
>>>>> added.
>>>>> 4. It's in lavu because it's meant to be generic. It can also be used to
>>>>> transport tiling and cropping information as stream and packet side
>>>>> data, which can't depend on something defined in lavf.
>>>>
>>>> When would you have tiling information associated with a specific
>>>> stream?
>>>
>>> Can't think of an example for tiling, but I can for cropping. If you
>>> insist on not reusing this for non-HEIF cropping usage in mp4/matroska,
>>> then ok, I'll move it to lavf.
>>
>> I still don't see why it should be a good idea to use this struct for
>> generic container cropping. It feels very much like a hammer in search
>> of a nail.
> 
> Because once we support container cropping, we will be defining a 
> stream/packet side data type that will contain a subset of the fields 
> from this struct.
> 
> If we reuse this struct, we can export a clap box as an AVTileGrid (or I
> can rename it to AVImageGrid, and tile to subrectangle) either as the
> stream group tile grid specific parameters if HEIF, or as stream side
> data otherwise.
> 
>>
>>>>
>>>>> And what do you mean with not supporting describing arbitrary
>>>>> partitioning? Isn't that what variable tile dimensions achieve?
>>>>
>>>> IIUC your tiling scheme still assumes that the partitioning is by rows
>>>> and columns. A completely generic partitioning could be irregular.
>>>
>>> A new tile type that doesn't define rows and columns can be added if
>>> needed. But the current variable tile type can support things like grids
>>> of two rows and two columns where the second row is effectively a single
>>> tile, simply by setting the second tile in said row as having a width of 0.
>>
>> The problem I see here is that every consumer of this struct then has to
>> explicitly support every type, and adding a new type requires updating
>> all callers. This seems unnecessary when "list of N rectangles" covers
>> all possible partitionings.
> 
> Well, the variable type supports a list of N rectangles where each
> rectangle has arbitrary dimensions, and you can do things like having
> three tiles/rectangles that together still form a rectangle, while
> defining row and column count. So I don't personally see the need for a
> new type to begin with.

I could remove the types and the union altogether and leave only the
array, even for uniform tiles, if you think that simplifies the API, but
it seems like a waste of memory to allocate a rows x cols array of ints
just to have the same value written for every entry.
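
To make the trade-off concrete, here is a rough sketch of the two representations being compared. It is not code from the patch: Rect, UniformGrid, uniform_grid_to_rects() and rect_list_size() are made-up names for illustration only, and the layout assumes tiles are stored in row-major order starting at (0, 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; these are NOT the structs from the patch. */
typedef struct Rect {
    int x, y, width, height;
} Rect;

typedef struct UniformGrid {
    int rows, cols;              /* number of tile rows and columns */
    int tile_width, tile_height; /* every tile shares these dimensions */
} UniformGrid;

/*
 * Expand a uniform grid into an explicit, row-major list of rectangles.
 * Returns an allocated array of rows * cols entries that the caller must
 * free(), or NULL on invalid input or allocation failure.
 */
static Rect *uniform_grid_to_rects(const UniformGrid *g, size_t *count)
{
    assert(g && count);
    *count = 0;

    if (g->rows <= 0 || g->cols <= 0 ||
        g->tile_width <= 0 || g->tile_height <= 0)
        return NULL;

    size_t n = (size_t)g->rows * (size_t)g->cols;
    Rect *rects = calloc(n, sizeof(*rects));
    if (!rects)
        return NULL;

    for (int r = 0; r < g->rows; r++) {
        for (int c = 0; c < g->cols; c++) {
            Rect *t = &rects[(size_t)r * (size_t)g->cols + (size_t)c];
            t->x      = c * g->tile_width;
            t->y      = r * g->tile_height;
            t->width  = g->tile_width;  /* same value repeated per entry */
            t->height = g->tile_height; /* same value repeated per entry */
        }
    }

    *count = n;
    return rects;
}

/* Bytes the explicit rectangle list needs for the same grid. */
static size_t rect_list_size(const UniformGrid *g)
{
    return (size_t)g->rows * (size_t)g->cols * sizeof(Rect);
}
```

With these assumed types, the uniform description stays at sizeof(UniformGrid) regardless of the tile count, while the explicit list grows by sizeof(Rect) per tile and repeats the same width and height in every entry. The list form is the only one of the two that can also express irregular partitions, which is the other side of the argument above.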

