[FFmpeg-devel] [PATCH 1/2 v2] avutil: add a Tile Grid API

James Almer jamrial at gmail.com
Sun Jan 21 21:29:58 EET 2024


On 1/21/2024 4:02 PM, Anton Khirnov wrote:
> Quoting James Almer (2024-01-21 19:38:50)
>> On 1/21/2024 3:29 PM, Anton Khirnov wrote:
>>> Quoting James Almer (2024-01-21 18:47:43)
>>>> On 1/21/2024 2:29 PM, Anton Khirnov wrote:
>>>>> Honestly this whole new API strikes me as massively overthinking it. All
>>>>> you should need to describe an arbitrary partition of an image into
>>>>> sub-rectangles is an array of (x, y, width, height). Instead you're
>>>>> proposing a new public header, struct, three functions, multiple "tile
>>>>> types", and if I'm not mistaken it still cannot describe an arbitrary
>>>>> partitioning. Plus it's in libavutil for some reason, even though
>>>>> libavformat seems to be the only intended user.
>>>>>
>>>>> Is all this complexity really warranted?
>>>>
>>>> 1. It needs to be usable as a Stream Group type, so a struct is
>>>> required. Said struct needs an allocator unless we want to have its size
>>>> be part of the ABI. I can remove the free function, but then the caller
>>>> needs to manually free any internal data.
>>>
>>> If the struct lives in lavf and is always allocated as a part of
>>> AVStreamGroup then you don't need a public constructor/destructor and
>>> can still extend the struct.
>>
>> Yes, but that would be the case if it's only meant to be allocated by
>> AVStreamGroup and nothing else.
> 
> That is the case right now, no?
> 
> If that ever changes then the constructor can be added.
> 
>>>
>>>> 2. We need tile dimensions (width and height) plus row and column count,
>>>> which give you the final size of the grid, then offsets x and y to get
>>>> the actual image within the grid meant for presentation.
>>>> 3. I want to support uniform tiles as well as variable tile dimensions,
>>>> hence multiple tile types. The latter currently has no use case, but
>>>> eventually might. If you prefer, I can leave said type out at first,
>>>> but I want to keep the union in place so it and other extensions can be
>>>> added.
>>>> 4. It's in lavu because it's meant to be generic. It can also be used to
>>>> transport tiling and cropping information as stream and packet side
>>>> data, which can't depend on something defined in lavf.
>>>
>>> When would you have tiling information associated with a specific
>>> stream?
>>
>> Can't think of an example for tiling, but I can for cropping. If you
>> insist on not reusing this for non-HEIF cropping usage in mp4/matroska,
>> then ok, I'll move it to lavf.
> 
> I still don't see why it should be a good idea to use this struct for
> generic container cropping. It feels very much like a hammer in search
> of a nail.

Because once we support container cropping, we will be defining a 
stream/packet side data type that will contain a subset of the fields 
from this struct.

If we reuse this struct, we can export a clap box as an AVTileGrid (or I
can rename it to AVImageGrid, and tile to subrectangle), either as the
stream group's tile-grid-specific parameters if HEIF, or as stream side
data otherwise.
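
To make the "subset of the fields" point concrete, here is a rough sketch
of how a single struct could cover both cases. GridSketch and every field
and function name in it are made up for illustration and are not the
struct from the patch. The assumption is that plain container cropping
can be seen as a 1x1 grid that only uses the offset and output size
fields.

```c
/* Hypothetical struct for illustration only; not the API from the patch. */
typedef struct GridSketch {
    int tile_width, tile_height; /* uniform tile dimensions */
    int rows, cols;              /* number of tile rows and columns */
    int offset_x, offset_y;      /* top-left corner of the presented image */
    int width, height;           /* size of the presented image */
} GridSketch;

/* Size of the full canvas implied by the tiles. */
static int grid_canvas_width(const GridSketch *g)
{
    return g->cols * g->tile_width;
}

static int grid_canvas_height(const GridSketch *g)
{
    return g->rows * g->tile_height;
}

/* Returns 1 if the presented image lies entirely inside the canvas. */
static int grid_crop_is_valid(const GridSketch *g)
{
    return g->offset_x >= 0 && g->offset_y >= 0 &&
           g->width > 0 && g->height > 0 &&
           g->offset_x + g->width  <= grid_canvas_width(g) &&
           g->offset_y + g->height <= grid_canvas_height(g);
}

/* HEIF-like case: 2 rows x 3 columns of 512x512 tiles, presenting 1500x1000. */
static GridSketch sketch_heif_grid(void)
{
    GridSketch g = { 512, 512, 2, 3, 0, 0, 1500, 1000 };
    return g;
}

/* Plain cropping: one 1920x1088 "tile", presenting the top 1920x1080. */
static GridSketch sketch_plain_crop(void)
{
    GridSketch g = { 1920, 1088, 1, 1, 0, 0, 1920, 1080 };
    return g;
}
```

In the cropping case rows and cols carry no information, which is the
part a dedicated side data type would leave out.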

> 
>>>
>>>> And what do you mean with not supporting describing arbitrary
>>>> partitioning? Isn't that what variable tile dimensions achieve?
>>>
>>> IIUC your tiling scheme still assumes that the partitioning is by rows
>>> and columns. A completely generic partitioning could be irregular.
>>
>> A new tile type that doesn't define rows and columns can be added if
>> needed. But the current variable tile type can support things like grids
>> of two rows and two columns where the second row is effectively a single
>> tile, simply by setting the second tile in said row as having a width of 0.
> 
> The problem I see here is that every consumer of this struct then has to
> explicitly support every type, and adding a new type requires updating
> all callers. This seems unnecessary when "list of N rectangles" covers
> all possible partitionings.

Well, the variable type supports a list of N rectangles where each 
rectangle has arbitrary dimensions, and you can do things like having 
three tiles/rectangles that together still form a rectangle, while 
defining row and column count. So I don't personally see the need for a 
new type to begin with.
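
As an illustration of the three-tile case, a variable grid could be
flattened into a plain list of rectangles along these lines. This is only
a sketch: TileDim, RectSketch and grid_to_rects() are hypothetical names,
and it assumes tiles are stored row-major, placed left to right, that a
zero-width tile is skipped, and that each row takes its height from its
first tile. None of that is taken from the patch itself.

```c
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct TileDim    { int w, h; }       TileDim;
typedef struct RectSketch { int x, y, w, h; } RectSketch;

/*
 * Expand rows * cols tile dimensions (row-major) into positioned rectangles.
 * Tiles with a width of 0 are treated as absent and skipped.
 * "out" must have room for rows * cols entries.
 * Returns the number of rectangles written.
 */
static size_t grid_to_rects(const TileDim *tiles, int rows, int cols,
                            RectSketch *out)
{
    size_t n = 0;
    int y = 0;

    for (int r = 0; r < rows; r++) {
        int x = 0;
        /* Assumption: the first tile of a row defines the row height. */
        int row_h = tiles[r * cols].h;

        for (int c = 0; c < cols; c++) {
            const TileDim *t = &tiles[r * cols + c];
            if (t->w == 0)
                continue; /* absent tile, its neighbour covers the row */
            out[n++] = (RectSketch){ x, y, t->w, t->h };
            x += t->w;
        }
        y += row_h;
    }
    return n;
}
```

With tiles {100x50, 100x50, 200x50, 0x0} and a 2x2 grid, this yields
three rectangles that together cover a 200x100 canvas, the second row
being a single 200x50 tile.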

> 
> That does not mean you actually have to store it that way - the struct
> could be a list of N rectangles logically, while actually being
> represented more efficiently (in the same way a channel layout is always
> logically a list of channels, even though it's often represented by a
> uint64 rather than a malloced array).
> 
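
For reference, the "logically a list, stored compactly" idea could look
roughly like the sketch below, where consumers only ever call one
accessor and never look at the storage kind. All names here
(PartitionSketch, PartRect, partition_get_rect()) are hypothetical, and
the uniform case assumes row-major tile order.

```c
/* Hypothetical types for illustration only. */
typedef struct PartRect { int x, y, w, h; } PartRect;

enum PartKind {
    PART_UNIFORM, /* compact: rectangles derived from cols and tile size */
    PART_CUSTOM   /* explicit array of rectangles */
};

typedef struct PartitionSketch {
    enum PartKind kind;
    int nb_rects; /* logical number of rectangles, whatever the storage */
    union {
        struct { int cols, tile_w, tile_h; } uniform;
        const PartRect *custom;
    } u;
} PartitionSketch;

/*
 * Fetch rectangle number idx of the logical list.
 * Returns 0 on success, -1 if idx is out of range.
 */
static int partition_get_rect(const PartitionSketch *p, int idx, PartRect *out)
{
    if (idx < 0 || idx >= p->nb_rects)
        return -1;

    if (p->kind == PART_UNIFORM) {
        /* Row-major position computed on the fly, nothing is allocated. */
        out->x = (idx % p->u.uniform.cols) * p->u.uniform.tile_w;
        out->y = (idx / p->u.uniform.cols) * p->u.uniform.tile_h;
        out->w = p->u.uniform.tile_w;
        out->h = p->u.uniform.tile_h;
    } else {
        *out = p->u.custom[idx];
    }
    return 0;
}
```

A new storage kind would then only touch the accessor, not every caller.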