[FFmpeg-devel] [PATCH V6 4/6] lavu: add side data AV_FRAME_DATA_BOUNDING_BOXES

Guo, Yejun yejun.guo at intel.com
Mon Apr 5 12:56:17 EEST 2021



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Nicolas
> George
> Sent: 2021年4月4日 18:02
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V6 4/6] lavu: add side data
> AV_FRAME_DATA_BOUNDING_BOXES
> 
> Guo, Yejun (12021-04-01):
> > > Is this allowed to be negative? Can/should this be size_t?
> > There was a long discussion about size_t/int/uint32_t when I added
> > struct AVRegionOfInterest in frame.h, and finally size_t is not chosen.
> 
> Then at least unsigned.

yes, 'unsigned int' is another option. but imho, it might be better to use
the same data type of AVFrame.width/height, they are 'int'.

> 
> > yes, we can add a version number (enum AVBBoxHeaderVersion, see below)
> at the
> > beginning of the struct, and the user needs to set it with
> AV_BBOX_HEADER_CURRENT.
> > It is not elegant, is this what we really want?
> 
> A version number does not make the structure compatible. At best,
> applications check the version and detect they do not support this one
> and report an error. At worse, they do not bother to check and it causes
> strange bugs. The version number does not help with compatibility, it
> just makes it detectable.

My idea was to use separate code path for AVBoundingBoxHeaderV1, 
AVBoundingBoxHeaderV2, ... (not union in my previous email) according to
the version number. And yes, it doesn't work if the user does not set or
check the version number correctly.

> 
> To make the structure compatible, we have to ask ourselves: What happens
> if we update it? What breaks compatibility? Once we have an answer, we
> find a workaround.
> 
> In this case, what happens is that the variable array must be at the
> end, and therefore its offset changes. And we cannot have a second
> variable array (like the name! you had to set a constant size), and we
> cannot update the type of the array elements.
> 
> And the solution becomes obvious: let us store the offsets in the
> structure.

Thanks a lot for your nice explanation! thank you.

> 
> So, that would be:
> 
> 	typedef struct AVBoundingBoxHeader {
> 	...
> 	    /**
> 	     * Offset of the array of bounding boxes.
> 	     */
> 	    size_t bboxes_offset;
> 
> 	    /**
> 	     * Offset increment in the array of bounding boxes.
> 	     */
> 	    size_t bboxes_step;
> 	};
> 
> Note that with that solution, we do not need the empty array, we can do
> this:
> 
> AVBoundingBoxHeader *ff_bounding_box_alloc(size_t nb_bbox)
> {
>     struct {
> 	AVBoundingBoxHeader header;
> 	AVBoundingBox boxes[nb_bbox];
>     } *ret;
>     ret = av_mallocz(sizeof(*ret));
>     /* add error checks */
>     ret->header->bboxes_offset = (char *)&ret->boxes - (char *)ret->header;
>     ret->header->bboxes_step = sizeof(*ret->boxes);
> }
> 
> #define AV_BOUNDING_BOX_GET(header, idx) \
>     ((AVBoundingBox *)((char *)(header) + (header)->bboxes_offset + (idx) *
> (header)->bboxes_step))

thanks, and I'll update the naming as what James mentioned.

> 
> You can do the same to lift the 128 limit on the name.

yes, we can now handle more variable arrays with this method.

For the special case 'char source[128]' in AVBoundingBoxHeader, it will be written
for several times, for example, vf_dnn_detect.c writes model name of detection
into source, and later, vf_dnn_classify.c (in plan) will append the model name of
classification into source. So, source will be at the end of the side data, and the side
data will be reallocated when needed.

> 
> Regards,
> 
> --
>   Nicolas George


More information about the ffmpeg-devel mailing list