[FFmpeg-devel] [PATCH V3 3/3] libavfilter: add filter dnn_detect for object detection

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Mon Mar 1 17:28:01 EET 2021


Nicolas George:
> Andreas Rheinhardt (12021-03-01):
>>> thanks for the info, this struct is expected to be in side_data in the future, 
>>> I'll add 'bboxes[1]' in it, and allocate sizeof(*header) + (nb_bbox - 1) * sizeof(*bbox).
>>
>> Notice that in this case it is undefined behaviour to access any of the
>> boxes outside of BoundingBoxHeader (i.e. when using header->bboxes[i],
>> the compiler is allowed to infer that i == 0 as all other cases would be
>> undefined behaviour).
> 
> Are you sure about it? Can you quote the standard?
> 
> Anyway, even if this is true, we can work around it with an extra
> pointer or a cast, possibly wrapped in a macro. Saving a few dynamic
> allocation is well worth the unusual code; we should shoo away the
> people who oppose to go work on GStreamer or something.
> 
1. The C standards committee seems to think so (at least, it did so a
few decades ago; the current composition has probably changed considerably):
"The validity of this construct has always been questionable.  In the
response to one Defect Report, the Committee decided that it was
undefined behavior because the array p->items contains only one item,
irrespective of whether the space exists." (p.74 from
http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf)
2. 6.5.6 (the section on pointer arithmetic) contains this:
"When an expression that has integer type is added to or subtracted from
a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough,"
It also contains a provision that pointer arithmetic is legal up to one
past the last element of the array, as long as the pointer pointing one
past the last element is not dereferenced. Everything else is UB.
So whether using [1] as a flexible-array member is UB or not depends
upon whether the array object still counts as containing only one
element when it is declared as [1] even when the allocated size is
sufficient for many more elements. Let's call them the small- and
big-object interpretations.
[1] contains a bug report where someone claims that using [1] led to a
miscompilation and fixed it by using [0] (a GCC extension). Yet said
code also did not put the [1] array at the end of the struct, so IMO it
doesn't count.
Compilers seem to intentionally treat [0] and [1] specially in order not
to break code using this feature. When accessing array[3] of a two
element array located at the end of a struct, I get a warning from GCC
and Clang; I get no such warning for a one element array.
Also notice that when using an array of arrays GCC uses the small-object
interpretation: In the code fragment

extern int a[][2];

int f (int i, int j)
{
  int t = a[1][j];
  a[0][i] = 0;
  return a[1][j] - t;
}

GCC always returns 0, i.e. it treats it as UB if i were outside of
0..1, so that a[0][i] and a[1][j] don't alias. Yet if the big-object
interpretation is the correct one, then this optimization is invalid: a
is first transformed to a pointer to its first element which (as an
array) is transformed to a pointer to its first element and then pointer
arithmetic (adding i) is applied to this pointer. And with the
big-object interpretation the compiler knows nothing about the real size
of the object it points to (except that it is at least two ints big).
If one makes the implicit array-to-pointer transformation explicit (i.e.
uses "(&a[0][0])[i] = 0;" or "((int*)(a[0]))[i] = 0;"), then GCC no
longer makes this optimization.
Other compilers are not that aggressive [2].
3. As you probably guessed from the last part of 2., I don't think that
casting avoids the possibility of undefined behaviour* (because this
conversion is done automatically anyway); but it might very well protect
the code from aggressively optimizing compilers.
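
For comparison, here is a minimal sketch of the C99 flexible array member
form, for which the small- vs. big-object question from 2. does not arise:
the array is declared without a size, so the header contributes no
element of its own and the allocation is sizeof(*header) + nb_bbox *
sizeof(*bbox). Only the names BoundingBoxHeader and bboxes come from this
thread; the BoundingBox fields and the helpers bbox_header_alloc() and
bbox_at() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical box layout; the real fields are not given in this thread. */
typedef struct BoundingBox {
    int x, y, w, h;
} BoundingBox;

typedef struct BoundingBoxHeader {
    size_t nb_bbox;
    BoundingBox bboxes[];   /* C99 flexible array member; must be last */
} BoundingBoxHeader;

/* Allocate the header and nb_bbox zero-initialized boxes in one block.
 * Returns NULL on overflow or allocation failure. */
static BoundingBoxHeader *bbox_header_alloc(size_t nb_bbox)
{
    BoundingBoxHeader *header;

    /* sizeof does not evaluate its operand, so using header here is fine. */
    if (nb_bbox > (SIZE_MAX - sizeof(*header)) / sizeof(header->bboxes[0]))
        return NULL;

    header = calloc(1, sizeof(*header) + nb_bbox * sizeof(header->bboxes[0]));
    if (header)
        header->nb_bbox = nb_bbox;
    return header;
}

/* Checked accessor: any index below nb_bbox lies inside the allocation. */
static BoundingBox *bbox_at(BoundingBoxHeader *header, size_t i)
{
    assert(i < header->nb_bbox);
    return &header->bboxes[i];
}
```

A caller would do something like "header = bbox_header_alloc(n);", fill
the boxes through bbox_at(header, i) and release everything with a single
free(header), so the one-allocation property of the [1] variant is kept.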

- Andreas

[1]: https://lkml.org/lkml/2015/2/18/407
[2]: https://godbolt.org/z/j46bMn

*: In the int a[][2] case using ((int*)a)[i] would indeed avoid the
undefined behaviour, because even with the small-object interpretation
the object that is pointed to is the big one. But for the cases that you
are interested in this is not so.

