[FFmpeg-devel] [PATCH v2] Replace arrays of pointers to strings by arrays of strings

Andreas Rheinhardt andreas.rheinhardt at gmail.com
Tue Jan 12 03:41:57 EET 2021


Marton Balint:
> 
> 
> On Wed, 6 Jan 2021, Andreas Rheinhardt wrote:
> 
>> When the difference of the longest size and the average size of
>> collection of strings is smaller than the size of a pointer, it makes
>> sense to store the strings directly in an array instead of using an
>> array of pointers to strings (unless doing so precludes deduplicating
>> strings); doing so also avoids relocations. This requirement is
>> fulfilled for several arrays of pointers to strings that are changed in
>> this commit.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
>> ---
>> Now used sizeof("longest string") for the array element size.
> 
> Duplicating the "longest string" is very ugly, I don't find it better
> than a constant number, because a sizeof still does not say it tries to
> find the longest string.
> 
> TBH I am not a fan of these kind of patches for a few byte of savings,
> and when changing the list of strings it can actually become worse which
> won't be checked by anybody...
> 
When using position independent code, each of these pointers that is
different from NULL requires a relocation (on ELF, these are quite
expensive: 24 byte when using 64 byte systems), so that these arrays of
pointers can't be in .rodata, but only in .data (or .data.rel.ro). Even
worse, these static const objects aren't initialized in a lazy manner,*
so that pages containing pointers to be relocated are automatically
dirty even when they are actually never used. I think it is of the
utmost importance to counter this and this patch does so in a few instances.
Of course I am all ears for how to make it clear that someone who
modifies the strings also needs to check the array dimensions.

(Here is a snippet for those who want to test this for themselves:

+typedef struct Entry {
+    const char *str;
+    char array[4088];
+} Entry;
+
+#define REPEAT(...) __VA_ARGS__, __VA_ARGS__, __VA_ARGS__, __VA_ARGS__
+#define REPEAT2(...)
REPEAT(REPEAT(REPEAT(REPEAT(REPEAT(REPEAT(REPEAT(__VA_ARGS__)))))))
+
+const Entry unused[16384] = { REPEAT2({ "", "" }) };
+
)

- Andreas

*: It seems that Windows does better:
https://devblogs.microsoft.com/oldnewthing/20160413-00/?p=93301


More information about the ffmpeg-devel mailing list