[FFmpeg-devel] [PATCH v3] libavutil: Add AVMap

Michael Niedermayer michael at niedermayer.cc
Sat Apr 19 19:23:54 EEST 2025


Hi

On Sat, Apr 19, 2025 at 03:03:16PM +0200, Nicolas George wrote:
> Michael Niedermayer (HE12025-04-17):
> > ok, i will change teh name
> 
> Thanks.
> 
> > You can implement an av_map_new() variant thats on the stack
> > if thats what you want to have / to use / to do
> > once the code is in git
> > or maybe i will do it but there are many more interresting things
> 
> I have checked, it can be done after the fact. (I am surprised you do
> not consider this interesting).

There are too many interresting things. Its interresting but not
interresting enough. Also its easy to add later


> 
> > yes, mails take alot of time to read and reply to.
> 
> Sorry, I had assumed you would have replied before posting an updated
> version because that is what I would have done.
> 
> 
> Summary of the rest of the message:
> 
> - Keep the core API simple:
> 
>     - no duplicated values;
> 
>     - one compare function set once and for all at creation time.
> 
> - Implement extra features as utility functions on top of the core API.
> 
> 
> > thats neither efficient nor does it fit the existing APIs.
> > The value is most of the time not used and when used it is known exactly
> > where it is. after a string compare of the key thats equal the value location is known
> > or with arbitrary binary data the compare function knows the size of the data.
> > also existing APIs from av_tree strcmp() av_strcasecmp() and so on none of
> > which take 4 pointers from which they would ignore 2.
> 
> The more I think on if, the more I think this idea of giving the value
> to the compare function is a bad idea. It is a break of abstraction,
> unlike most other dictionaries API. The concatenation bit is kind of
> exposing the specifics of the implementation.
> 
> IIUC, the point is to use it to allow multiple values with the same key.
> But it also introduces de-duplication that might not be wanted.
> 
[...]
>
> If API-users want multiple values with de-dupliction like that, it is
> simpler to do on their side by making the value part of the key, and
> using dummy values:
> 
>     "x" => "first"     |    "x:first" => ""
>     "x" => "second"    |    "x:second" => ""
>     "y" => "other"     |    "y:other" => ""
> 
> Callers can choose the best delimiters for their needs.
> 
> If API-users want multiple values without de-duplication, better treat
> the value itself as an array/list somehow. If the code does not treat \0
> as special in the value, we can do a lot of things like easily.
> 
> We can offer utility functions to support these patterns if that happens
> to be useful. It is better than having these minor features affect the
> design of the data structure.

I dont know what you are trying to acomplish here.

The key-value part is a fundamental part of the design and it leads to
multiple advantages.
it makes the code simpler, faster and enables several features that
otherwise would be messy/complex

Also I have no interrest nor time for working on someone elses design
maybe if that design would convince me but this one does not.

I think you should resubmit AVWriter and work on it. I think most people
will respect your design choices unless there are really good technical
reasons to do something differently.
The same way I also expect you to respect my choice in AVMap unless theres
really something better


> 
> > Passing redudant pointers just wastes cpu cycles
> 
> Does it make a non-negligible difference? Because when it comes to extra
> pointers, there is more: real compare function might need a global
> argument to determine their behavior: strcoll_l() and its third argument
> for example.
> 
> If the extra pointer does make a significant difference, I have a
> solution too. It requires a flags argument to the creation function
> would help.

There is no need for a "user context" pointer in the core API because the user
specified key is always passed in the first argument of the compare function.
A struct {Context *c, char *key} can be passed in that if someone really ever
needs this, noone needed this till now it seems.


> 
> > the compare functions are needed purely for setup. And there
> > they are needed because of arbitrary data support. av_map cannot
> > know how to compare your data. (if its not strings)
> 
> Of course, we need to be able to set the compare function. My point it:
> singular: ONE compare function for the whole map, set at creation and
> never changed afterwards.
> 
> > But the existing code is much simpler:
> > 
> >     AVMapEntry e = NULL;
> >     while (e = av_map_get_multiple(set, e, "author", AV_MAP_CMP_CASE_INSENSITIVE | AV_MAP_CMP_KEY) {
> >         printf("Tag key:%s value:%s\n", e->key, e->value);
> >     }
> 
> That could be done as an utility function wrapping the “vs” code below I
> snipped. It is better than letting this complicate the core API.
> 
> > Also the code above will just give wrong results with no warning if the compare
> > function the map is setup with is incompatible with
> 
> I think you forgot the end of the sentence.
> 
> Reading more carefully, the business with the multiple compare function
> makes even less sense to me:
> 
> >>> + *                     it must form a strict total order on all elements you want to store.
> 
> >>> + * @param cmp       compatible compare function that comapres key or keyvalues
> 
> There is only order compatible with a total order is itself. As is, the
> documentation says all the compare functions must return the exact same
> thing, but that cannot be what it means.

 * @param cmp compare function to be added.
 *            cmp(a,b) must return 0 or be equal to the previously added compare function for (a,b), if it returns 0 it also must do so for all
 *            elements between a and b

You are the mathematican, my terminology is likely off by at least a bit if not more

an example:
if the map is initialized with:

int av_map_supercmp_key(const char *a, const char *b){
    int v = av_strcasecmp(a,b);
    if (!v)
        v = strcmp(a,b);

    return v;
}

then av_strcasecmp() is compatible

this allows finding both case sensitive and case insensitve matches.

In code that allows both these with the same map.

AVMapEntry e = NULL;
while (e = av_map_get_multiple(set, e, "author", AV_MAP_CMP_CASE_INSENSITIVE | AV_MAP_CMP_KEY) {
    printf("Tag key:%s value:%s\n", e->key, e->value);
}

AVMapEntry e = NULL;
while (e = av_map_get_multiple(set, e, "author", AV_MAP_CMP_CASE_SENSITIVE | AV_MAP_CMP_KEY) {
    printf("Tag key:%s value:%s\n", e->key, e->value);
}



Basically by chaining increasingly specific compare functions in code
(like av_strcasecmp and strcmp in the example)
it is later possible to stop before all functions are run and have
wider matches (like case insensitve in  the example)

Of course for adding elements, always the same and most specific function
is used so everything is always ordered the same way. Its only the
find/get stuff that can use these other functions
But all this really is under the hood and the user doesnt need to know
the user just asks for case insensitve and gets it as long as the av_map
was settup in a way that allows it.

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250419/e3ca7609/attachment.sig>


More information about the ffmpeg-devel mailing list