[FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that decode_str() did advance
Andreas Rheinhardt
andreas.rheinhardt at outlook.com
Wed Apr 16 13:57:27 EEST 2025
Michael Niedermayer:
> Hi softworkz
>
> On Wed, Apr 16, 2025 at 02:52:21AM +0000, softworkz . wrote:
>>
>>
>>> -----Original Message-----
>>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
>>> Michael Niedermayer
>>> Sent: Wednesday, 16 April 2025 03:34
>>> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
>>> Subject: Re: [FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that
>>> decode_str() did advance
>>>
>>> On Wed, Apr 16, 2025 at 01:29:02AM +0000, softworkz . wrote:
>>> [...]
>>>>>> This will cause deserialization errors for many people in the world
>>>>>> who are processing FFprobe data.
>>>>>
>>>>> As said, ffprobe should not produce troublesome output
>>
>> First of all, any patch MUST NOT introduce behavior that goes
>> against our own specifications.
>>
>> From avformat.h:
>>
>> /**
>> * @defgroup metadata_api Public Metadata API
>> * @{
>> * @ingroup libavf
>> * The metadata API allows libavformat to export metadata tags to a client
>> * application when demuxing. Conversely it allows a client application to
>> * set metadata when muxing.
>> *
>> * Metadata is exported or set as pairs of key/value strings in the 'metadata'
>> * fields of the AVFormatContext, AVStream, AVChapter and AVProgram structs
>> * using the @ref lavu_dict "AVDictionary" API. Like all strings in FFmpeg,
>> * metadata is assumed to be UTF-8 encoded Unicode. Note that metadata
>> * exported by demuxers isn't checked to be valid UTF-8 in most cases.
>> *
>> * Important concepts to keep in mind:
>> * - Keys are unique; there can never be 2 tags with the same key. This is
>> * also meant semantically, i.e., a demuxer should not knowingly produce
>> * several keys that are literally different but semantically identical.
>> * E.g., key=Author5, key=Author6. In this example, all authors must be
>> * placed in the same tag.
>> * - Metadata is flat, not hierarchical; there are no subtags. If you
>> * want to store, e.g., the email address of the child of producer Alice
>> * and actor Bob, that could have key=alice_and_bobs_childs_email_address.
>> * - Several modifiers can be applied to the tag name. This is done by
>> * appending a dash character ('-') and the modifier name in the order
>> * they appear in the list below -- e.g. foo-eng-sort, not foo-sort-eng.
>> * - language -- a tag whose value is localized for a particular language
>> * is appended with the ISO 639-2/B 3-letter language code.
>> * For example: Author-ger=Michael, Author-eng=Mike
>> * The original/default language is in the unqualified "Author" tag.
>> * A demuxer should set a default if it sets any translated tag.
>> * - sorting -- a modified version of a tag that should be used for
>> * sorting will have '-sort' appended. E.g. artist="The Beatles",
>> * artist-sort="Beatles, The".
>>
>>
>> Especially:
>>
>> * E.g., key=Author5, key=Author6. In this example, all authors must be
>> * placed in the same tag.
>>
>> I think this states very clearly how it has to be done and how not.
>
> This was written by me 16 years ago,
> and it made sense at the time. Our APIs did not support multiple values per
> key.
> The API has supported multiple values per key for 9 years. Using said API is
> more convenient than having to parse and escape ";" separators.
> That's for our code: it's simpler for a muxer to iterate over an AVDictionary key
> than to parse a ";"-separated string (with undefined escaping rules),
> and easier to build a ";" string from a multi-value AVDictionary than to assume
> the internal ";" escaping rules (whatever they would be) match the target format.
This is all true, but it does not change the fact that we are bound by our API
guarantees.
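
To make the trade-off concrete, here is a minimal sketch of what "all
authors in the same tag" implies for whoever produces the value. It is
plain C and deliberately does not touch libavutil: join_values() is a
hypothetical helper, and the escaping rule it uses (a backslash before
';' and '\') is purely an assumption for illustration, since no such rule
is actually specified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not an FFmpeg function: join n values into one
 * ';'-separated string. The escaping rule (backslash before ';' and '\')
 * is an assumption made for this sketch only.
 * The caller frees the result; returns NULL if allocation fails. */
static char *join_values(const char *const *values, size_t n)
{
    size_t len = 1; /* terminating NUL */
    for (size_t i = 0; i < n; i++) {
        assert(values[i]);
        /* worst case: every byte escaped, plus one separator */
        len += 2 * strlen(values[i]) + 1;
    }

    char *out = malloc(len);
    if (!out)
        return NULL;

    char *p = out;
    for (size_t i = 0; i < n; i++) {
        if (i)
            *p++ = ';';
        for (const char *s = values[i]; *s; s++) {
            if (*s == ';' || *s == '\\')
                *p++ = '\\'; /* escape separator and escape character */
            *p++ = *s;
        }
    }
    *p = '\0';
    return out;
}
```

With the two values "Alice" and "Bob; Jr." this yields the single string
Alice;Bob\; Jr. and any consumer has to know the same escaping rule to
split it again, which is exactly the weak spot of the joined form.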
>
> commit 47146dfbf6bca94dd0706b4313cc5e26edaf18d4
> Author: Michael Niedermayer <michaelni at gmx.at>
> Date: Sun Jan 4 18:48:37 2009 +0000
>
> Generic metadata API.
> avi is updated as example.
> No version bump, the API still might change slightly ...
> No update to ffmpeg.c as requested by aurel.
>
> Originally committed as revision 16424 to svn://svn.ffmpeg.org/ffmpeg/trunk
>
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index e45f7d6dfb3..7038d2d2c41 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -400,6 +400,51 @@ enum SampleFormat {
> */
> #define FF_MIN_BUFFER_SIZE 16384
>
> +
> +/*
> + * public Metadata API.
> + * Important concepts, to keep in mind
> + * 1. keys are unique, there are never 2 tags with equal keys, this is also
> + * meant semantically that is a demuxer should not knowingly produce
> + * several keys that are litterally different but semantically identical,
> + * like key=Author5, key=Author6.
> + * All authors have to be placed in the same tag for the case of Authors.
> + * 2. Metadata is flat, there are no subtags, if you for whatever obscene
> + * reason want to store the email address of the child of producer alice
> + * and actor bob, that could have key=alice_and_bobs_childs_email_address.
> + * 3. A tag whichs value is translated has the ISO 639 3-letter language code
> + * with a '-' between appended. So for example Author-ger=Michael, Author-eng=Mike
> + * the original/default language is in the unqualified "Author"
> + * A demuxer should set a default if it sets any translated tag.
> + */
>
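
For reference, the modifier ordering from the avformat.h comment quoted
earlier (language code first, then "-sort", e.g. foo-eng-sort) can be
sketched as follows. make_tag_key() is a hypothetical helper written for
this mail, not something libavformat provides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libavformat: compose a metadata key
 * from a base name plus optional modifiers, in the order given by the
 * avformat.h comment (language first, then "sort"). lang may be NULL.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int make_tag_key(char *buf, size_t size, const char *base,
                        const char *lang, int sort)
{
    assert(buf && base);
    int n = snprintf(buf, size, "%s%s%s%s", base,
                     lang ? "-" : "", lang ? lang : "",
                     sort ? "-sort" : "");
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Calling make_tag_key(key, sizeof(key), "foo", "eng", 1) fills key with
foo-eng-sort; passing NULL for lang and 0 for sort leaves the plain,
unqualified key.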