[FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that decode_str() did advance

softworkz . softworkz at hotmail.com
Tue Apr 15 22:59:00 EEST 2025



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Michael Niedermayer
> Sent: Tuesday, 15 April 2025 20:56
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that
> decode_str() did advance
> 
> On Mon, Apr 14, 2025 at 11:59:02PM +0000, softworkz . wrote:
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> > > Michael Niedermayer
> > > > Sent: Tuesday, 15 April 2025 01:20
> > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel at ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that
> > > decode_str() did advance
> > >
> > > On Sat, Apr 12, 2025 at 01:49:53AM +0000, softworkz . wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf
> Of
> > > > > Michael Niedermayer
> > > > > > Sent: Saturday, 12 April 2025 00:27
> > > > > To: FFmpeg development discussions and patches <ffmpeg-
> > > devel at ffmpeg.org>
> > > > > Subject: [FFmpeg-devel] [PATCH 2/2] avformat/id3v2: Check that
> > > > > decode_str() did advance
> > > > >
> > > > > Fixes infinite loop with unknown encodings
> > > > >
> > > > > > We could alternatively error out from decode_str() or consume all of
> > > > > > taglen
> > > > > > this would affect other callers though.
> > > > >
> > > > > > Fixes: 409819224/clusterfuzz-testcase-minimized-ffmpeg_dem_H261_fuzzer-6003527535362048
> > > > > Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > > > > ---
> > > > >  libavformat/id3v2.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/libavformat/id3v2.c b/libavformat/id3v2.c
> > > > > index 90314583a74..e3f7f9e2a90 100644
> > > > > --- a/libavformat/id3v2.c
> > > > > +++ b/libavformat/id3v2.c
> > > > > @@ -341,10 +341,13 @@ static void read_ttag(AVFormatContext
> *s,
> > > > > AVIOContext *pb, int taglen,
> > > > >      taglen--; /* account for encoding type byte */
> > > > >
> > > > >      while (taglen > 1) {
> > > > > +        int current_taglen = taglen;
> > > > >          if (decode_str(s, pb, encoding, &dst, &taglen) < 0) {
> > > > > >              av_log(s, AV_LOG_ERROR, "Error reading frame %s, skipped\n", key);
> > > > >              return;
> > > > >          }
> > > > > +        if (current_taglen == taglen)
> > > > > +            return;
> > > > >
> > > > >          count++;
> > > > >
> > > > > --
> > > > > 2.49.0
> > > > >
> > > > > _______________________________________________
> > > >
> > > > Hi Michael,
> > > >
> > > > > this kind of conflicts with this patch that I had submitted recently:
> > > >
> > > > > https://patchwork.ffmpeg.org/project/ffmpeg/patch/pull.54.ffstaging.FFmpeg.1740873449247.ffmpegagent at gmail.com/
> > > >
> > > >
> > > > > I wonder whether my patch would still be prone to the issue your patch
> > > > > is addressing -
> > >
> > > This already conflicts with rcombs patch in git master, i think
> > > Applying: Fixes Trac ticket https://trac.ffmpeg.org/ticket/6949
> > > Using index info to reconstruct a base tree...
> > > M	libavformat/id3v2.c
> > > Falling back to patching base and 3-way merge...
> > > Auto-merging libavformat/id3v2.c
> > > CONFLICT (content): Merge conflict in libavformat/id3v2.c
> > > error: Failed to merge in the changes.
> > > Patch failed at 0001 Fixes Trac ticket
> > > https://trac.ffmpeg.org/ticket/6949
> > >
> > >
> > > > do you have a test file perhaps?
> > >
> > > > Will email you one, but the loop with a function that doesn't advance
> > > > is an issue even if the specific file doesn't trigger it in a different
> > > > implementation
> > >
> > > > also probably a good idea if you contact rcombs as you seemed to work on
> > > > the same code
> > > >
> > > > I was looking at the ticket and saw a link to rcombs patch, looked at
> > > > the patch and applied it. I did not realize there were 2 patches
> >
> >
> > Hi Michael,
> >
> > I know the rcombs patch, but it has a - let's say - different behavior.
> > Let's look at an example where artist and genre have multiple values:
> >
> >
> > This was ffmpeg output unpatched:
> >
> >   Metadata:
> >     title           : Infinite (Original Mix)
> >     artist          : B-Front
> >     track           : 1
> >     album           : Infinite
> >     date            : 2017
> >     genre           : Hardstyle
> >     TBPM            : 150
> >     compilation     : 0
> >     album_artist    : B-Front
> >     publisher       : Roughstate
> >
> >
> > This is what the rcombs patch does:
> >
> >   Metadata:
> >     title           : Infinite (Original Mix)
> >     artist          : B-Front
> >     artist          : Second Artist Example
> >     track           : 1
> >     album           : Infinite
> >     date            : 2017
> >     genre           : Hardstyle
> >     genre           : Test
> >     genre           : Example
> >     genre           : Hard Dance
> >     TBPM            : 150
> >     compilation     : 0
> >     album_artist    : B-Front
> >     publisher       : Roughstate
> >
> >
> >
> > My patch does that:
> >
> >   Metadata:
> >     title           : Infinite (Original Mix)
> >     artist          : B-Front;Second Artist Example
> >     track           : 1
> >     album           : Infinite
> >     date            : 2017
> >     genre           : Hardstyle;Test;Example;Hard Dance
> >     TBPM            : 150
> >     compilation     : 0
> >     album_artist    : B-Front
> >     publisher       : Roughstate
> 
> I am perfectly fine with either way
> but I have to point out that the 2nd method has some problems too
> 
> for example checking if foo is an author becomes more difficult
> there's also a question about scalability if there are many entries
> 
> And what exactly do you do if an Artist or Title itself contains a ;

I think we need to take a look at the context and real-world use of 
music tagging via ID3 tags by various applications.

The representation of multi-values via null-separated strings is something
that was only added in version 2.4 of the ID3 spec. It does not
exist in version 2.3 and earlier.
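
To illustrate what that means for reading, here is a minimal sketch in
plain C of how such a null-separated v2.4 value could be folded into a
single separated string. This is not the code from my patch and not FFmpeg
API: join_multi_values() is a hypothetical helper, the separator is passed
in by the caller, and a single-byte terminator (ISO-8859-1 or UTF-8) is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not FFmpeg API): join the NUL-separated values in
 * src[0..src_len) into dst, placing sep between consecutive values.
 * Empty values are skipped.
 * Returns the number of values copied, or -1 if dst is too small.
 */
static int join_multi_values(const char *src, size_t src_len,
                             const char *sep, char *dst, size_t dst_size)
{
    size_t pos = 0, out = 0, sep_len;
    int count = 0;

    assert(src && sep && dst && dst_size > 0);
    sep_len = strlen(sep);
    dst[0] = '\0';

    while (pos < src_len) {
        /* Length of the current value, bounded by the end of the buffer. */
        const char *end = memchr(src + pos, '\0', src_len - pos);
        size_t len = end ? (size_t)(end - (src + pos)) : src_len - pos;

        if (len > 0) {
            size_t need = len + (count ? sep_len : 0);

            /* Keep one byte free for the terminating NUL. */
            if (need >= dst_size - out)
                return -1;
            if (count) {
                memcpy(dst + out, sep, sep_len);
                out += sep_len;
            }
            memcpy(dst + out, src + pos, len);
            out += len;
            dst[out] = '\0';
            count++;
        }
        /* Always step past the value and its terminator, so the loop
         * makes progress even when a value is empty. */
        pos += len + 1;
    }
    return count;
}
```

Called with the four genre values from the example and "; " as separator,
this yields one string with all four values, in the same form that the
tagging tools discussed in this mail display.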

So, what did all the tools and players do before to handle multi-valued
metadata?

Two common practices exist: 

1. Slash Separation

genre           : Hardstyle / Test / Example / Hard Dance


2. Semicolon Separation


genre           : Hardstyle; Test; Example; Hard Dance


When we look at one of the most popular music tagging applications
(MusicBrainz Picard), we see the following behaviors:

- When opening a file with v2.4 multi-value tags, the application UI shows
  them like: Hardstyle; Test; Example; Hard Dance

- When saving with v2.3 tags (and semicolon separation configured),
  it saves them as a single string: Hardstyle; Test; Example; Hard Dance

When we run FFprobe (without any patches!) on that v2.3 file, we see:

  Metadata:
    title           : Infinite (Original Mix)
    artist          : B-Front; Second Artist Example
    track           : 1
    album           : Infinite
    date            : 2017
    genre           : Hardstyle; Test; Example; Hard Dance


That means in turn:

- Semicolon-separated multi-values are not my personal "invention"

- This is a long-established standard for the representation
  of multi-valued metadata entries

- Semicolon separation allows version-independent round-tripping
  of multi-valued (music) metadata inside FFmpeg

A further improvement would be on the side of ID3 tag writing, to
convert semicolon separation to null-separation.
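
As a rough idea of what that write-side conversion could look like, here is
a minimal sketch in plain C. Again this is only an illustration under
assumptions: split_to_nul_separated() is a hypothetical helper, not
existing FFmpeg code, it assumes a single-byte terminator (ISO-8859-1 or
UTF-8), and it trims spaces around each value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not FFmpeg API): copy the semicolon-separated
 * values of the C string src into dst, separated by single NUL bytes.
 * Spaces around each value are trimmed and empty values are skipped.
 * Returns the number of bytes written to dst (no trailing terminator is
 * added), or -1 if dst is too small.
 */
static int split_to_nul_separated(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    int first = 1;

    assert(src && dst);

    while (*src) {
        const char *end = strchr(src, ';');
        const char *next;
        size_t len;

        if (end) {
            next = end + 1;           /* continue after the semicolon */
        } else {
            end  = src + strlen(src); /* last value: runs to end of string */
            next = end;
        }

        /* Trim spaces on both sides of the value. */
        while (src < end && *src == ' ')
            src++;
        while (end > src && end[-1] == ' ')
            end--;
        len = (size_t)(end - src);

        if (len > 0) {
            if (out + len + (first ? 0 : 1) > dst_size)
                return -1;
            if (!first)
                dst[out++] = '\0';    /* NUL between consecutive values */
            memcpy(dst + out, src, len);
            out += len;
            first = 0;
        }
        src = next;
    }
    return (int)out;
}
```

Note that this sketch splits on every semicolon, so a value that itself
contains a ";" would be split as well. It does not attempt to answer that
question.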


But even without that, it is the best possible way for us to go,
because:

- It will retrieve and show all values, no longer just the first

- Additional values don't get lost when transcoding 
  (as with the rcombs patch)

- The representation of multi-values, both internally and when
  outputting as probe data, is a de-facto standard

- FFprobe output remains valid


Best regards
sw

