[FFmpeg-devel] [PATCH 11/23] lavc/ass_split: fix parsing utf8 scripts
John Stebbins
jstebbins at jetheaddev.com
Mon Apr 6 21:27:08 EEST 2020
On Mon, 2020-04-06 at 20:08 +0200, Nicolas George wrote:
> John Stebbins (12020-04-06):
> > The [Script Info] section was skipped if starts with UTF8 BOM
> > ---
> > libavcodec/ass_split.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/libavcodec/ass_split.c b/libavcodec/ass_split.c
> > index 67da7c6d84..94c32667af 100644
> > --- a/libavcodec/ass_split.c
> > +++ b/libavcodec/ass_split.c
> > @@ -354,6 +354,9 @@ static int ass_split(ASSSplitContext *ctx,
> > const char *buf)
> > if (ctx->current_section >= 0)
> > buf = ass_split_section(ctx, buf);
> >
> > + if(!memcmp(buf, "\xef\xbb\xbf", 3)) { // Skip UTF-8 BOM header
> > + buf += 3;
> > + }
>
> This doesn't look correct: BOM should be skipped only at the very
> beginning of the file. And the braces could be skipped.
>
> > while (buf && *buf) {
> > if (sscanf(buf, "[%15[0-9A-Za-z+ ]]%c", section, &c) == 2)
> > {
> > buf += strcspn(buf, "\n");
>
>
Oh, whoops, I missed that ass_split gets called for a number of things.
This belongs at the beginning of ff_ass_split() I believe?
In the sample I ran into this with, there's a BOM at the beginning of
the mkv private data for the track.
I'll remove the braces...
More information about the ffmpeg-devel
mailing list