[FFmpeg-devel] [PATCH v2] Add SRV3 decoder/demuxer
Peter Ross
pross at xvid.org
Sat Dec 28 04:34:47 EET 2024
On Fri, Dec 27, 2024 at 12:16:04AM +0100, fishhh wrote:
> > instructions on how to obtain a srv3 sample file would be helpful too.
> > this is so we can actually test your patch.
>
> downloading the file from YouTube is probably the easiest way,
> you can achieve this by the means of yt-dlp like this:
> yt-dlp --skip-download --write-subs --sub-langs en --sub-format srv3 <URL>
> --skip-download makes it so the video itself is not downloaded
>
> if you need a video with complex subtitles for testing, this
> one exists: https://www.youtube.com/watch?v=gxWAM5mgH_o
>
> also I just realized adding `ctx->q.keep_duplicates = 1;`
> somewhere in srv3_read_header significantly improves the quality
> of some conversions because it keeps the queue from removing
> events with the same content but different styles
> this also prompted me to make EDGE_GLOW emit a smaller glow
> (`{\\bord1\\blur1}`) which looks way better
> you can test with those changes if you care about the result's quality
ok got working.
> > usually we leave the version.h modifications out of patches posted to
> > the
> > mailining list, and only make those changes at git push time. this makes
> > it
> > easier to apply your patch in the future when those files may have
> > changed.
> > also for new codec we bump avcodec/version.h.
>
> right, makes perfect sense, the documentation at
> https://ffmpeg.org/developer.html#New-codecs-or-formats-checklist
> should probably be updated to not tell you to change it yourself
>
> > this patch sets adds a demuxer and a decoder. imho, it should be split
> > into
> > two patches. the decoder patch comes first, then the demuxer.
> > ...
> > allcodecs.c sections are sorted by name.
>
> okay, I'll do those things for v3.
>
> > > +enum SRV3PenAttrs {
> > > + SRV3_PEN_ATTR_ITALIC = 1,
> > > + SRV3_PEN_ATTR_BOLD = 2,
> > > +};
> >
> > you use 0 elsewhere in the code.
> > should there be a SRV3_PEN_ATTR_NONE too?
>
> the difference between PenAttrs and the other enums is that
> PenAttrs is a bitfield because the pen can be both italic and bold
> of course.
> now that I've looked into this it does seem ffmpeg uses macros
> for defining flags so maybe I should just make these
> #define SRV3_PEN_ATTR_ITALIC (1 << 0)
> #define SRV3_PEN_ATTR_BOLD (1 << 1)
>
> > > +typedef struct SRV3Pen {
> > > + int id;
> > > +
> > > + int font_size, font_style;
> > > + int attrs;
> > > +
> > > + int edge_type, edge_color;
> > > +
> > > + int ruby_part;
> >
> > these should be enum SRV3PenAttrs, enum SRV3EdgeType, enum SRV3RubyPart,
> > instead of int. this helps static analysis. HOWEVER, as part of reading
> > in the values you assume they are int and this approach has merrit too.
>
> yeah I didn't do that specifically so I could just pass these as
> int* to the parsing functions, it would be nice if we could just do
> enum SRV3EdgeType : int {}
> but that's a C23 thing so not really acceptable.
>
> > = SRV3_EDGE_NONE
>
> > > +#define FREE_LIST(type, list, until) \
> > > +do { \
> > > + for (void *current = list; current && current != until; current
> > > = next) { \
> > > + next = ((type*)current)->next; \
> > > + av_free(current); \
> > > + } \
> > > +} while(0)
> >
> > syyle: the \'s should all line up
>
> will fix
>
> > > +
> > > +static int srv3_probe(const AVProbeData *p)
> > > +{
> > > + if (strstr(p->buf, "<timedtext format=\"3\">"))
> > > + return AVPROBE_SCORE_MAX;
> > > +
> >
> > use strstr(p->buf, "<timedtext") && strstr(p->buf, "format=\"3\"");
> >
> > this handle the situaton wehere an extra tag is inserted between the
> > two, e.g.
> > <timedtext x="y" format="3" ....>
> >
>
> okay
>
> > > +
> > > +static int srv3_parse_numeric_attr(SRV3Context *ctx, const char
> > > *parent, xmlAttrPtr attr, int *out, int min, int max)
> > > +{
> > > + return srv3_parse_numeric_value(ctx, parent, attr->name,
> > > attr->children->content, 10, out, min, max) == 0;
> > > +}
> > > +
> > > +static void srv3_parse_color_attr(SRV3Context *ctx, const char
> > > *parent, xmlAttrPtr attr, int *out)
> > > +{
> > > + srv3_parse_numeric_value(ctx, parent, attr->name,
> > > attr->children->content + (*attr->children->content == '#'), 16,
> > > out, 0, 0xFFFFFF);
> > > +}
> > > +
> >
> > for both functions, should probably return the
> > srv3_parse_numeric_value() error value
>
> well actually the only reason srv3_parse_numeric_value returns an
> error value is because attributes like "p" and "wp" have to
> process the number if it is parsed successfully to find the
> structure with the right id. I can make parse_color_attr return an error
> but it'd be ignored in all cases.
>
> > srv3_parse_numeric_value() returns an error, but you are discarding it
> > everywhere here.
>
> the error value from srv3_parse_numeric_value is discarded to allow
> parsing slightly incorrect subtitles since a single event's invalid
> opacity really shouldn't abort parsing and srv3_parse_numeric_value
> already logs an error so its return value is only necessary in the
> special cases I mentioned above.
>
> > i would probably convert these attributes ("sz", "fs", "et", etc.), the
> > pen struct offsets, and
> > min/max values into an array, and then iterate through that array
> > searching for the attribute name.
> > something like AVOption.
>
> nice idea I'll do that
>
> > > +static int srv3_clean_segment_text(char *text) {
> > > + char *out = text;
> > > + const char *start = text;
> > > +
> > > + while (1) {
> > > + const char *end = strstr(start, ZERO_WIDTH_SPACE);
> > > + size_t cnt = end ? (size_t)(end - start) :
> > > (size_t)strlen(start);
> >
> > strlen returns size_t, should be no need to cast it.
> right thanks
>
> > ditto for end-start;
> I think pointer subtraction returns ptrdiff_t which is supposed
> to be signed but it appears compilers don't complain about that
> without -Wconversion (gcc doesn't even with -Wconversion which
> I find amusing) since I don't see any mention of -Wconversion
> in ffmpeg I can remove that cast too.
>
> > > + } else if (!strcmp(attr->name, "ws")) {
> > > + // TODO: Handle window styles
> >
> > use avpriv_report_missing_feature()?
> >
>
> I can do that but it'll trigger on many subtitle files as it's
> pretty commonly used, just doesn't usually change all that much
> which is why it's unimplemented.
>
> > > + if(textlen == 0)
> > > + continue;
> >
> > if (textlen...
> I assume you mean if(!textlen) and aren't suggesting to put
> everything into an if block?
actually just need to insert a space after the 'if'
> > > +static const AVOption options[] = {
> > > + { NULL }
> > > +};
> > > +
> > > +static const AVClass srv3_demuxer_class = {
> > > + .class_name = "SRV3 demuxer",
> > > + .option = options,
> > > + .version = LIBAVUTIL_VERSION_INT,
> > > +};
> >
> > are you planning to use options?
> > if not, options and srv3_demuxer_class can be removed.
>
> when I wrote this code I did have in mind a hypothetical
> option that would enable using non-standard libass functionality
> like BorderStyle = 3 so that's a possibility but I probably won't
> implement that myself anytime soon.
> can a class be added retroactively? if yes then yeah this
> should probably be removed.
yes class & options can be added later
-- Peter
(A907 E02F A6E5 0CD2 34CD 20D2 6760 79C5 AC40 DD6B)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20241228/798822ae/attachment.sig>
More information about the ffmpeg-devel
mailing list