[FFmpeg-devel] [PATCH v4 1/4] avutil/{avstring, bprint}: add XML escaping from ffprobe to avutil

Jan Ekström jeebjp at gmail.com
Mon Feb 1 13:40:26 EET 2021


On Mon, Jan 25, 2021 at 2:47 PM Nicolas George <george at nsup.org> wrote:
>
> Jan Ekström (12021-01-25):
> > For now I kept them separate since one is just moving Stefano's code,
> > and another adds new functionality that you requested.
>
> Stefano's code was good enough for ffprobe, because it had to perform a
> very precise task in a very constrained context, but as is, it is not
> good enough for lavu, it needs to be adapted.
>

I beg to differ, since it seemed to fully implement the cases noted in
2.4. Does escape double and single quotes unnecessarily in some cases,
yes. But not invalid.

But at this point I have wasted enough time with parts of this patch
set that I do not care about. I do wish that I had said to you earlier
that if you want to rework this stuff, you could do it yourself. I
just noticed the existing code and attempted to do the Right Thing and
move it into general code. I bet if I had not done it, nobody would
have bat an eye regarding it, since that's exactly how the existing
XML output things do things right now. I hope it is understandable why
it is annoying to me that instead of the code that I actually proposed
to add, the stuff that was *already* *there*s is the stuff that gets
now picked out.

> > I really have no bigger bone with either naming, I just utilized:
> > 1. How the XML spec calls the thing ("AttValue" thus becoming
> > ATT_VALUE, "CharData" becoming CHAR_DATA). I slightly would prefer to
> > keep this due to this making it easy to point towards what is actually
> > meant in the XML spec.
>
> CharData is only used a few times in the abstract syntax of the spec, it
> will have meaning only for people who are very familiar with it.
>

It also contains a nice definition of it in 2.4

"[Definition: All text that is not markup constitutes the character
data of the document.] "

> > 2. Single and double quotes are the wording that I am more acquainted
> > with, but I have no hard opinion so if you want _QUOT/_APOS, sure.
>
> I am fine with "double_quotes" and "single_quotes" or "dquot" and
> "squot" too.
>

This is where (for now) I go through the route of least resistance and post a v5

https://github.com/jeeb/ffmpeg/commits/ttml_encoder_v5

- I changed "quoted" to "quotes".
- I kept the names matching to the spec, esp. due to the definitions
in the spec in 2.3 and 2.4, as well as kept the fact that it is
escaping for the actual value of the attribute, not just "the
attribute".
- I removed the double and single quote escaping from the CHAR_DATA
escaping, as you requested.

For the record, I have no great interest in XML. If you wish to take
this stuff further, please do so. But at this point I am feeling
rather stumped that after the initial reviews which actually were
concerned of code that I wrote, most of the comments have been
regarding things that were already there in the code base.

Best regards,
Jan


More information about the ffmpeg-devel mailing list