[FFmpeg-devel] [RFC] Built-in documentation API
Jim DeLaHunt
list+ffmpeg-dev at jdlh.com
Mon Aug 24 06:38:42 EEST 2020
On 2020-08-23 08:21, Nicolas George wrote:
> Since the idea of documentation built in the libraries seems popular, I
> have tried to outline an API to access it.…
>
> See the attached file […`documentation.c` omitted…].
>
> The idea would be to have the build system convert the documentation
> into a C file with initialization for one or several AVDocNode
> structures.
>
> Note that since all this must be in .rodata, we must get it right on the
> first try, because of inter-libraries compatibility issues.
> …The most important question IMHO is which format we adopt for the doc in
> the library.…
Text is superficially simple, but in a multicultural world, text is in
reality very complex.
All text strings should have a character encoding defined. I suggest
that all the text fields be specified by the format as UTF-8 encoded. No
need to offer other options.
All human-readable strings should have their human language described.
Either define in the format that the string is written in the English
language (and decide if you want to require US or UK spelling), or add
language attributes to each text string identifying the human language
in which it is written (suggest using BCP 47[1] tags), or add a single
language attribute for the whole AVDocNode and require that all text
strings in that node be written in the same human language.
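
To make the second and third options concrete, here is a minimal sketch in C. It is not the AVDocNode from the attached file; the names `DocText`, `DocNodeSketch` and `doc_text_lang` are hypothetical, invented only for illustration. Each string may carry its own BCP 47 tag, and falls back to a single tag on the node when it does not.

```c
#include <stddef.h>

/* Hypothetical types, for illustration only; not the proposed AVDocNode. */
typedef struct DocText {
    const char *lang;  /* BCP 47 tag such as "en-US"; NULL means "inherit" */
    const char *utf8;  /* NUL-terminated text, assumed to be UTF-8 */
} DocText;

typedef struct DocNodeSketch {
    const char *default_lang;  /* BCP 47 tag for the whole node, may be NULL */
    DocText name;
    DocText summary;
} DocNodeSketch;

/* Return the effective language tag of one string inside a node.
 * "und" is the BCP 47 tag for an undetermined language. */
static const char *doc_text_lang(const DocNodeSketch *node, const DocText *text)
{
    if (text->lang)
        return text->lang;
    return node->default_lang ? node->default_lang : "und";
}
```

Since both fields are plain pointers to constant strings, a structure like this could still live in .rodata.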
Assuming UTF-8 encoding, is `char *` the right data type? Does your
profile of the C language offer something more precisely targeted?
Something analogous to `std::string` of C++, perhaps?
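
Plain C has no string type which guarantees an encoding, so if `char *` is kept, the UTF-8 guarantee has to be enforced somewhere, for instance by the document compiler. As a rough sketch of what such a check might look like (the function name `doc_utf8_is_valid` is my own invention, not an existing FFmpeg function):

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if s is well-formed UTF-8, else 0.
 * Rejects overlong forms, surrogates, and values above U+10FFFF. */
static int doc_utf8_is_valid(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned cp;
        int extra;

        if (*p < 0x80) {                 /* ASCII byte */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) { /* 2-byte sequence */
            cp = *p & 0x1F; extra = 1;
        } else if ((*p & 0xF0) == 0xE0) { /* 3-byte sequence */
            cp = *p & 0x0F; extra = 2;
        } else if ((*p & 0xF8) == 0xF0) { /* 4-byte sequence */
            cp = *p & 0x07; extra = 3;
        } else {
            return 0;                    /* stray continuation or invalid lead */
        }
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)     /* also catches a premature NUL */
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;                    /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                    /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return 0;                    /* beyond the Unicode range */
    }
    return 1;
}
```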
Does this format allow documentation in multiple languages at the same
time? Might you ever want to ship an FFmpeg binary which has
documentation in, say, both English and Chinese?
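
If the answer is yes, the API needs some way to choose among language variants. One possible shape, purely as a sketch: store one entry per language, and look up the requested BCP 47 tag by dropping trailing subtags until something matches. This is a simplified form of the "lookup" scheme in RFC 4647; the names `DocVariant` and `doc_pick_variant`, and the rule of falling back to the first entry, are assumptions of mine.

```c
#include <stddef.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical: one translation of a piece of documentation. */
typedef struct DocVariant {
    const char *lang;  /* BCP 47 tag, e.g. "en" or "zh-Hans" */
    const char *utf8;  /* the text in that language */
} DocVariant;

/* Case-insensitive test: do the first n bytes of a equal the whole tag b? */
static int tag_equals(const char *a, size_t n, const char *b)
{
    if (strlen(b) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Pick the variant for the wanted tag, truncating the tag one subtag at a
 * time ("zh-Hans-CN", then "zh-Hans", then "zh"). If nothing matches,
 * fall back to the first variant; return NULL only if there are none. */
static const char *doc_pick_variant(const DocVariant *v, size_t count,
                                    const char *wanted)
{
    size_t n = strlen(wanted);

    while (n > 0) {
        for (size_t i = 0; i < count; i++)
            if (tag_equals(wanted, n, v[i].lang))
                return v[i].utf8;
        while (n > 0 && wanted[n - 1] != '-')  /* drop the last subtag */
            n--;
        if (n > 0)
            n--;                               /* drop the '-' as well */
    }
    return count ? v[0].utf8 : NULL;
}
```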
Consider if you want to limit some text fields to a subset of UTF-8. For
instance, are the strings in the "Name" field limited to the ASCII
subset of UTF-8? Are emoji permitted?
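
If "Name" were restricted to ASCII, the restriction could be checked mechanically. A minimal sketch follows, assuming (my assumption, not something the proposal states) that a name must be non-empty and consist only of printable ASCII without spaces; `doc_name_is_ascii` is a hypothetical helper.

```c
#include <stddef.h>

/* Hypothetical check: 1 if s is non-empty and every byte is printable
 * ASCII other than space (0x21..0x7E), else 0. Any byte of a multi-byte
 * UTF-8 sequence is >= 0x80, so accented letters and emoji are rejected. */
static int doc_name_is_ascii(const char *s)
{
    if (!s || !*s)
        return 0;
    for (const unsigned char *p = (const unsigned char *)s; *p; p++)
        if (*p < 0x21 || *p > 0x7E)
            return 0;
    return 1;
}
```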
What is the line wrapping model of these text objects? Are line endings
encoded with '\n' or '\r' or '\r\n' or any? What effect does '\t' have?
What about formfeed, or page eject?
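
One way to answer the line-ending question is to have the format allow only '\n', and make the document compiler normalize everything else at build time. A small sketch of such a normalization step, with a hypothetical name; it works on a writable buffer, so it would run before the text is placed in .rodata, not afterwards:

```c
#include <stddef.h>

/* Hypothetical build-time helper: rewrite "\r\n" and lone "\r" as "\n",
 * in place. Returns the new length of the string. */
static size_t doc_normalize_newlines(char *buf)
{
    size_t r = 0, w = 0;  /* read and write positions */

    while (buf[r]) {
        if (buf[r] == '\r') {
            buf[w++] = '\n';
            r++;
            if (buf[r] == '\n')  /* "\r\n" counts as one line ending */
                r++;
        } else {
            buf[w++] = buf[r++];
        }
    }
    buf[w] = '\0';
    return w;
}
```

Tab and formfeed are left untouched here; what they should mean is still a question for the format definition.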
Does this architecture permit markup which defines tables? How does it
display such markup?
This structure only stores marked-up text. Does that mean it is
impossible to store diagrams and pictures in the documentation? Are you
comfortable giving up that expressive power?
Will the overall documentation system be limited to the expressive power
of this mechanism? If not, then when you define the document compiler
which generates this format, you will need to define what gets done with
parts of the documentation which this architecture cannot support. Are
they thrown out? Simplified somehow?
Does this structure permit markup with font choices? If the markup
calls for heading style, or italic, or preformatted style, how will the
display system invoke the correct fonts?
Font choices are also part of correctly displaying character style for
the language. The Unicode standard encodes Traditional Chinese,
Simplified Chinese, Japanese, and parts of Korean and Vietnamese with
unified Han codepoints. The text display uses a font choice to get the
correct character style for the language. Do you want to permit
documentation to appear in these languages with the correct character
style? How will that happen?
How will this API display text? Will it emit plain text with no
markup? Will it emit the internal markup language used by this data
structure (e.g. "FFMTHML") and not attempt to format it?
One risk of this architecture is that you are faced with a choice among
three kinds of mechanism: one which is well-defined but limited (e.g. to
English and ASCII), one which is well-defined but terribly complex to
define and to implement, or one which is simple to design and implement
but poorly defined outside of a core usage pattern. What is the value you are trying to
unlock with this architecture? How will you ensure this architecture
gives a positive return (value) on investment (design and implementation
and content authoring)?
[1] https://tools.ietf.org/html/bcp47