[FFmpeg-devel] Embedded documentation?

Mon May 8 01:10:33 EEST 2023

On date Monday 2023-05-01 12:13:09 +0200, Nicolas George wrote:
> Hi.
> 
> Three years ago, I shared some brief thoughts about embedding the
> documentation in the libraries. For example, that would allow GUI
> applications to open help dialogs about specific options.
> 
> To see what it would need, I wrote the following header. I did not work
> any further, because groundwork needs to be laid first. But now that it
> was mentioned in another thread, I think it is a good idea to show it,
> to see how people like it.
> 
> Please share your remarks. Even “+1” to say you like it, because people
> who will not like it will not hesitate to post “-1”.
> 
> Regards,
> 
> -- 
>   Nicolas George

> typedef struct AVDocNode AVDocNode;
> typedef struct AVDocLink AVDocLink;
> typedef enum AVDocNodeType AVDocNodeType;
> typedef enum AVDocLinkType AVDocLinkType;
> 
> /**
>  * Link to another documentation node.
>  */
> struct AVDocLink {
>     AVDocNode *target;
>     AVDocLinkType type;
> };
> 
> /**
>  * Node in the documentation system.
>  *
>  * A node can be the description of a codec, format, filter, option, type,
>  * etc.
>  */
> struct AVDocNode {
> 
>     /**
>      * Names of the component.
>      *
>      * It is a concatenation of 0-terminated strings, terminated by an empty
>      * string (i.e. a double 0).

>      * For example "frame_rate, rate, r" would be "frame_rate\0rate\0r\0\0".
>      * The first name is the main name of the component, the others are
>      * aliases.

I'd prefer to use something easily parsable and still more
human-readable, such as "frame_rate,rate,r"
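
To make the trade-off concrete, here is a minimal sketch of what a caller
would write for each encoding. The helper names (doc_names_count,
doc_names_get, doc_names_csv_has) are made up for illustration and are not
part of the proposed header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the names in a double-NUL-terminated list such as
 * "frame_rate\0rate\0r\0\0" (the encoding proposed in the header). */
static size_t doc_names_count(const char *names)
{
    size_t n = 0;

    assert(names);
    for (const char *p = names; *p; p += strlen(p) + 1)
        n++;
    return n;
}

/* Return the idx-th name (0 is the main name), or NULL if out of range. */
static const char *doc_names_get(const char *names, size_t idx)
{
    assert(names);
    for (const char *p = names; *p; p += strlen(p) + 1)
        if (idx-- == 0)
            return p;
    return NULL;
}

/* Return 1 if name is an entry of a comma-separated list such as
 * "frame_rate,rate,r" (the alternative encoding), 0 otherwise. */
static int doc_names_csv_has(const char *csv, const char *name)
{
    size_t len = strlen(name);

    assert(csv);
    for (const char *p = csv; *p; ) {
        size_t tok = strcspn(p, ",");  /* length of the current entry */
        if (tok == len && !memcmp(p, name, len))
            return 1;
        p += tok;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

Both are a few lines to walk. The difference is that the first form yields
ready-to-use C strings, while the second needs a length or a copy for each
alias but prints as is.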

>      * If this field is used as a plain C string, it contains only the main
>      * name.
>      */
>     const char *names;
> 
>     /**
>      * Unique identifier of the component.
>      *
>      * It is composed of alphanumeric characters plus underscore and slash
>      * and written hierarchically.
>      *

>      * For example, the width option of the scale filter would be
>      * "lavfi/vf_scale/opt_width".

Maybe something like:
lavfi/scale/option:width

The name of a filter is unique, and we don't want to expose the
internals of the library (vf_, af_ etc.; these prefixes also don't make
much sense for multi/trans-media filters).
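
One thing to keep in mind: the header says the id maps onto the exported
symbol (avdoc_ prefix, "__" for each slash). A rough sketch of that mapping
follows; doc_id_to_symbol is a hypothetical helper, not something in the
proposal. It shows that an id containing ':' has no direct symbol
equivalent, so the mapping rule would need to be extended for that
character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build the symbol name for a documentation id: "avdoc_" prefix, with each
 * '/' replaced by "__", as described in the proposed header.
 * Returns 0 on success, -1 if the id contains a character outside
 * [A-Za-z0-9_/] or if buf is too small. */
static int doc_id_to_symbol(char *buf, size_t size, const char *id)
{
    static const char prefix[] = "avdoc_";
    size_t pos = sizeof(prefix) - 1;

    assert(buf && id);
    if (size <= pos)
        return -1;
    memcpy(buf, prefix, pos);
    for (; *id; id++) {
        if (*id == '/') {
            if (pos + 2 >= size)   /* two characters plus the final NUL */
                return -1;
            buf[pos++] = '_';
            buf[pos++] = '_';
        } else if (isalnum((unsigned char)*id) || *id == '_') {
            if (pos + 1 >= size)
                return -1;
            buf[pos++] = *id;
        } else {
            return -1;             /* e.g. ':' is not valid in a C symbol */
        }
    }
    buf[pos] = '\0';
    return 0;
}
```

With this sketch, "lavfi/vf_scale/opt_width" gives
"avdoc_lavfi__vf_scale__opt_width", while "lavfi/scale/option:width" is
rejected.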

>      *
>      * This identifier can be used for links in the text.
>      *
>      * It matches the symbol that makes the documentation available, in the
>      * avdoc_ namespace with double underscore standing for slashes:
>      * extern const AVDocNode avdoc_lavfi__vf_scale__opt_width;
>      */
>     const char *id;
> 
>     /**
>      * Title / short description, possibly NULL.
>      *
>      * Can be used in a table of contents for example.
>      */
>     const char *title;
> 

>     /**
>      * Text of the documentation in XXX Markdown / FFHTML.
>      *
>      * Apparently we want to write the documentation in Markdown or similar,
>      * but the build system can convert when creating the data structure to
>      * embed in the library.
>      *
>      * On one hand, Markdown can be dumped as is to the user, in a terminal
>      * or a basic dialog box.
>      *
>      * On the other hand, strict minimalist HTML is more program-friendly,
>      * which makes it more convenient for programs that want to display it
>      * with actual italics.
>      *
>      * I think FFHTML (i.e. a small, strict and clearly documented subset of
>      * HTML) would be better.
>      */
>     const char *text;

I think the problem with HTML is that you then need to parse it if you
want to display it, so I would rather go with Markdown:
1. it provides readable raw output
2. there are plenty of libraries which can render it to various
formats (including HTML)

> 
>     /**
>      * Object that the documentation describes.
>      *
>      * If not NULL, points to an object starting with an AVClass pointer.
>      */
>     void *object;
> 
>     /**
>      * Links towards other nodes.
>      *
>      * All nodes linked in the text must have an entry here, but implicit
>      * links are possible too, for example the type of an option.
>      *
>      * The links are ordered by type.
>      */
>     const AVDocLink *links;
> 

>     /**
>      * Type of the node, and of the object documented.
>      */
>     AVDocNodeType type;

We already define AVClassCategory (AV_CLASS_CATEGORY_*, in
libavutil/log.h); could it be adapted for this purpose?

> };
> 
> /**
>  * Type of a documentation node.
>  */
> enum AVDocNodeType {
>     AVDOC_TYPE_GENERIC = 0,
>     AVDOC_TYPE_MUXER,
>     AVDOC_TYPE_DEMUXER,
>     AVDOC_TYPE_ENCODER,
>     AVDOC_TYPE_DECODER,
>     AVDOC_TYPE_FILTER,
>     AVDOC_TYPE_BITSTREAM_FILTER,
>     AVDOC_TYPE_SWSCALER,
>     AVDOC_TYPE_SWRESAMPLER,
>     AVDOC_TYPE_DEVICE_VIDEO_OUTPUT,
>     AVDOC_TYPE_DEVICE_VIDEO_INPUT,
>     AVDOC_TYPE_DEVICE_AUDIO_OUTPUT,
>     AVDOC_TYPE_DEVICE_AUDIO_INPUT,
>     AVDOC_TYPE_DEVICE_OUTPUT,
>     AVDOC_TYPE_DEVICE_INPUT,
>     AVDOC_TYPE_OPTION,
>     AVDOC_TYPE_TYPE,
>     AVDOC_TYPE_SYNTAX,
>     AVDOC_TYPE_EXAMPLES,
>     AVDOC_TYPE_EXPLANATIONS,
> };
> 
> /**
>  * Type of a link, i.e. relation between the source and the target of the
>  * link.
>  *
>  * More important links have a lower value.
>  */
> enum AVDocLinkType {
> 
>     /**
>      * The linked node is the parent.
>      *
>      * For example, the parent of the node for a private option is the node
>      * for the corresponding codec/format/filter.
>      */
>     AVDOC_LINK_PARENT = 0x100,
> 
>     /**
>      * The linked node is a subpart, section, chapter, etc.
>      */
>     AVDOC_LINK_SUBPART = 0x200,
> 

>     /**
>      * The linked node describes an option or an option constant.
>      */
>     AVDOC_LINK_OPTION = 0x300,

>     /**
>      * Threshold value for the self-contained minimal documentation of an
>      * object.
>      */

I cannot parse this: where is the threshold value defined?
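
If I read the @param threshold description of
av_documentation_get_excerpt() correctly, the enum value itself is the
threshold: a link is followed when its type is numerically below it. Here
is a minimal sketch of that reading, with reduced local copies of the
declarations. The two helpers and the NULL-target termination of the links
array are my assumptions, since the header does not say how the array ends.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the proposed declarations, reduced to what is needed. */
typedef enum AVDocLinkType {
    AVDOC_LINK_PARENT              = 0x100,
    AVDOC_LINK_SUBPART             = 0x200,
    AVDOC_LINK_OPTION              = 0x300,
    AVDOC_LINK_SELF_CONTAINED      = 0x400,
    AVDOC_LINK_REFERENCE           = 0x500,
    AVDOC_LINK_SELF_CONTAINED_FULL = 0x600,
    AVDOC_LINK_DETAILS             = 0x700,
} AVDocLinkType;

typedef struct AVDocNode AVDocNode;

typedef struct AVDocLink {
    const AVDocNode *target;
    AVDocLinkType type;
} AVDocLink;

struct AVDocNode {
    const char *id;
    const AVDocLink *links; /* ASSUMPTION: terminated by a NULL target */
};

/* A link is part of the excerpt if its type is strictly below the threshold. */
static int doc_link_is_followed(const AVDocLink *link, AVDocLinkType threshold)
{
    assert(link);
    return link->type < threshold;
}

/* Count the direct links of a node that would be followed. */
static size_t doc_count_followed(const AVDocNode *node, AVDocLinkType threshold)
{
    size_t n = 0;

    assert(node);
    if (!node->links)
        return 0;
    for (const AVDocLink *l = node->links; l->target; l++)
        n += doc_link_is_followed(l, threshold);
    return n;
}
```

Under that reading, AVDOC_LINK_SELF_CONTAINED pulls in parent, subpart and
option links, AVDOC_LINK_SELF_CONTAINED_FULL adds reference links, and
details links are never pulled in by either. If that is the intent, the
comment should say so explicitly.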

>     AVDOC_LINK_SELF_CONTAINED = 0x400,
> 
>     /**
>      * The linked node is the reference for a type, syntax, etc.
>      */
>     AVDOC_LINK_REFERENCE = 0x500,
> 
>     /**
>      * Threshold value for the self-contained complete documentation of an
>      * object, including reference for types and syntaxes.
>      */
>     AVDOC_LINK_SELF_CONTAINED_FULL = 0x600,
> 

>     /**
>      * The linked node contains details and explanations.
>      */
>     AVDOC_LINK_DETAILS = 0x700,

Maybe an example would clarify this, since it is ambiguous what
details and explanations are.

> };
> 
> typedef struct AVDocExcerpt AVDocExcerpt;
> typedef struct AVDocTocNode AVDocTocNode;
> 

> /**
>  * Excerpt of the documentation, structured and with links.
>  *
>  * Always returned by the library.
>  *
>  * This structure is the root of a tree of AVDocTocNode;
>  * siblings in the tree are structured as a linked list.
>  */
> struct AVDocExcerpt {
> 
>     /**
>      * First node of the excerpt
>      */
>     AVDocTocNode *begin;
> 
> };

Isn't this redundant? Can't you just use AVDocTocNode?

> /**
>  * Node in an excerpt of the documentation.
>  *
>  * This contains the node and its relation with other nodes.
>  */
> struct AVDocTocNode {

I dislike this name, as it is not descriptive at all (I assume Toc
stands for Table Of Contents). Maybe AVDocNodeContext?

> 
>     AVDocNode *node;
> 
>     AVDocTocNode *next;
> 
>     AVDocTocNode *first_child;
> 
> };
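
For what it is worth, walking the tree only needs a pointer to the first
AVDocTocNode, which is why the wrapper looks redundant to me. A minimal
sketch, with reduced local copies of the structures; doc_toc_walk and the
two callbacks are hypothetical names, not part of the proposal.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced local copies of the proposed structures. */
typedef struct AVDocNode {
    const char *id;
} AVDocNode;

typedef struct AVDocTocNode AVDocTocNode;
struct AVDocTocNode {
    const AVDocNode *node;
    AVDocTocNode *next;        /* next sibling */
    AVDocTocNode *first_child; /* first entry one level below */
};

typedef void (*doc_toc_cb)(const AVDocTocNode *toc, int depth, void *opaque);

/* Visit a sibling list depth-first, parents before their children. */
static void doc_toc_walk(const AVDocTocNode *first, int depth,
                         doc_toc_cb cb, void *opaque)
{
    assert(cb);
    for (const AVDocTocNode *t = first; t; t = t->next) {
        cb(t, depth, opaque);
        doc_toc_walk(t->first_child, depth + 1, cb, opaque);
    }
}

/* Callback: count the visited entries; opaque is an int *. */
static void doc_toc_count_cb(const AVDocTocNode *toc, int depth, void *opaque)
{
    (void)toc;
    (void)depth;
    ++*(int *)opaque;
}

/* Callback: print an indented table of contents; opaque is a FILE *. */
static void doc_toc_print_cb(const AVDocTocNode *toc, int depth, void *opaque)
{
    fprintf(opaque, "%*s%s\n", depth * 2, "", toc->node->id);
}
```

A caller would pass excerpt->begin (or directly the first node) together
with one of the callbacks.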

> /**
>  * Get the documentation node associated with an object, if any.
>  *
>  * obj must point to an object starting with an AVClass pointer.
>  */
> const AVDocNode *av_documentation_get_node(void *obj);
> 
> /**
>  * Get an excerpt of the documentation around a node.
>  *
>  * @param excerpt    used to return the excerpt
>  * @param nodes      nodes to document;
>  *                   if one is NULL, it is skipped;
>  *                   if all are NULL, a dummy documentation is returned
>  * @param nb_nodes   number of nodes
>  * @param threshold  limit of the excerpt, it will contain all nodes pointed
>  *                   by links that are below the threshold, recursively;
>  *                   AVDOC_LINK_SELF_CONTAINED and
>  *                   AVDOC_LINK_SELF_CONTAINED_FULL are useful values.
>  * @return  >= 0 for success or an AVERROR code, possibly AVERROR(ENOMEM)
>  */
> int av_documentation_get_excerpt(AVDocExcerpt **excerpt,
>                                  const AVDocNode **nodes, unsigned nb_nodes,
>                                  AVDocLinkType threshold);
> 
> typedef enum AVDocFormat {
>     AV_DOC_FORMAT_HTML,
>     AV_DOC_FORMAT_MARKDOWN,
> } AVDocFormat;
> 
> #define AV_DOC_FORMAT_FLAG_TOC 0x0001
> 
> /**
>  * Serialize a documentation excerpt.
>  */
> void av_documentation_write(AVWriter wr, AVDocExcerpt **excerpt,
>                             AVDocFormat format, unsigned flags);

Can you share more details about the plan? In particular, is the doc
going to be embedded in the code itself (e.g. in the C
implementation)? Or should we have some dedicated headers containing
the docs?

We should also avoid duplicating the same information between docs
and code, so there should be some way to autogenerate the docs from
the corresponding entries in the code.
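
As a very rough sketch of what I mean by autogeneration: the option tables
already carry a name and a help string, so the per-option text could be
produced from them. DocOption below is a hypothetical, reduced stand-in
(a real version would read the name and help fields of AVOption), and
doc_options_to_markdown is likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced stand-in for an option table entry. */
typedef struct DocOption {
    const char *name;
    const char *help;
} DocOption;

/* Format an option table (terminated by a NULL name) as a Markdown list.
 * Returns the number of bytes written, excluding the final NUL, or -1 on
 * output error or if buf is too small. */
static int doc_options_to_markdown(char *buf, size_t size,
                                   const DocOption *opts)
{
    size_t pos = 0;

    assert(buf && size > 0 && opts);
    buf[0] = '\0';
    for (; opts->name; opts++) {
        int ret = snprintf(buf + pos, size - pos, "- `%s`: %s\n",
                           opts->name, opts->help ? opts->help : "");
        if (ret < 0 || (size_t)ret >= size - pos)
            return -1;
        pos += (size_t)ret;
    }
    return (int)pos;
}
```

Only the longer explanations and examples would then have to be written by
hand.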
