[FFmpeg-devel] [PATCH v4 1/4] doc: Explain what "context" means
Andrew Sayers
ffmpeg-devel at pileofstuff.org
Wed May 22 19:07:51 EEST 2024
On Wed, May 22, 2024 at 11:31:52AM +0200, Stefano Sabatini wrote:
> Sorry for the slow reply.
Welcome back :)
I've gathered some critiques of my own over the past week, which I'll pepper
throughout the reply. Starting with...
The document assumes (or is at least designed to be secure against) readers
starting at the top and reading through to the bottom. I found doxygen's
@tableofcontents command while writing this e-mail, which I will definitely
use in the next version, and which might provoke a rewrite aimed at people
jumping around the document looking for answers to specific questions.
>
> On date Wednesday 2024-05-15 16:54:19 +0100, Andrew Sayers wrote:
> > Derived from detailed explanations kindly provided by Stefano Sabatini:
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> > ---
> > doc/context.md | 394 +++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 394 insertions(+)
> > create mode 100644 doc/context.md
> >
> > diff --git a/doc/context.md b/doc/context.md
> > new file mode 100644
> > index 0000000000..fb85b3f366
> > --- /dev/null
> > +++ b/doc/context.md
> > @@ -0,0 +1,394 @@
> > +# Introduction to contexts
> > +
> > +“%Context”
>
> Is this style of quoting needed? In particular, I'd avoid special markup
> to simplify reading the unrendered text (which is the point of markdown
> after all).
Short answer: I'll change it in the next patch and see what happens.
Long answer: HTML quotes are ugly for everyone, UTF-8 is great until someone
turns up complaining we broke their Latin-1 workflow. I've always preferred
ASCII-only representations for that reason, but happy to try the other way
and see if anyone still cares.
>
> > is a name for a widely-used programming idiom.
>
> > +This document explains the general idiom and the conventions FFmpeg has built around it.
> > +
> > +This document uses object-oriented analogies to help readers familiar with
> > +[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
> > +learn about contexts. But contexts can also be used outside of OOP,
> > +and even in situations where OOP isn't helpful. So these analogies
> > +should only be used as a first step towards understanding contexts.
> > +
> > +## “Context” as a way to think about code
> > +
> > +A context is any data structure that is passed to several functions
> > +(or several instances of the same function) that all operate on the same entity.
> > +For example, [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
> > +languages usually provide member functions with a `this` or `self` value:
> > +
>
> > +```c
> > +class my_cxx_class {
> > + void my_member_function() {
> > + // the implicit object parameter provides context for the member function:
> > + std::cout << this;
> > + }
> > +};
> > +```
>
> I'm not convinced this is really useful: if you know C++ this is
> redundant, if you don't this is confusing and doesn't add much information.
The example is there to break up a wall of text (syntax-highlighted in the
rendered output), and to let the reader know that this is going to be one of
those documents that alternates between text and code, so they're ready for the
more substantive examples later on. I take the point about C++ though -
would this Python example be more readable?
    class MyClass:
        def my_func(self):
            # If a Python function is part of a class,
            # its first parameter must be an instance of that class
            pass
>
> > +
> > +Contexts are a fundamental building block of OOP, but can also be used in procedural code.
>
> I'd drop this line, and drop the anchor on OOP at the same time since
> it's not adding much information.
Fundamentally, this document addresses two audiences:
1. people coming from a non-OOP background, who want to learn contexts
from first principles, and at best see OOP stuff as background information
2. people coming from an OOP background. There's no polite way to say this -
their incentive is to write FFmpeg off as a failed attempt at OOP, so they
don't have to learn a new way of working that's just different enough to
make them feel dumb
I think a good way to evaluate the document might be to read it through twice,
stopping after each paragraph to ask two unfair questions...
1. what has this told me about FFmpeg itself, as opposed to some other thing
you wish I cared about?
2. couldn't you have just done this the standard OOP way?
The earlier paragraph acknowledged that contexts resemble OOP (telling the OOP
audience we get it), then this paragraph adds "but they're not the same"
(telling the OOP audience we disagree). To be more useful to non-OOP folk,
how about:
    Contexts can be a fundamental building block of OOP, but can also be used in
    procedural projects like FFmpeg.
>
> > +For example, most callback functions can be understood to use contexts:
>
> > +
> > +```c
> > +struct MyStruct {
> > + int counter;
> > +};
> > +
> > +void my_callback( void *my_var_ ) {
> > + // my_var provides context for the callback function:
> > + struct MyStruct *my_var = (struct MyStruct *)my_var_;
> > + printf("Called %d time(s)", ++my_var->counter);
> > +}
> > +
> > +void init() {
> > + struct MyStruct my_var;
> > + my_var.counter = 0;
> > + register_callback( my_callback, &my_var );
>
> style: fun(my_callback, ...) (so no spaces around parentheses) here and
> below
🫡
... no wait I just said Unicode is bad ...
I mean, will do.
>
> > +}
> > +```
> > +
> > +In the broadest sense, “context” is just a way to think about code.
> > +You can even use it to think about code written by people who have never
> > +heard the term, or who would disagree with you about what it means.
> > +
> > +## “Context” as a tool of communication
> > +
> > +“%Context” can just be a word to understand code in your own head,
> > +but it can also be a term you use to explain your interfaces.
> > +Here is a version of the callback example that makes the context explicit:
> > +
> > +```c
> > +struct CallbackContext {
> > + int counter;
> > +};
> > +
> > +void my_callback( void *ctx_ ) {
> > + // ctx provides context for the callback function:
> > + struct CallbackContext *ctx = (struct CallbackContext *)ctx_;
> > + printf("Called %d time(s)", ++ctx->counter);
> > +}
> > +
> > +void init() {
> > + struct CallbackContext ctx;
> > + ctx.counter = 0;
> > + register_callback( my_callback, &ctx );
> > +}
> > +```
> > +
> > +The difference here is subtle, but important. If a piece of code
> > +*appears compatible with contexts*, then you are *allowed to think
> > +that way*, but if a piece of code *explicitly states it uses
> > +contexts*, then you are *required to follow that approach*.
> > +
>
> > +For example, imagine someone modified `MyStruct` in the earlier example
> > +to count several unrelated events across the whole program. That would mean
> > +it contained information about multiple entities, so was not a context.
> > +But nobody ever *said* it was a context, so that isn't necessarily wrong.
> > +However, proposing the same change to the `CallbackContext` in the later example
> > +would violate a guarantee, and should be pointed out in a code review.
> > +
>
> I'm not very convinced by the callback example. The use of contexts in
> the FFmpeg API is very much simpler, it is used to keep track of
> configuration and state (that is they track the "object" where to
> operate on), so the callback example here is a bit misleading.
>
> Callbacks are used in the internals to implement different elements
> (codecs, protocols, filters, etc...) implementing a common API, but in
> this case the relation with "contexts" is less straightforward.
I go back and forth on this one, but your point made me think about it
in a new way...
AVIOContext::read_packet is a callback function, and a reader who has just
learnt about contexts would naturally assume we intend its first argument
to be interpreted as a context. Given that new readers are likely to learn
avio_alloc_context() around the same time as reading this document,
it's important we give them the tools to understand that function.
How about changing the topmost callback example to read data from a FILE*
(without mentioning AVIOContext), then emphasising how you can think of it
as a context despite not following FFmpeg's rules, then finally mentioning
how you could pass the callback to avio_alloc_context() if you wanted?
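
A minimal sketch of that idea, using only standard C so it stands alone.
`read_from_file` and `count_bytes` are hypothetical names, the callback's
shape is modelled on AVIOContext::read_packet, and the -1 return value is a
placeholder for a real end-of-file code (AVERROR_EOF in FFmpeg):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback with the same shape as AVIOContext::read_packet.
 * `opaque` provides context: it identifies which file to read from. */
static int read_from_file(void *opaque, uint8_t *buf, int buf_size)
{
    FILE *fp = opaque;
    size_t n = fread(buf, 1, (size_t)buf_size, fp);
    /* -1 is a placeholder for a real end-of-file error code */
    return n > 0 ? (int)n : -1;
}

/* Hypothetical caller: drains the callback and returns the total bytes read.
 * It never looks inside `opaque`, it only passes it back to the callback. */
static long count_bytes(int (*read_cb)(void *, uint8_t *, int), void *opaque)
{
    uint8_t buf[16];
    long total = 0;
    int n;

    while ((n = read_cb(opaque, buf, (int)sizeof(buf))) > 0)
        total += n;
    return total;
}
```

Nothing here calls the FILE* a context, so the reader is allowed (but not
required) to think of it that way. A function of this shape could also be
passed as the read_packet argument of avio_alloc_context(), with the FILE*
as the opaque pointer.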
>
> > +@warning Guaranteeing to use contexts does not mean guaranteeing to use
> > +object-oriented programming. For example, FFmpeg creates its contexts
> > +procedurally instead of with constructors.
>
> I'm afraid this is more confusing than helpful, since the FFmpeg API
> is not OOP. I'd drop this sentence.
My concern is that if an OOP reader asks "couldn't you have just done this
the standard OOP way?", they will be tempted to answer "oh, so you *used to*
fail at OOP but nowadays you promise to do it right", and not bother reading
any further. So there needs to be something eye-catching here, but yes this
paragraph needs to be more useful to non-OOP readers.
This will probably need to be rewritten based on the callback discussion,
so I'll think about ways to change this at the same time.
>
> > +
> > +## Contexts in the real world
> > +
> > +To understand how contexts are used in the real world, it might be
> > +useful to compare [curl's MD5 hash context](https://github.com/curl/curl/blob/bbeeccdea8507ff50efca70a0b33d28aef720267/lib/curl_md5.h#L48)
> > +with @ref AVMD5 "FFmpeg's equivalent context".
> > +
>
> > +The [MD5 algorithm](https://en.wikipedia.org/wiki/MD5) produces
> > +a fixed-length digest from arbitrary-length data. It does this by calculating
> > +the digest for a prefix of the data, then loading the next part and adding it
> > +to the previous digest, and so on. Projects that use MD5 generally use some
> > +kind of context, so comparing them can reveal differences between projects.
> > +
> > +```c
> > +// Curl's MD5 context looks like this:
> > +struct MD5_context {
> > + const struct MD5_params *md5_hash; /* Hash function definition */
> > + void *md5_hashctx; /* Hash function context */
> > +};
> > +
> > +// FFmpeg's MD5 context looks like this:
> > +typedef struct AVMD5 {
> > + uint64_t len;
> > + uint8_t block[64];
> > + uint32_t ABCD[4];
> > +} AVMD5;
> > +```
> > +
> > +Curl's struct name ends with `_context`, guaranteeing contexts are the correct
> > +interpretation. FFmpeg's struct does not explicitly say it's a context, but
> > +@ref libavutil/md5.c "its functions do" so we can reasonably assume
> > +it's the intended interpretation.
> > +
> > +Curl's struct uses `void *md5_hashctx` to avoid guaranteeing
> > +implementation details in the public interface, whereas FFmpeg makes
> > +everything accessible. This kind of data hiding is an advanced context-oriented
> > +convention, and is discussed below. Using it in this case has strengths and
> > +weaknesses. On one hand, it means changing the layout in a future version
> > +of curl won't break downstream programs that used that data. On the other hand,
> > +the MD5 algorithm has been stable for 30 years, so it's arguably more important
> > +to let people dig in when debugging their own code.
> > +
> > +Curl's struct is declared as `struct <type> { ... }`, whereas FFmpeg uses
> > +`typedef struct <type> { ... } <type>`. These conventions are used with both
> > +context and non-context structs, so don't say anything about contexts as such.
> > +Specifically, FFmpeg's convention is a workaround for an issue with C grammar:
> > +
> > +```c
> > +void my_function( ... ) {
> > + int my_var; // good
> > + MD5_context my_curl_ctx; // error: C needs you to explicitly say "struct"
> > + struct MD5_context my_curl_ctx; // good: added "struct"
> > + AVMD5 my_ffmpeg_ctx; // good: typedefs avoid the need for "struct"
> > +}
> > +```
> > +
> > +Both MD5 implementations are long-tested, widely-used examples of contexts
> > +in the real world. They show how contexts can solve the same problem
> > +in different ways.
>
> I'm concerned that this is adding more information than really
> needed. In particular, comparing with the internals of curl means the
> docs now need to be kept in sync with curl's API as well, meaning that
> they will be outdated very soon. I'd rather drop the curl comparison
> altogether.
The value of this section depends on the reader...
This tells a non-OOP reader that FFmpeg didn't invent contexts, and reasonable
people can disagree about what they mean. So when we get into the ambiguous
cases later on, they have a better idea about which things are just how
contexts work, and which are specifically how FFmpeg uses them.
This tells an OOP reader that it's not "OOP standard, FFmpeg non-standard",
it's that FFmpeg is using a C standard that's not used in OOP languages.
Curl's `MD5_context` was last modified in 2020 (interestingly, to get rid of
the `typedef struct` trick). It's a slow-moving target, but you're right it's
not a static one.
I'd argue the above means this should be *somewhere* in the document, but not
necessarily here in the middle. I'll see if it works better as a paragraph
here and an appendix or something at the bottom.
> > +
> > +## FFmpeg's advanced context-oriented conventions
> > +
> > +Projects that make heavy use of contexts tend to develop conventions
> > +to make them more useful. This section discusses conventions used in FFmpeg,
> > +some of which are used in other projects, others are unique to this project.
> > +
> > +### Naming: “Context” and “ctx”
> > +
> > +```c
> > +// Context struct names usually end with `Context`:
> > +struct AVSomeContext {
> > + ...
> > +};
> > +
> > +// Functions are usually named after their context,
> > +// context parameters usually come first and are often called `ctx`:
> > +void av_some_function( AVSomeContext *ctx, ... );
> > +```
> > +
> > +If an FFmpeg struct is intended for use as a context, its name usually
> > +makes that clear. Exceptions to this rule include AVMD5 (discussed above),
> > +which is only identified as a context by the functions that call it.
> > +
> > +If a function is associated with a context, its name usually
> > +begins with some variant of the context name (e.g. av_md5_alloc()
> > +or avcodec_alloc_context3()). Exceptions to this rule include
> > +@ref avformat.h "AVFormatContext's functions", many of which
> > +begin with just `av_`.
> > +
> > +If a function has a context parameter, it usually comes first and its name
> > +often contains `ctx`. Exceptions include av_bsf_alloc(), which puts the
> > +context argument second to emphasise it's an out variable.
> > +
> > +### Data hiding: private contexts
> > +
> > +```c
> > +// Context structs often hide private context:
> > +struct AVSomeContext {
> > + void *priv_data; // sometimes just called "internal"
> > +};
> > +```
> > +
> > +Contexts usually present a public interface, so changing a context's members
> > +forces everyone that uses the library to at least recompile their program,
> > +if not rewrite it to remain compatible. Hiding information in a private context
> > +ensures it can be modified without affecting downstream software.
> > +
> > +Object-oriented programmers may be tempted to compare private contexts to
> > +*private class members*. That's often accurate, but for example it can also
> > +be used like a *virtual function table* - a list of functions that are
> > +guaranteed to exist, but may be implemented differently for different
> > +sub-classes. When thinking about private contexts, remember that FFmpeg
> > +isn't *large enough* to need some common OOP techniques, even though it's
> > +solving a problem that's *complex enough* to benefit from some rarer techniques.
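
One possible sketch of the private-context pattern described above, in
plain C. Every name here (`SomeContext`, `SomePrivate`, `some_context_*`)
is hypothetical and not part of FFmpeg's API:

```c
#include <stdlib.h>

/* Public part: the layout callers may rely on. */
typedef struct SomeContext {
    int width;        /* public, user-visible member */
    void *priv_data;  /* private context, layout not promised to callers */
} SomeContext;

/* Private part: would normally live in a .c file,
 * so it can change between versions without breaking callers. */
typedef struct SomePrivate {
    int frames_seen;
} SomePrivate;

static SomeContext *some_context_alloc(void)
{
    SomeContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->priv_data = calloc(1, sizeof(SomePrivate));
    if (!ctx->priv_data) {
        free(ctx);
        return NULL;
    }
    return ctx;
}

/* Callers reach private state only through functions like this one. */
static int some_context_process(SomeContext *ctx)
{
    SomePrivate *priv = ctx->priv_data;
    return ++priv->frames_seen;
}

static void some_context_free(SomeContext **ctx)
{
    if (!ctx || !*ctx)
        return;
    free((*ctx)->priv_data);
    free(*ctx);
    *ctx = NULL;
}
```

Adding a member to `SomePrivate` changes nothing that a caller can see,
which is the property the quoted paragraph is describing.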
> > +
> > +### Manage lifetime: allocate, initialize and free
> > +
> > +```c
> > +void my_function( ... ) {
> > +
> > + // Context structs are allocated then initialized with associated functions:
> > +
> > + AVSomeContext *ctx = av_some_context_alloc( ... );
> > +
> > + // ... configure ctx ...
> > +
> > + av_some_context_init( ctx, ... );
> > +
> > + // ... use ctx ...
> > +
> > + // Context structs are freed with associated functions:
> > +
> > + av_some_context_free( ctx );
> > +
> > +}
> > +```
> > +
> > +FFmpeg contexts go through the following stages of life:
> > +
> > +1. allocation (often a function that ends with `_alloc`)
> > + * a range of memory is allocated for use by the structure
> > + * memory is allocated on boundaries that improve caching
> > + * memory is reset to zeroes, some internal structures may be initialized
> > +2. configuration (implemented by setting values directly on the object)
> > + * no function for this - calling code populates the structure directly
> > + * memory is populated with useful values
> > + * simple contexts can skip this stage
> > +3. initialization (often a function that ends with `_init`)
> > + * setup actions are performed based on the configuration (e.g. opening files)
> > +4. normal usage
> > + * most functions are called in this stage
> > + * documentation implies some members are now read-only (or not used at all)
> > + * some contexts allow re-initialization
> > +5. closing (often a function that ends with `_close()`)
> > + * teardown actions are performed (e.g. closing files)
> > +6. deallocation (often a function that ends with `_free()`)
> > + * memory is returned to the pool of available memory
> > +
> > +This can mislead object-oriented programmers, who expect something more like:
> > +
> > +1. allocation (usually a `new` keyword)
> > + * a range of memory is allocated for use by the structure
> > + * memory *may* be reset (e.g. for security reasons)
> > +2. initialization (usually a constructor)
> > + * memory is populated with useful values
> > + * related setup actions are performed based on arguments (e.g. opening files)
> > +3. normal usage
> > + * most functions are called in this stage
> > + * compiler enforces that some members are read-only (or private)
> > + * no going back to the previous stage
> > +4. finalization (usually a destructor)
> > + * teardown actions are performed (e.g. closing files)
> > +5. deallocation (usually a `delete` keyword)
> > + * memory is returned to the pool of available memory
> > +
> > +FFmpeg's allocation stage is broadly similar to OOP, but can do some higher-level
> > +operations. For example, AVOptions-enabled structs (discussed below) contain an
> > +AVClass member that is set during allocation.
> > +
> > +FFmpeg's "configuration" and "initialization" stages combine to resemble OOP's
> > +"initialization" stage. This can mislead object-oriented developers,
> > +who are used to doing both at once. This means FFmpeg contexts don't have
> > +a direct equivalent of OOP constructors, as they would be doing
> > +two jobs in one function.
> > +
> > +FFmpeg's three-stage creation process is useful for complicated structures.
> > +For example, AVCodecContext contains many members that *can* be set before
> > +initialization, but in practice most programs set few if any of them.
> > +Implementing this with a constructor would involve a function with a list
> > +of arguments that was extremely long and changed whenever the struct was
> > +updated. For contexts that don't need the extra flexibility, FFmpeg usually
> > +provides a combined allocator and initializer function. For historical reasons,
> > +suffixes like `_alloc`, `_init`, `_alloc_context` and even `_open` can indicate
> > +the function does any combination of allocation and initialization.
> > +
> > +FFmpeg's "closing" stage is broadly similar to OOP's "finalization" stage,
> > +but some contexts allow re-initialization after finalization. For example,
> > +SwrContext lets you call swr_close() then swr_init() to reuse a context.
> > +
> > +FFmpeg's "deallocation" stage is broadly similar to OOP, but can perform some
> > +higher-level functions (similar to the allocation stage).
> > +
> > +Very few contexts need the flexibility of separate "closing" and
> > +"deallocation" stages, so these are usually combined into a single function.
> > +Closing functions usually end with "_close", while deallocation
> > +functions usually end with "_free".
Some changes I'm planning to make in the next rewrite...
* some functions happen to end with "_finalize()" (e.g. av_bsf_list_finalize()),
but it's just a coincidence that they share a name with the OOP stage
* I think that e.g. av_mediacodec_default_free() needs to be called immediately
before (or after?) closing the associated context, which would make it a
separate "deconfiguration" stage according to the logic above.
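
For reference while rewriting, the stages quoted above can be sketched in
plain C roughly as follows. `DemoContext` and the `demo_context_*` functions
are hypothetical, and the "setup action" is just a buffer allocation standing
in for something like opening a file:

```c
#include <stdlib.h>

typedef struct DemoContext {
    size_t buf_size;    /* configuration: set directly by the caller */
    unsigned char *buf; /* state: owned by init/close */
} DemoContext;

/* allocation: zeroed memory, nothing set up yet */
static DemoContext *demo_context_alloc(void)
{
    return calloc(1, sizeof(DemoContext));
}

/* initialization: perform setup actions based on the configuration */
static int demo_context_init(DemoContext *ctx)
{
    if (ctx->buf || ctx->buf_size == 0)
        return -1; /* already initialized, or not configured */
    ctx->buf = malloc(ctx->buf_size);
    return ctx->buf ? 0 : -1;
}

/* closing: undo init, leaving the context ready to be initialized again */
static void demo_context_close(DemoContext *ctx)
{
    free(ctx->buf);
    ctx->buf = NULL;
}

/* deallocation: close if needed, free, and clear the caller's pointer */
static void demo_context_free(DemoContext **ctx)
{
    if (!ctx || !*ctx)
        return;
    demo_context_close(*ctx);
    free(*ctx);
    *ctx = NULL;
}
```

The configuration stage has no function of its own (the caller just assigns
`ctx->buf_size`), and close followed by init is allowed, loosely mirroring
the swr_close() then swr_init() case mentioned in the patch.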
> > +
> > +### Reflection: AVOptions-enabled structs
> > +
> > +Object-oriented programming puts more focus on data hiding than FFmpeg needs,
> > +but it also puts less focus on
> > +[reflection](https://en.wikipedia.org/wiki/Reflection_(computer_programming)).
> > +
> > +To understand FFmpeg's reflection requirements, run `ffmpeg -h full` on the
> > +command-line, then ask yourself how you would implement all those options
> > +with the C standard [`getopt` function](https://en.wikipedia.org/wiki/Getopt).
> > +You can also ask the same question for any other programming languages you know.
> > +[Python's argparse module](https://docs.python.org/3/library/argparse.html)
> > +is a good example - its approach works well with far more complex programs
> > +than `getopt`, but would you like to maintain an argparse implementation
> > +with 15,000 options and growing?
> > +
> > +Most solutions assume you can just put all options in a single block,
> > +which is unworkable at FFmpeg's scale. Instead, we split configuration
> > +across many *AVOptions-enabled structs*, which use the @ref avoptions
> > +"AVOptions API" to reflect information about their user-configurable members,
> > +including members in private contexts.
> > +
> > +An *AVOptions-enabled struct* is a struct that contains an AVClass element as
> > +its first member, and uses that element to provide access to instances of
> > +AVOption, each of which provides information about a single option.
> > +The AVClass can also include more @ref AVClass "AVClasses" for private contexts,
> > +making it possible to set options through the API that aren't
> > +accessible directly.
> > +
> > +AVOptions-accessible members of a context should be accessed through the
> > +AVOptions API whenever possible, even if they're not hidden away in a private
> > +context. That ensures values are validated as they're set, and means you won't
> > +have to do as much work if a future version of FFmpeg changes the layout.
> > +
> > +AVClass was created very early in FFmpeg's history, long before AVOptions.
> > +Its name suggests some kind of relationship to an OOP
> > +base [class](https://en.wikipedia.org/wiki/Class_(computer_programming)),
> > +but the name has become less accurate as FFmpeg evolved, to the point where
> > +AVClass and AVOption are largely synonymous in modern usage. The difference
> > +might still matter if you need to support old versions of FFmpeg,
> > +where you might find *AVClass context structures* (contain an AVClass element
> > +as their first member) that are not *AVOptions-enabled* (don't use that element
> > +to provide access to instances of AVOption).
One more note I'm planning for the next rewrite...
* as you've mentioned elsewhere, AVClass is involved in formatting log messages,
so if you e.g. make your own logging framework, it might help to think about
AVClass context structures as distinct from AVOptions-enabled structs
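
To make the "class pointer as first member" idea concrete, here is a rough
sketch of table-driven reflection in plain C. It is not FFmpeg's
implementation: `DemoOption`, `DemoClass`, `DemoEncoder` and
`demo_opt_set_int` are hypothetical stand-ins, and real code would go through
the AVOptions API instead:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option descriptor: one entry per user-configurable member. */
typedef struct DemoOption {
    const char *name;
    const char *help;
    size_t offset; /* where the member lives inside the struct */
    int min, max;  /* allowed range, checked when setting */
} DemoOption;

/* Hypothetical class descriptor, shared by all instances of a struct type. */
typedef struct DemoClass {
    const char *class_name;
    const DemoOption *options; /* terminated by an entry with name == NULL */
} DemoClass;

/* The class pointer is the first member, so generic code can find it
 * without knowing the concrete struct type. */
typedef struct DemoEncoder {
    const DemoClass *klass;
    int bitrate;
    int threads;
} DemoEncoder;

static const DemoOption demo_encoder_options[] = {
    { "bitrate", "target bitrate in bits/s", offsetof(DemoEncoder, bitrate), 1, 1000000 },
    { "threads", "number of worker threads", offsetof(DemoEncoder, threads), 1, 64 },
    { NULL, NULL, 0, 0, 0 },
};

static const DemoClass demo_encoder_class = { "DemoEncoder", demo_encoder_options };

/* Generic setter: works on any struct whose first member is a DemoClass pointer.
 * Returns 0 on success, -1 if out of range, -2 if the option is unknown. */
static int demo_opt_set_int(void *obj, const char *name, int value)
{
    const DemoClass *klass = *(const DemoClass **)obj;

    for (const DemoOption *o = klass->options; o->name; o++) {
        if (strcmp(o->name, name) == 0) {
            if (value < o->min || value > o->max)
                return -1;
            *(int *)((char *)obj + o->offset) = value;
            return 0;
        }
    }
    return -2;
}
```

The same table could drive help output or a command-line parser, and the
`class_name` member hints at why a logging framework might care about the
class pointer even when no options are involved.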
> > +Object-oriented programmers may be tempted to compare @ref avoptions "AVOptions"
> > +to OOP getters and setters. There is some overlap in functionality, but OOP
> > +getters and setters are usually specific to a single member and don't provide
> > +metadata about the member; whereas AVOptions has a single API that covers
> > +every option, and provides help text etc. as well.
> > +
> > +Object-oriented programmers may be tempted to compare AVOptions-accessible
> > +members of a public context to protected members of a class. Both provide
> > +global access through an API, and unrestricted access for trusted friends.
> > +But this is just a happy accident, not a guarantee.
>
> This part looks fine, although there is too much OOP jargon for my
> taste: this would make reading harder than needed for programmers not
> familiar with OOP, since they will miss many references.
>
> > +
> > +## Final example: context for a codec
> > +
> > +AVCodecContext is an AVOptions-enabled struct that contains information
> > +about encoding or decoding one stream of data (e.g. the video in a movie).
> > +It's a good example of many of the issues above.
> > +
> > +The name "AVCodecContext" tells us this is a context. Many of
> > +@ref libavcodec/avcodec.h "its functions" start with an `avctx` parameter,
> > +indicating this object provides context for that function.
> > +
> > +AVCodecContext::internal contains the private context. For example,
> > +codec-specific information might be stored here.
> > +
> > +AVCodecContext is allocated with avcodec_alloc_context3(), initialized with
> > +avcodec_open2(), and freed with avcodec_free_context(). Most of its members
> > +are configured with the @ref avoptions "AVOptions API", but for example you
> > +can set AVCodecContext::opaque or AVCodecContext::draw_horiz_band() if your
> > +program happens to need them.
> > +
> > +AVCodecContext provides an abstract interface to many different *codecs*.
> > +Options supported by many codecs (e.g. "bitrate") are kept in AVCodecContext
> > +and reflected as AVOptions. Options that are specific to one codec are
> > +stored in the internal context, and reflected from there.
> > +
> > +To support a specific codec, AVCodecContext's private context is set to
> > +an encoder-specific data type. For example, the video codec
> > +[H.264](https://en.wikipedia.org/wiki/Advanced_Video_Coding) is supported via
> > +[the x264 library](https://www.videolan.org/developers/x264.html), and
> > +implemented in X264Context.
>
> > Although included in the documentation, X264Context is not part of the public API.
>
> Why included in the doc? That is a private struct and therefore should
> not be included in the doxy.
Doxygen doesn't provide an obvious mechanism to include only the public API.
Changing that would be at best a big job, and it isn't obvious where to even
start until/unless e.g. [1] gets merged in. It seems like a better plan
to put the warning in and take it out if and when the site gets updated.
[1] https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/326031.html