[FFmpeg-devel] [PATCH v4 1/4] doc: Explain what "context" means

Andrew Sayers ffmpeg-devel at pileofstuff.org
Mon Apr 29 12:24:04 EEST 2024


Derived from detailed explanations kindly provided by Stefano Sabatini:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
---
 doc/context.md | 308 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 308 insertions(+)
 create mode 100644 doc/context.md

diff --git a/doc/context.md b/doc/context.md
new file mode 100644
index 0000000000..73297f53aa
--- /dev/null
+++ b/doc/context.md
@@ -0,0 +1,308 @@
+# Introduction to contexts
+
+Like many C projects, FFmpeg has adopted the subset of
+[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
+techniques that help solve its problems.  Object-like structures are called "contexts",
+and this document provides a general introduction to how they work.
+
+Object-oriented programming tends to focus on
+[access](https://en.wikipedia.org/wiki/Access_modifiers) as a primary concern.
+For example, members of a class are often designated "private" to indicate
+they are only accessible to code that's part of the class.  Less focus is put on
+[reflection](https://en.wikipedia.org/wiki/Reflection_(computer_programming)),
+where it is provided at all.  For example, C++ has no built-in way to serialize
+the state of an arbitrary object.
+
+Reflection is extremely important for FFmpeg, because user-facing
+options are implemented by reflecting the state of contexts.  Limiting
+access is a secondary concern, mainly important for ensuring
+implementation details can change between versions.
+
+Knowledge of object-oriented programming concepts can help when learning FFmpeg,
+but it's important not to fixate on FFmpeg's access control features,
+nor to overlook its reflection capabilities.  This document compares
+FFmpeg and OOP techniques where relevant, but does not require an understanding
+of object-oriented programming.
+
+## Example: modify text then print it
+
+The example below shows a context structure that receives input strings,
+modifies them in some context-dependent way, then prints them to a specified
+filehandle.
+
+```c
+/**
+ * Type information, accessible at runtime.
+ *
+ * Useful when configuring objects.
+ */
+enum ModifyThenPrintDialect {
+    MODIFY_THEN_PRINT_DIALECT_PLAIN_TEXT = 0,
+    MODIFY_THEN_PRINT_DIALECT_REGEX      = 1,
+    MODIFY_THEN_PRINT_DIALECT_REGEX_PCRE = 2
+};
+
+/**
+ * User-facing information about types
+ *
+ * Useful for describing contexts to the user.
+ */
+static const char *ModifyThenPrintDialectName[] = {
+    "plain text",
+    "regular expression",
+    "Perl-compatible regular expression"
+};
+
+/**
+ * Context for functions that modify strings before printing them
+ */
+struct ModifyThenPrintContext {
+
+    /**
+     * Information about the type of this particular instance
+     *
+     * Object-oriented programs would probably replace this with "sub-classes",
+     * where each class extends the base class to implement one dialect.
+     * But information about a type can also work more like "mixins",
+     * where several pieces of unrelated functionality are blended together
+     * to create an interface for a specific purpose.
+     */
+    enum ModifyThenPrintDialect dialect;
+
+    /**
+     * Internal context
+     *
+     * Contains anything that isn't part of the public interface.
+     * The most obvious OOP analogy is to "private" members that are only
+     * accessible to code that's part of the class.  But it could also contain
+     * information about "virtual functions" - functions that every sub-class
+     * guarantees to implement, but in their own class-specific way.
+     */
+    void *priv_data;
+
+    /**
+     * User-configurable options
+     *
+     * Best set through an API, but can be set directly if necessary.
+     *
+     * Data from users needs to be validated before it's set, and the API
+     * might e.g. want to update some internal buffer for performance reasons.
+     * Setting these directly is always less robust than using an API,
+     * but might be worthwhile if you're willing to spend the time checking
+     * for edge cases, and don't mind your code misbehaving if future
+     * versions change the API but not the structure.
+     *
+     * Object-oriented programs would likely make these "protected" members,
+     * initialised in a constructor and accessed with getters and setters.
+     * That means code *related* to this class can access them directly,
+     * while unrelated code accesses them indirectly by calling functions.
+     *
+     * But the "protected access" analogy is quite limited. In particular,
+     * protected members don't have any special reflective properties,
+     * whereas FFmpeg options are usually configurable by end users.
+     */
+    char *replace_this;
+    char *with_this;
+
+    /**
+     * Programmer-configurable variable
+     *
+     * Object-oriented programs would represent this as a public member.
+     */
+    FILE *out;
+
+};
+
+/**
+ * Allocate and initialize a ModifyThenPrintContext
+ *
+ * Allocates a new context, stores it in `*ctx`, then fills in some sensible defaults.
+ *
+ * We can reasonably assume this function will initialise `priv_data`
+ * with a dialect-specific object, but shouldn't make any assumptions
+ * about what that object is.
+ *
+ * Object-oriented programs would likely implement this with "allocator"
+ * and "constructor" functions (i.e. separate functions that reserve memory
+ * for an object and set its initial values).
+ */
+int ModifyThenPrintContext_alloc_context(struct ModifyThenPrintContext **ctx,
+                                         enum ModifyThenPrintDialect dialect,
+                                         FILE *out);
+
+/**
+ * Uninitialize and deallocate a ModifyThenPrintContext
+ *
+ * Does any work required by the internal context (e.g. deallocating
+ * `priv_data`), then deallocates the main context itself.
+ *
+ * Object-oriented programs would likely implement this with
+ * destructor and deallocator functions (i.e. separate functions
+ * that do the opposite of the allocator and constructor above).
+ */
+int ModifyThenPrintContext_free(struct ModifyThenPrintContext *ctx);
+
+/**
+ * Configure a ModifyThenPrintContext
+ *
+ * Checks that the arguments are valid in the context's dialect,
+ * then updates the options as necessary.
+ *
+ * Object-oriented programs would likely represent this as a
+ * pair of "setter" functions (functions whose only job is to
+ * validate and update a value).
+ */
+int ModifyThenPrintContext_configure(struct ModifyThenPrintContext *ctx,
+                                     char *replace_this,
+                                     char *with_this);
+
+/**
+ * Print a single message
+ *
+ * Object-oriented programs would likely represent this with an ordinary
+ * member function (a function called like `context->function(...)`
+ * instead of `function( context, ... )`).
+ */
+int ModifyThenPrintContext_print(struct ModifyThenPrintContext *ctx,
+                                 char *msg);
+
+/**
+ * Print the contents of a ModifyThenPrintContext to a filehandle
+ *
+ * Provides human-readable information about keys and values.
+ *
+ * Object-oriented programs would likely represent this with some kind of
+ * `serialize` function (a function that iterates through all members,
+ * printing each one's name and contents to the file).  These can be member
+ * functions, but in practice many languages implement it some other way -
+ * an example of the lack of focus on reflection in object-oriented languages.
+ */
+int ModifyThenPrintContext_dump(struct ModifyThenPrintContext *ctx,
+                                FILE *dump_fh);
+
+/**
+ * How this context might be used in practice
+ */
+int print_hello_world(void)
+{
+
+    int ret = 0;
+
+    struct ModifyThenPrintContext *ctx;
+
+    if ( ModifyThenPrintContext_alloc_context( &ctx, MODIFY_THEN_PRINT_DIALECT_REGEX, stdout ) < 0 ) {
+        ret = -1;
+        goto EXIT_WITHOUT_CLEANUP;
+    }
+
+    if ( ModifyThenPrintContext_configure(ctx, "Hi|Hullo", "Hello") < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    if ( ModifyThenPrintContext_print( ctx, "Hi, world!\n" ) < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    if ( ModifyThenPrintContext_print( ctx, "Hullo, world!\n" ) < 0 ) {
+        ret = -1;
+        goto FINISH;
+    }
+
+    FINISH:
+    if ( ModifyThenPrintContext_free( ctx ) < 0 ) {
+        ret = -1;
+        goto EXIT_WITHOUT_CLEANUP;
+    }
+
+    EXIT_WITHOUT_CLEANUP:
+    return ret;
+
+}
+```
+
+## FFmpeg context structures
+
+The example above shows a generic context structure and its associated
+functions.  Some FFmpeg contexts are no more complex than that example,
+just as some objects are just key/value stores with some functions attached.
+But here are some examples to show the variety of contexts available in FFmpeg.
+
+AVHashContext presents a generic API for hashing data.  @ref hash.h
+"Its associated functions" show how to create, use and destroy a hash.
+The exact algorithm is specified by passing a string to av_hash_alloc(),
+and the list of supported names can be retrieved from av_hash_names().
+Algorithm-specific internal context is stored in AVHashContext::ctx.
+
+AVCodecContext is a complex context representing data about an encoding or
+decoding session.  @ref avcodec.h "Its associated functions" provide an
+abstract interface to encode and decode data.  Like most widely-used contexts,
+its first member is an AVClass pointer containing instance-specific information.
+That means it's an *AVClass context structure* (discussed below).
+
+X264Context contains the internal context for an AVCodecContext that uses
+the x264 library.  @ref libx264.c "Its associated functions" provide a concrete
+implementation for interacting with libx264.  This structure is not directly
+accessible through FFmpeg's public interface, so it's easier to change
+X264Context than to change AVCodecContext between releases.
+
+## Reflection with AVClass and AVOptions
+
+AVClass is a struct containing general information about a context.
+It's a generic version of `ModifyThenPrintDialect` in the example.
+An *AVClass context structure* is a context whose first member
+is a pointer to an AVClass that reflects the context's interface.
+
+AVOption is a struct containing information about a member of a context,
+or of an internal context.  `replace_this` and `with_this` in the example
+would each have an associated AVOption value stored in the context's AVClass.
+An *@ref avoptions "AVOptions"-enabled struct* is a structure which reflects
+its contents using the @ref avoptions "AVOptions" API.
+
+The terms *AVClass context structure* and *@ref avoptions "AVOptions"-enabled
+struct* have become synonymous in modern usage, although you might notice some
+distinction when looking at code written after AVClass was developed but
+before @ref avoptions "AVOptions" was added.
+
+Even though the name "AVClass" implies an analogy to an object-oriented
+base [class](https://en.wikipedia.org/wiki/Class_(computer_programming)), AVClass
+structs behave more like [C++ concepts](https://en.wikipedia.org/wiki/Concepts_(C%2B%2B))
+or [Java interfaces](https://en.wikipedia.org/wiki/Interface_(Java)).  Unlike a
+base class, an AVClass doesn't imply any theoretical relationship between objects,
+and contexts of the same type will often have different AVClass values.  It's even
+theoretically possible for a single AVClass to be shared between contexts of
+different types.
+
+To understand how AVClass and @ref avoptions "AVOptions" work,
+consider the requirements for a `libx264` encoder:
+
+- it has to support common encoder options like "bitrate"
+- it has to support encoder-specific options like "profile"
+  - the exact options could change quickly if a legal ruling forces a change of backend
+- it must not expose implementation details (e.g. its encoding buffer)
+  - otherwise, trivial improvements would force an increase of the major API version
+- it has to provide useful feedback to users about unsupported options
+
+Common encoder options like "bitrate" need to be stored in AVCodecContext itself
+to avoid duplicating code, while encoder-specific options like "profile" have to
+be stored in the X264Context instance stored in AVCodecContext::priv_data.
+But both need to be presented to users (along with help text, default values etc.)
+in a way that makes it practical to build user interfaces to get and set those options.
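+
+The split between common and encoder-specific options can be sketched in a few
+lines of plain C.  This is a simplified illustration, not FFmpeg's implementation:
+`MiniOption`, `MiniCodecContext`, `MiniPrivContext` and `mini_set_int()` are
+hypothetical names invented for this example, and only integer options are handled.
+
+```c
+#include <stddef.h>
+#include <string.h>
+
+/* Hypothetical option descriptor: a name plus the member's byte offset. */
+struct MiniOption {
+    const char *name;
+    size_t      offset;
+};
+
+/* Hypothetical private context, standing in for X264Context. */
+struct MiniPrivContext {
+    int profile;
+};
+
+/* Hypothetical public context, standing in for AVCodecContext. */
+struct MiniCodecContext {
+    int   bit_rate;   /* common option, stored in the public struct */
+    void *priv_data;  /* encoder-specific options live behind this  */
+};
+
+static const struct MiniOption common_options[] = {
+    { "bitrate", offsetof(struct MiniCodecContext, bit_rate) },
+    { NULL, 0 }
+};
+
+static const struct MiniOption private_options[] = {
+    { "profile", offsetof(struct MiniPrivContext, profile) },
+    { NULL, 0 }
+};
+
+/* Store value in the member described by the matching option, if any. */
+static int set_in(void *obj, const struct MiniOption *opts,
+                  const char *name, int value)
+{
+    for (; opts->name; opts++) {
+        if (!strcmp(opts->name, name)) {
+            *(int *)((char *)obj + opts->offset) = value;
+            return 0;
+        }
+    }
+    return -1;
+}
+
+/* Try the public context first, then fall back to the private one. */
+int mini_set_int(struct MiniCodecContext *ctx, const char *name, int value)
+{
+    if (set_in(ctx, common_options, name, value) == 0)
+        return 0;
+    if (ctx->priv_data &&
+        set_in(ctx->priv_data, private_options, name, value) == 0)
+        return 0;
+    return -1; /* unknown option: the caller can report it to the user */
+}
+```
+
+A caller sets "bitrate" and "profile" through the same function without knowing
+which structure holds each value, and gets an error for a name neither table lists.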
+
+A context's AVClass contains a list of AVOption members, describing the
+user-configurable options in the class.  It also contains a tree of further AVClass
+members that represent internal context either for the current object or for any
+object of the same type that could potentially exist in the current version of FFmpeg.
+
+The @ref avoptions "AVOptions" API accepts any AVClass context structure,
+looks through its AVOption data, and uses that to inspect and modify
+the structure.  Because the API doesn't contain any type-specific information,
+it can be used to create a general user interface that adapts automatically
+when e.g. a new version of FFmpeg adds a new configuration option.
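+
+A minimal sketch of that idea, assuming a cut-down design rather than the real
+API: `DemoOption`, `DemoClass`, `DemoScaler`, `demo_set_defaults()` and
+`demo_show_options()` are hypothetical stand-ins written for this example.
+The generic functions know nothing about `DemoScaler` itself.  They only rely on
+the first member being a pointer to a class that describes the options.
+
+```c
+#include <stddef.h>
+#include <stdio.h>
+
+/* Hypothetical, cut-down stand-ins for AVOption and AVClass. */
+struct DemoOption {
+    const char *name;
+    const char *help;
+    size_t      offset;       /* where the int member lives in the context */
+    int         default_val;
+};
+
+struct DemoClass {
+    const char              *class_name;
+    const struct DemoOption *options;   /* terminated by a NULL name */
+};
+
+/* Any context works with the generic code below, as long as its
+ * first member is a pointer to its DemoClass. */
+struct DemoScaler {
+    const struct DemoClass *klass;
+    int width;
+    int height;
+};
+
+static const struct DemoOption scaler_options[] = {
+    { "width",  "output width in pixels",  offsetof(struct DemoScaler, width),  640 },
+    { "height", "output height in pixels", offsetof(struct DemoScaler, height), 480 },
+    { NULL, NULL, 0, 0 }
+};
+
+static const struct DemoClass scaler_class = { "DemoScaler", scaler_options };
+
+/* Generic: reset every described option to its default value. */
+void demo_set_defaults(void *obj)
+{
+    const struct DemoClass *c = *(const struct DemoClass **)obj;
+    const struct DemoOption *o;
+
+    for (o = c->options; o->name; o++)
+        *(int *)((char *)obj + o->offset) = o->default_val;
+}
+
+/* Generic: print each option's name, current value and help text.
+ * Returns the number of options printed. */
+int demo_show_options(const void *obj, FILE *out)
+{
+    const struct DemoClass *c = *(const struct DemoClass *const *)obj;
+    const struct DemoOption *o;
+    int n = 0;
+
+    for (o = c->options; o->name; o++, n++)
+        fprintf(out, "%s.%s = %d (%s)\n", c->class_name, o->name,
+                *(const int *)((const char *)obj + o->offset), o->help);
+    return n;
+}
+```
+
+Adding a third entry to `scaler_options` would make it appear in the output of
+`demo_show_options()` without any change to the generic functions, which is the
+property that lets a user interface adapt automatically.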
+
+## Summary
+
+FFmpeg uses "contexts" in ways that often resemble object-oriented programming.
+But it focuses less on encapsulation within arbitrarily complex systems,
+and more on providing reflectivity to make good user interfaces.
-- 
2.43.0


