[FFmpeg-devel] [PATCH] Complete rewrite of the "fps" video filter section. More accurate.

Fri May 1 00:09:13 EEST 2020

On 2020-04-28 08:50, Carl Eugen Hoyos wrote:

> Am Mo., 27. Apr. 2020 um 08:18 Uhr schrieb <list+ffmpeg-dev at jdlh.com>:
>
> The suggested patch is currently not ok.
Thank you for your review, Carl Eugen. The patch will be the better for 
your comments.

My responses interleaved.

…[snip]…

>> -Convert the video to specified constant frame rate by duplicating or dropping
>> -frames as necessary.
>> +Generate a video, having the specified constant frame rate, from the frames of
> I am not saying that "convert" is ideal but "Generate a video" is simply wrong.

I will look at all the word choices for verbs and see if I can make this 
clearer.

The word "generate" was a response to the the original text using the 
word "convert". "Convert" gave me the impression that the filter might 
just change time base and frame rate metadata parameters, but leave the 
rest unchanged. The *fps* filter does more than that. It alters the PTS 
values. It can add frames. It can drop frames. To my reading the output 
video is a new thing, not the original thing with some tweaks.

>> +the input video, by copying or duplicating or dropping input frames based on
> What is the difference between "copying" and "duplicating"?

Good point. I will look harder at these word choices.

What I am trying to convey is that the filter can do one of three things 
to the input frames: pass them through, or drop some of them, or repeat 
some of them. I will look for words which communicate this three-way path.

>> +their input presentation time stamps (PTSs). The output video has new PTSs. You
>> +can choose the method for rounding from input PTS to output PTS.
>>
>>   It accepts the following parameters:
>>   @table @option
>>
>>   @item fps
>> -The desired output frame rate. The default is @code{25}.
>> +The output frame rate, as a number of frames per second. This value can be an
>> +integer, real, or rational number, or an abbreviation. The default is @code{25}.
>>
>>   @item start_time
>> -Assume the first PTS should be the given value, in seconds. This allows for
>> -padding/trimming at the start of stream. By default, no assumption is made
>> -about the first frame's expected PTS, so no padding or trimming is done.
>> -For example, this could be set to 0 to pad the beginning with duplicates of
>> -the first frame if a video stream starts after the audio stream or to trim any
>> -frames with a negative PTS.
>> +The time, in seconds from the start of the input stream, which is converted to
>> +an input starting PTS value and an output starting PTS value.
>> +If set, @var{fps} drops input frames
>> +which have PTS values less than the input starting PTS. If not set, the input
>> +and output starting PTS values are zero, but @var{fps} drops no input frames based
>> +on PTS.
> This is different than the explanation above and (hopefully) wrong.

I based my new text on reading the code. Take a look for yourself: 
ffmpeg/libavfilter/vf_fps.c:159-174 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L159-L174>, 
within config_props(), and ffmpeg/libavfilter/vf_fps.c:195-204 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L195-L204>, 
within read_frame().

I think the first sentence of the old text, "Assume the first PTS should 
be the given value", doesn't match what the code does. The code does not 
include a clear high-level statement of what the parameter does. I think 
that means we have to take the implementation as correct, and document 
what it actually does.

The rest of the old text is actually a guide on how to use the 
`start_time` parameter. I kept that, but moved it down to after the 
parameter definition table.

>> +(See details below.)
>>
>>   @item round
>> -Timestamp (PTS) rounding method.
>> +Rounding method to use when calculating output Presentation Timestamp
>> +(PTS) integer values from input PTS values. If the calculated output PTS value
>> +is not exactly an integer, then the method determines which of the two
>> +neighbouring integer values to choose.
> I do not see any improvement with this change.

The old text fails to answer these questions: what is being rounded? 
What is the calculation which leads to the result being rounded? What is 
the frame of reference for "towards 0", "towards infinity"? What 
possible values does the rounding choose among? The new text tries to 
answer those questions.

There may be a better way to answer these questions. I will try to think 
of better wording. But I think the documentation should provide answers.

>> -Possible values are:
>> +Possible method names are:
> Not being a native speaker, the change makes this harder to read.
> ("Verschlimmbesserung")
As I replied to Josh, I think this is a wording style point.  This 
section uses the word "value" to indicate numerical quantities, and the 
strings passed to the `round` parameter are not numerical. It seems 
clearer to use a different word, and to tie it to "method". But if this 
is all that stands in the way of getting the patch accepted, I will concede.
>>   @table @option
>>   @item zero
>>   round towards 0
>> @@ -11170,43 +11177,167 @@ round towards -infinity
>>   @item up
>>   round towards +infinity
>>   @item near
>> -round to nearest
>> +round to nearest (and if exactly at midpoint, away from 0)
> I wonder if this implementation detail should be documented.
> Are you sure that the current behaviour is intended and must
> not change?

Again, the code does not provide a high-level statement of what it 
intends. It does not separate essence of interface from accident of 
implementation. However, one clue is that the `near` value for `round` 
corresponds to AV_ROUND_NEAR_INF, which is defined in 
ffmpeg/libavutil/mathematics.c:84 
<https://github.com/FFmpeg/FFmpeg/blob/a0ac49e38ee1d1011c394d7be67d0f08b2281526/libavutil/mathematics.h#L84>, 
where it says:

AV_ROUND_NEAR_INF = 5, ///< Round to nearest and halfway cases away from zero.

So, is that comment in *mathematics.c* an interface specification?  Is 
the use of that value in *vf_fps.c* an interface specification? It sure 
would be nice if the code had a high-level statement of intent. But it 
seems not to.

I don't feel strongly about this point. If you all tell me to leave this 
halfway case behaviour out of the documentation, I can live with that.

>>   @end table
>>   The default is @code{near}.
>>
>>   @item eof_action
>> -Action performed when reading the last frame.
>> +Action which @var{fps} takes with the final input frame.
> "Verschlimmbesserung", see above.

The new text makes two changes to the old:

1. The new text is clear that the last /input/ frame, not output frame, 
is what the parameter controls. The old text is ambiguous.

2. The new text uses the active voice ("@var{fps} takes"). The old text 
uses "when" plus a present progressive clause. The style guides I 
remember say that active voice is more lively than indirect 
constructions. In any case, it looks like the old text should have used 
"while" instead of "when" 
<https://english.stackexchange.com/a/17318/33109>.  (Thank you for 
leading me to a discussion of subject complements, by the way.)

I care about #1, because I want the documentation to say what the filter 
does. I am flexible on #2.

>
>> The input video passes
>> +in an ending input PTS, which @var{fps} converts to an ending output PTS.
>> + at var{fps} drops any input frames with a PTS at or after this ending PTS.
> What is this supposed to mean / isn't this wrong?

I guess the wording is unclear. The code is even more unclear.

Take a look: ffmpeg/libavfilter/vf_fps.c:234-245 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L234-L245>, 
within write_frame(). There is a comment, "If we have status (EOF) set, 
drop frames when we hit the status timestamp." I have a hard time 
relating it to what the code does.

If you can clarify what the code a) actually does, and b) is supposed to 
do, I am glad to try to describe that more clearly.

>>   Possible values are:
>>   @table @option
>>   @item round
>> -Use same timestamp rounding method as used for other frames.
>> +Use same rounding method as for other frames, to convert the ending input PTS
>> +to output PTS.
> Useless change.

"Useless"? No, it restates for clarity what the rounding operates on.

However, "not very useful, and not worth the cost in extra words"?  
Maybe so. I will take another look at these words when I edit to reduce 
word count.

>> +
>>   @item pass
>> -Pass through last frame if input duration has not been reached yet.
>> +Round the ending input PTS using @code{up}. This can have the effect of passing
>> +through one last input frame.
>>   @end table
>> +
>>   The default is @code{round}.
>>
>>   @end table
>>
>> -Alternatively, the options can be specified as a flat string:
>> +Alternatively, the options may be specified as a flat string:
>>   @var{fps}[:@var{start_time}[:@var{round}]].
> The remaining part is far too long and not acceptable.

By "the remaining part" I understand you to mean the remaining three 
paragraphs in the main section, and the entire subsection "Details". I 
presume you still want to keep the "Examples" subsection.

I think what I put in those paragraphs is all true. It all accurately 
describes the code. It all explains points which I found to be necessary 
to understand as I was trying, as a user, to figure out if the *fps* 
filter would accomplish a purpose I wanted to do.

> Allow me to repeat: The patch is currently not ok.

Your earlier comments are about specific points. Some I agree with, some 
I disagree with. But I can imagine working with you to resolve them.

Your final comment, "The remaining part is far too long and not 
acceptable", comes across as a rejection of the very idea of improving 
*FFmpeg* documentation.

*FFmpeg* is a complex tool. Even apparently simple filters are complex. 
The present documentation is inadequate to explain to a user how the 
code behaves. This presents a big obstacle for *FFmpeg* users. Traffic 
on the ffmpeg-users list proves this constantly.

What do you have against removing this obstacle? What do you have 
against documentation which describes what the code actually does?

I will go through all these comments and make a revised patch. Let's see 
how that looks.

> Carl Eugen
Best regards,

      —Jim DeLaHunt, software engineer, Vancouver, Canada