[FFmpeg-devel] [PATCH] [2/??] [3/3] Filter graphs - Parser for a graph description

Tue Mar 18 10:29:51 CET 2008

Hello.

	I have some real-life experience with representing graphs and
believe I can help here (though I am not yet an expert on ffmpeg; I am
interested in libavfilter and am learning from it).

	There are essentially two ways to represent the edges out of a
graph's node, and the problem here stems from trying to use both at
once: you can give them explicit names (as in (tmp), [tmp] or [+tmp]),
or implicitly consider them numbered (1st, 2nd, 3rd, ...). As a
filter's pads are already numbered in AVFilter, I suggest the second
option is the better one here: more intuitive and much simpler. (The
problem with names is that you pay for the greater syntactic
flexibility with a lot of special cases and meaningless forms to
handle.)

	Vitor is right that you need two graph combinators (in fact you might
need a third one and a few more symbols; more below), say , and * (the
choice is immaterial, of course -- traditionally ; would be chosen
instead of , and some tensor product symbol instead of *). The basic
rules are:

(A)	filt1,filt2	means 	---> filt1 ---> filt2 --->

and is only well formed when the number of filt1's output streams
equals the number of filt2's input streams; the result is a filter
with filt1's input streams and filt2's output streams; filt1's output
streams are fed to filt2's input streams in order: 1st to 1st, 2nd to
2nd, etc.

(B) filt1 * filt2	means	

---> filt1 --->
---> filt2 --->

and is always well formed; the result is a filter with the input and
output streams of both filt1 and filt2, numbered starting from
filt1's.
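
To make rules (A) and (B) concrete, here is a minimal Python sketch that
looks at a filter only through its number of input and output streams.
The names Filt, seq and par are hypothetical, invented for this
illustration, and the arity given for overlay is an assumption; none of
this is libavfilter code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filt:
    """A filter seen only through its arity (hypothetical helper type)."""
    name: str
    n_in: int
    n_out: int


def seq(a: Filt, b: Filt) -> Filt:
    """Rule (A), written a,b: well formed only if a's outputs match b's inputs."""
    if a.n_out != b.n_in:
        raise ValueError(
            f"{a.name},{b.name}: {a.n_out} outputs fed to {b.n_in} inputs")
    return Filt(f"{a.name},{b.name}", a.n_in, b.n_out)


def par(a: Filt, b: Filt) -> Filt:
    """Rule (B), written a*b: always well formed, a's streams are numbered first."""
    return Filt(f"({a.name}*{b.name})", a.n_in + b.n_in, a.n_out + b.n_out)


scale = Filt("scale", 1, 1)
crop = Filt("crop", 1, 1)
overlay = Filt("overlay", 2, 1)   # arity assumed for illustration

chain = seq(scale, crop)          # 1 ---> 1
pair = par(scale, crop)           # 2 ---> 2
merged = seq(pair, overlay)       # 2 ---> 1
```

Under these assumptions seq(scale, overlay) is rejected, because one
output stream cannot feed two inputs.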

It is easy to make the entire filter system work smoothly based on  
these rules only. I proceed with the details, just in case I managed  
to interest you.

To make this work, we need a few elementary filters which don't do any  
processing, just rearrange streams:

(C) split : 1 ---> 2	(by which I mean takes one input stream, produces  
two output streams) to duplicate a stream;

(D) swap: 2 ---> 2 to swap the order of streams (1st in input becomes  
second in output and vice versa);

(E) nop: 1 ---> 1 to do nothing; kill: 1 ---> 0 to drop a stream

(it's worth noticing that if one wishes to use split_n : 1 ---> n to  
mean duplicate the input stream n times, then kill = split_0 and nop =  
split_1)
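
As a toy model of (C), (D) and (E), the rearranging filters can be
sketched in Python as functions from a list of input streams to a list
of output streams. This is only an illustration: a "stream" here is any
Python value, and split_n is the hypothetical generalisation mentioned
above.

```python
def split_n(n):
    """Return a 1 ---> n filter that duplicates its single input n times."""
    def run(streams):
        (s,) = streams          # fails unless there is exactly one input
        return [s] * n
    return run


split = split_n(2)   # (C) 1 ---> 2
nop = split_n(1)     # (E) 1 ---> 1
kill = split_n(0)    # (E) 1 ---> 0


def swap(streams):
    """(D) 2 ---> 2: the 1st input becomes the 2nd output and vice versa."""
    a, b = streams
    return [b, a]
```

In this model split(["v"]) gives two copies of "v", kill(["v"]) gives
the empty list, and swap(["a", "b"]) gives ["b", "a"].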

For instance (assuming as usual that * binds tighter than , ):

	split, nop*rotate, picInPic

would be a filter 1 ---> 1

	nop * (movie_src=test.avi, vflip), overlay

would be a filter 1 ---> 1 (note that nop is essential in both
examples)

etc.
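
The two examples above can be checked mechanically. The following is a
rough Python sketch of a recursive-descent parser that computes the
arity of a description, with * binding tighter than , as assumed above.
The ARITY table is my assumption (real values would come from
AVFilter), the function name arity is hypothetical, and filter
arguments after = are simply ignored.

```python
import re

# Assumed (inputs, outputs) per filter name, for illustration only.
ARITY = {"split": (1, 2), "swap": (2, 2), "nop": (1, 1), "kill": (1, 0),
         "rotate": (1, 1), "vflip": (1, 1), "picInPic": (2, 1),
         "overlay": (2, 1), "movie_src": (0, 1)}


def arity(text):
    """Return (inputs, outputs) of a description, or raise ValueError."""
    tokens = re.findall(r"[,*()]|[^\s,*()]+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = take()
        if tok == "(":
            result = chain()
            if take() != ")":
                raise ValueError("missing )")
            return result
        if tok is None or tok in (",", "*", ")"):
            raise ValueError(f"unexpected {tok!r}")
        return ARITY[tok.split("=")[0]]     # drop "=args" if present

    def product():
        # Rule (B): '*' binds tighter than ',', arities add up.
        n_in, n_out = factor()
        while peek() == "*":
            take()
            i, o = factor()
            n_in, n_out = n_in + i, n_out + o
        return n_in, n_out

    def chain():
        # Rule (A): outputs of the left side must match inputs of the right.
        n_in, n_out = product()
        while peek() == ",":
            take()
            i, o = product()
            if n_out != i:
                raise ValueError(f"{n_out} outputs fed to {i} inputs")
            n_out = o
        return n_in, n_out

    result = chain()
    if peek() is not None:
        raise ValueError(f"unexpected {peek()!r}")
    return result
```

With this table, both "split, nop*rotate, picInPic" and
"nop * (movie_src=test.avi, vflip), overlay" come out as (1, 1), while
"split, rotate, picInPic" is rejected because the nop is missing.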

(F) If I understood your examples correctly (which I might not have),  
there is still one missing ingredient: a feedback combinator, say !,  
which takes a filter with at least one input and at least one output  
stream and feeds back the first output to the first input. For  
instance, let's look at the examples you have been discussing in this  
thread:

(1)
> in --> scale --> crop --> picInPic --> rotate --> split --> out
>                             ^                      |
>                             |                      |
>                           delay<-------------------/

becomes: scale,crop, !( picInPic, rotate, split, delay * nop ) -- in  
fact the duplicated stream from split is fed to delay and delay's  
output is then fed back to picInPic (if that was the sense of the  
example, of course...)

(2) Similarly,

>  in --> crop --> picInPic --> rotate --> split --> vflip --> out
>                     ^                      |
>                     |                      |
>                   delay<---- hflip --------/

becomes: crop, !( picInPic, rotate, split, (hflip, delay) * vflip )

(3)

> filter1->(abc)
> (def)->filter2
>
becomes: filter1 * filter2

> filter1->filter2
>  |         ^
>  v         |
> (abc)     (def)

becomes: filter1, split, (filter2 * nop)  (note of course that you
can't add input streams: the number of a filter's required
input/output streams is determined by the filter itself, so you can't
change it like that, only duplicate streams)

(4)

> [tmp]myfilter="crop=123:345,scale=333:555,[tmp]picInPic"
> -vf mirror,[anotherin]myfilter

becomes: myfilter="split, (crop,scale) * nop, picInPic" and -vf  
mirror,myfilter

(5)

> movie_src=test.avi, vflip, (in)overlay(out)

becomes nop*(movie_src=test.avi, vflip), overlay
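
As a sanity check on the rewrites in (1) to (4), here is a small Python
sketch that tracks arities only, as (inputs, outputs) pairs. It assumes
my reading of the feedback combinator: wiring the first output back to
the first input removes one input and one output from the result. The
helper names seq, par and feedback are hypothetical, and every plain
filter is assumed to be 1 ---> 1.

```python
from functools import reduce


def seq(*fs):
    """f1,f2,...: chain arities left to right, checking rule (A)."""
    def step(a, b):
        if a[1] != b[0]:
            raise ValueError(f"{a[1]} outputs fed to {b[0]} inputs")
        return (a[0], b[1])
    return reduce(step, fs)


def par(*fs):
    """f1*f2*...: rule (B), inputs and outputs simply add up."""
    return (sum(f[0] for f in fs), sum(f[1] for f in fs))


def feedback(f):
    """!f: the first output is wired back to the first input."""
    if f[0] < 1 or f[1] < 1:
        raise ValueError("! needs at least one input and one output")
    return (f[0] - 1, f[1] - 1)


one = (1, 1)    # scale, crop, rotate, delay, hflip, vflip, nop, filter1, ...
split, pic_in_pic = (1, 2), (2, 1)

# (1) scale,crop, !( picInPic, rotate, split, delay * nop )
ex1 = seq(one, one, feedback(seq(pic_in_pic, one, split, par(one, one))))
# (2) crop, !( picInPic, rotate, split, (hflip, delay) * vflip )
ex2 = seq(one, feedback(seq(pic_in_pic, one, split, par(seq(one, one), one))))
# (3) filter1, split, (filter2 * nop)
ex3 = seq(one, split, par(one, one))
# (4) split, (crop,scale) * nop, picInPic
ex4 = seq(split, par(seq(one, one), one), pic_in_pic)
```

Under these assumptions (1), (2) and (4) are 1 ---> 1 filters, and the
second form in (3) is 1 ---> 2.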

Finally, with the (in) and (out) in the syntax, can you really express
which of (in) and test.avi is the first input? This may be irrelevant
for overlay, but it certainly matters for other filters, e.g.
picInPic. With the syntax I propose, the two versions would be:

	nop*(movie_src=test.avi, vflip), picInPic

which differs from

	nop*(movie_src=test.avi, vflip), swap, picInPic

which is the same as

	(movie_src=test.avi, vflip)*nop, picInPic

The last equivalence reflects the fact that all this is mathematically
very solid and well defined, governed by a small set of equations
(e.g. swap, vflip*hflip = hflip*vflip, swap).
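
The equation swap, vflip*hflip = hflip*vflip, swap can be checked on a
toy model. In this Python sketch a frame is just a tuple of row
strings, and vflip and hflip are stand-ins written for the
illustration, not the real filters; par is restricted to two 1 ---> 1
filters to keep it short.

```python
def vflip(frame):
    """Flip a toy frame (a tuple of row strings) upside down."""
    return frame[::-1]


def hflip(frame):
    """Mirror a toy frame left to right."""
    return tuple(row[::-1] for row in frame)


def par(f, g):
    """f*g for two 1 ---> 1 filters: f gets stream 1, g gets stream 2."""
    return lambda s: [f(s[0]), g(s[1])]


def seq(*stages):
    """s1,s2,...: run the stages left to right on the list of streams."""
    def run(streams):
        for stage in stages:
            streams = stage(streams)
        return streams
    return run


def swap(s):
    """2 ---> 2: exchange the two streams."""
    return [s[1], s[0]]


a = ("ab", "cd")
b = ("12", "34")

lhs = seq(swap, par(vflip, hflip))([a, b])   # swap, vflip*hflip
rhs = seq(par(hflip, vflip), swap)([a, b])   # hflip*vflip, swap
same = lhs == rhs
```

On this sample both sides produce the vertically flipped b followed by
the horizontally flipped a, so same is True.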

I find this syntax clearly superior (there are only two
well-formedness rules, and no clumsy handling of names, aliases and
danglers). If there is interest in this, I will elaborate.

Regards.