[FFmpeg-devel] [GSoC] Motion Interpolation

Davinder Singh ds.mudhar at gmail.com
Wed Jun 1 00:43:38 CEST 2016

There’s a lot of research done on Motion Estimation. Depending upon the
intended application of the resultant motion vectors, the method used for
motion estimation can be very different.

Classification of Motion Estimation Methods:

Direct Methods: estimate the motion (optical flow
<https://en.wikipedia.org/wiki/Optical_flow>) in the scene directly from the pixel data.

- Phase Correlation

- Block Matching

- Spatio-Temporal Gradient

 - Optical flow: uses the optical flow equation to find motion in the scene.

 - Pel-recursive: also computes optical flow, but in a way that allows
recursive computation of the vector fields.

Indirect Methods

- Feature-based Method: finds features in the frames and uses them for estimation.

Here are some papers on Frame Rate Up-Conversion (FRUC):

Phase Correlation:

This method relies on a frequency-domain representation of the data,
calculated using the fast Fourier transform
<https://en.wikipedia.org/wiki/Fast_Fourier_transform>. Phase correlation
provides a correlation surface from the comparison of images. This enables
the identification of motion on a pixel-by-pixel basis for correct
processing of each motion type. Since phase correlation operates in the
frequency rather than the spatial domain, it is able to zero in on details
while ignoring factors such as noise and grain within the picture. In other
words, the system is highly tolerant of the noise variations and rapid
changes in luminance levels that are found in many types of content –
resulting in high-quality performance on fades, objects moving in and out
of shade, and light flashes.
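As a toy illustration of the principle (not FFmpeg code): in 1-D, the
normalized cross-power spectrum of two signals is a pure phase ramp whose
inverse transform peaks at the displacement. The sketch below uses a naive
Python DFT standing in for a real FFT; signal values and the shift are
arbitrary example inputs.

```python
# Toy 1-D phase correlation (illustrative sketch, not FFmpeg code).
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; a real implementation would use an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def phase_correlate(a, b):
    """Return the circular shift d such that b[n] == a[n - d]."""
    fa, fb = dft(a), dft(b)
    cross = []
    for x, y in zip(fa, fb):
        v = x.conjugate() * y              # cross-power spectrum
        cross.append(v / (abs(v) or 1.0))  # normalize: keep phase only
    surface = dft(cross, inverse=True)     # correlation surface
    return max(range(len(surface)), key=lambda i: surface[i].real)

a = [0, 1, 3, 7, 3, 1, 0, 0]
b = a[-2:] + a[:-2]            # a circularly shifted right by 2
print(phase_correlate(a, b))   # → 2
```

Because only the phase is kept, the peak location is insensitive to global
luminance changes, which is exactly the robustness described above.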


[1] "Disney Research » Phase-Based Frame Interpolation for Video." IEEE
CVPR 2015 <https://www.disneyresearch.com/publication/phasebased/>

[2] Yoo, DongGon et al. "Phase Correlated Bilateral Motion Estimation for
Frame Rate Up-Conversion." The 23rd International Technical Conference on
Circuits/Systems, Computers and Communications (ITC-CSCC), Jul. 2008.


The video on the page for paper [1] demonstrates a comparison between various methods.

Optical Flow:


[3] Brox et al. "High accuracy optical flow estimation based on a theory
for warping." Computer Vision - ECCV 2004: 25-36.


The open-source project slowmoVideo <http://slowmovideo.granjow.net/> is
based on the optical flow equation.
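For intuition, the brightness-constancy constraint behind optical flow,
I_x * u + I_t = 0, can be solved for the displacement u by least squares
over a window (the Lucas-Kanade idea, reduced here to one dimension). This
is only a sketch of the equation with made-up sample data, not
slowmoVideo's implementation:

```python
# 1-D brightness-constancy sketch (Lucas-Kanade idea in one dimension).

def estimate_shift(frame0, frame1):
    """Least-squares solution of I_x * u + I_t = 0 over interior samples."""
    num = den = 0.0
    for i in range(1, len(frame0) - 1):
        ix = (frame0[i + 1] - frame0[i - 1]) / 2.0  # spatial gradient I_x
        it = frame1[i] - frame0[i]                  # temporal difference I_t
        num -= ix * it
        den += ix * ix
    return num / den

# A smooth ramp translated right by 0.5 samples:
f0 = [i * i / 10.0 for i in range(10)]
f1 = [(i - 0.5) ** 2 / 10.0 for i in range(10)]
print(round(estimate_shift(f0, f1), 2))  # ≈ 0.48 (close to the true 0.5)
```

Note the sub-pixel result: gradient-based methods recover fractional
motion directly, which block matching only gets via sub-pel refinement.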

The algorithm we can implement is based on the block-matching method.

Motion Compensated Frame Interpolation


[4] Zhai et al. "A low complexity motion compensated frame interpolation
method." IEEE ISCAS 2005: 4927-4930.


Block-based motion estimation and pixel-wise motion estimation are the two
main categories of motion estimation methods. In general, pixel-wise motion
estimation can attain accurate motion fields, but needs a substantial
amount of computation. In contrast, block matching algorithms (BMA) can be
efficiently implemented and provide good performance.
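A minimal full-search BMA sketch (illustrative only; the block size, search
range, and frame contents below are arbitrary example choices, not the
filter's parameters):

```python
# Full-search block matching: minimize SAD over a search window.

def sad(cur, ref, bx, by, dx, dy, bs):
    """Sum of absolute differences between a bs x bs block of the current
    frame at (bx, by) and the reference block displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(bs) for x in range(bs))

def full_search(cur, ref, bx, by, bs, sr):
    """Exhaustively test every displacement within +/- sr pixels."""
    h, w = len(cur), len(cur[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-sr, sr + 1):
        for dx in range(-sr, sr + 1):
            if (0 <= bx + dx and bx + dx + bs <= w and
                    0 <= by + dy and by + dy + bs <= h):
                cost = sad(cur, ref, bx, by, dx, dy, bs)
                if cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best

# 8x8 frames; a 2x2 bright patch moves from (2,2) to (4,3).
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
ref[2][2] = ref[2][3] = ref[3][2] = ref[3][3] = 255
cur[3][4] = cur[3][5] = cur[4][4] = cur[4][5] = 255
print(full_search(cur, ref, 4, 3, 2, 3))  # → (-2, -1), pointing back to ref
```

The O(sr^2) cost per block is what the fast patterns below (diamond, hex,
UMH, star) are designed to avoid.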

Most MCFI algorithms utilize the block-matching algorithm (BMA) for motion
estimation (ME). BMA is simple and easy to implement, and it generates a
compactly represented motion field. However, unlike in video compression,
it is more important in MCFI to find the true motion trajectories. The
objective of MC in MCFI is not to minimize the energy of the MC residual
signals, but to reconstruct interpolated frames with better visual quality.

The algorithm uses motion vectors that are embedded in the bitstream. If
the vectors exported by the codec (using the +export_mvs flag) are used
when available, the computation of motion vectors will be significantly
reduced for realtime playback. Otherwise, the mEstimate filter will
generate the MVs; to make the process faster, the same algorithms used by
x264 and x265 (Diamond, Hex, UMH, Star) will be implemented in the filter.
The other filter, mInterpolate, will use the MVs in the frame side data to
interpolate frames using various methods: OBMC (overlapped block motion
compensation), simple frame blending, frame duplication, etc.
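For a flavour of the fast search patterns, here is a hedged sketch of a
small-diamond refinement loop (the "dia" strategy in x264 terms). The cost
function is an arbitrary convex surface standing in for a block SAD; the
start point and range are made-up parameters:

```python
# Small-diamond-pattern search: probe the four diamond neighbours of the
# current best candidate; stop when the centre beats all of them.

def diamond_search(cost, start=(0, 0), max_range=16):
    best = start
    best_cost = cost(best)
    while True:
        moved = False
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cand = (best[0] + dx, best[1] + dy)
            if max(abs(cand[0]), abs(cand[1])) > max_range:
                continue  # stay inside the search range
            c = cost(cand)
            if c < best_cost:
                best, best_cost, moved = cand, c, True
        if not moved:
            return best  # local minimum of the cost surface

# Stand-in cost surface with its minimum at (3, -2):
print(diamond_search(lambda v: (v[0] - 3) ** 2 + (v[1] + 2) ** 2))  # → (3, -2)
```

On a well-behaved cost surface this visits only a handful of candidates
instead of the full (2*sr+1)^2 window, which is the whole point of these
patterns.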

However, MVs generated based on SAD or BAD might bring serious artifacts if
they are used directly. So, the algorithm first examines the motion vectors
and classifies them into two groups: one with vectors considered to
represent “true” motion, the other with “bad” vectors. It then carries out
overlapped-block bidirectional motion estimation on the blocks having “bad”
MVs. Finally, it utilizes motion vector post-processing and overlapped
block motion compensation to generate the interpolated frames and further
reduce blocking artifacts. Details of each step are in the paper.
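One common form of the classification step (a hedged illustration, not
necessarily the exact test used in [4]) flags a vector as "bad" when it
deviates too far from the component-wise median of its neighbours; the
threshold and field below are made-up examples:

```python
# Screen a motion-vector field for outliers against the neighbourhood median.

def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

def classify_mvs(field, thresh=2):
    """field: 2-D grid of (dx, dy) per block; returns a same-shape grid,
    True where the vector is consistent with its neighbourhood."""
    h, w = len(field), len(field[0])
    ok = [[True] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [field[y + j][x + i]
                    for j in (-1, 0, 1) for i in (-1, 0, 1)
                    if (i or j) and 0 <= y + j < h and 0 <= x + i < w]
            mx = median([v[0] for v in nbrs])
            my = median([v[1] for v in nbrs])
            dx, dy = field[y][x]
            if abs(dx - mx) + abs(dy - my) > thresh:
                ok[y][x] = False  # candidate for bilateral re-estimation
    return ok

# A uniform field with one outlier vector:
field = [[(1, 0)] * 4 for _ in range(4)]
field[1][2] = (9, 9)
print(classify_mvs(field)[1][2])  # → False (the outlier is flagged)
```

Only the flagged blocks then pay for the more expensive bidirectional
re-estimation, which keeps the overall complexity low.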

Paper 2:

[5] Choi et al. "Motion-compensated frame interpolation using bilateral
motion estimation and adaptive overlapped block motion compensation." IEEE
Transactions on Circuits and Systems for Video Technology, 2007: 407-416.
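The bilateral idea can be sketched as follows: the in-between frame samples
the previous frame half a vector back and the next frame half a vector
forward, so every interpolated pixel receives exactly one value (no holes
or overlaps, unlike forward warping). A 1-D toy with a single global
integer vector, not the paper's block-wise adaptive scheme:

```python
# Bilateral interpolation of the mid-frame along a single motion vector.

def interpolate_midframe(prev, nxt, mv):
    half = mv // 2  # integer half-vector; real code interpolates sub-pel
    n = len(prev)
    mid = []
    for x in range(n):
        a = prev[min(max(x - half, 0), n - 1)]  # sample prev half a MV back
        b = nxt[min(max(x + half, 0), n - 1)]   # sample next half a MV ahead
        mid.append((a + b) // 2)
    return mid

prev = [0, 0, 255, 0, 0, 0]
nxt  = [0, 0, 0, 0, 255, 0]   # the pulse moved right by 2
print(interpolate_midframe(prev, nxt, 2))  # → [0, 0, 0, 255, 0, 0]
```

The pulse lands halfway between its two positions, which is exactly the
temporal midpoint the interpolated frame represents.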


Other Papers:

Bai et al. "Visual-weighted motion compensation frame interpolation with
motion vector refinement." 2012 IEEE International Symposium on Circuits
and Systems (ISCAS), 2012: 500-503.


Park et al. "Motion compensated frame rate up-conversion using modified
adaptive extended bilateral motion estimation." Journal of Automation and
Control Engineering, Vol. 2.4 (2014).


Tsai et al. "Frame rate up-conversion using adaptive bilateral motion
estimation." WSEAS International Conference Proceedings, Mathematics and
Computers in Science and Engineering, 2009.

Please share your thoughts on this.
Meanwhile I'm implementing fast ME methods (dia, hex, star) in mEstimate
and OBMC in mInterpolate.

