[FFmpeg-devel] avformat_seek_file in H265 video seeks beyond the max_ts passed in?
ahmadsharif at gmail.com
Mon Aug 12 23:24:53 EEST 2024
My understanding is that avformat_seek_file() with these parameters:
avformat_seek_file(format_context, 0, INT64_MIN, timestamp, timestamp, 0)
should seek in the video to an I-frame with pts <= timestamp
(because ts=timestamp and max_ts=timestamp).
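To make that expectation concrete, here is a small model of the contract as I read it from the documentation: the seek point should be the one closest to ts that still lies within [min_ts, max_ts]. This is only a sketch of the expected result, not FFmpeg's implementation; pick_seek_target and the keyframe array are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the documented contract (NOT FFmpeg's code):
 * among the keyframe timestamps that lie in [min_ts, max_ts], return the
 * index of the one closest to ts, or -1 if no keyframe is in the window.
 * On a tie, the earlier keyframe wins. */
static int pick_seek_target(const int64_t *keyframes, size_t n,
                            int64_t min_ts, int64_t ts, int64_t max_ts)
{
    int best = -1;
    uint64_t best_dist = UINT64_MAX;

    assert(min_ts <= max_ts);

    for (size_t i = 0; i < n; i++) {
        int64_t k = keyframes[i];
        if (k < min_ts || k > max_ts)
            continue; /* outside the allowed window */

        /* Unsigned subtraction avoids signed overflow on extreme values. */
        uint64_t dist = k > ts ? (uint64_t)k - (uint64_t)ts
                               : (uint64_t)ts - (uint64_t)k;
        if (dist < best_dist) {
            best_dist = dist;
            best = (int)i;
        }
    }
    return best;
}
```

With keyframes at 0, 2, 4, 6, 8 (in tenths of a second) and min_ts=INT64_MIN, ts=5, max_ts=5, this model picks index 2, the keyframe at 4. Under this reading, a result at 6 would violate max_ts.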
However, the behavior I observe is that for certain H265 videos,
FFmpeg seeks beyond the timestamp passed in. To reproduce this behavior I ran
these commands:
# Create a clean conda environment for testing purposes
conda create --name test
conda activate test
conda install -c conda-forge x265
# Install some build pre-requisites
conda install pkg-config
# Build ffmpeg from source with x265 enabled
git clone https://github.com/FFmpeg/FFmpeg.git
cd FFmpeg
./configure --enable-nonfree --enable-gpl --prefix=$(readlink -f ../bin) \
  --enable-libx265 --enable-rpath \
  --extra-ldflags=-Wl,-rpath=$CONDA_PREFIX/lib --enable-filter=drawtext \
  --enable-libfontconfig --enable-libfreetype --enable-libharfbuzz
make -j install
# Now generate a video with just frame numbers in the text per frame:
ffmpeg -f lavfi -i color=size=128x128:duration=1:rate=10:color=blue -vf
%{frame_num}'" -vcodec libx265 -pix_fmt yuv420p -g 2 -crf 10 test.mp4 -y
Note that this video has 10 frames. ffprobe shows the following:
ffprobe -v error -select_streams v:0 -show_entries \
  frame=pts,pts_time,duration,pkt_pts_time,pkt_duration,key_frame -of csv test.mp4
Now, when I open this video using FFmpeg as a library, I get an
AVFormatContext. I want to decode the frame with pts=0.5. So I call
avformat_seek_file with min_ts=INT64_MIN, ts=0.5 and max_ts=0.5.
I expect FFmpeg to seek to the frame with pts=0.4 so that I can then
decode forward and eventually get the frame with pts=0.5 from
avcodec_receive_frame(), but the first frame that I get from
avcodec_receive_frame() is the one with pts=0.6.
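One detail worth spelling out: the values 0.4, 0.5 and 0.6 above are in seconds, while avformat_seek_file() with stream_index=0 takes timestamps in that stream's time_base units. The sketch below shows the conversion and the overshoot check in plain C. Rational is a hypothetical stand-in for AVRational, ms_to_ticks and seek_overshot are illustrative helper names, and the 1/10240 time base used in the note after the code is an assumption about this file, not something verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FFmpeg's AVRational: one tick = num/den seconds. */
typedef struct {
    int num;
    int den;
} Rational;

/* Convert a non-negative time in milliseconds to ticks of time base tb,
 * rounding to the nearest tick. */
static int64_t ms_to_ticks(int64_t ms, Rational tb)
{
    assert(ms >= 0 && tb.num > 0 && tb.den > 0);
    int64_t divisor = 1000 * (int64_t)tb.num;
    return (ms * tb.den + divisor / 2) / divisor;
}

/* Returns 1 if the first frame obtained after the seek lies beyond max_ts,
 * i.e. the seek did not honor the upper bound; 0 otherwise. */
static int seek_overshot(int64_t first_frame_pts, int64_t max_ts)
{
    return first_frame_pts > max_ts;
}
```

Assuming a time base of 1/10240, the target of 0.5 s becomes 5120 ticks. A first frame at 0.4 s (4096 ticks) passes the check, while the first frame I actually get, at 0.6 s (6144 ticks), fails it.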
More context:
I am writing a library that wraps FFmpeg and returns frames at arbitrary
timestamps. The full source code of the library is here:
https://github.com/pytorch/torchcodec. The pull-request that reproduces
this exact scenario is here: https://github.com/pytorch/torchcodec/pull/178.
It would be nice if FFmpeg always sought to a frame with pts <= the max_ts
passed into avformat_seek_file. This normally does work with other codecs.
Am I calling the library wrong? Should I be calling avformat_seek_file()
with other flags? The documentation of avformat_seek_file is here:
Here is the seek call in my code:
I would be happy to file a ticket as well, if that helps. The full repro
instructions are in this email for reference.