[FFmpeg-user] optimal exact segmentation without re-encoding

Thu Jul 6 05:50:39 EEST 2023

On Wed, Jul 5, 2023 at 22:17, David Bernat (<david.bernat at gmail.com>)
wrote:

> On Wed, Jul 5, 2023 at 9:39 PM Carl Zwanzig <cpz at tuunq.com> wrote:
>
> > On 7/5/2023 3:29 PM, David Bernat wrote:
> > > Premise: segment a mov file into about one second segments without
> > > re-encoding; yet preserving concatenation; such that the segmentation is
> > > embarrassingly parallel, for high-speed segmenting.
> >
> > What is the purpose of these segmented files (how will they be used)? Are
> > they going to be processed and then assembled together (which may involve
> > decode/recode operations)? Depending on the unknown intermediate work, it
> > may be faster overall to decode all into an uncompressed state and work on
> > that before concatenating into a re-encode pass (which could also be done
> > in segments which are then themselves concatenated).
> >
>
> The parallelization of processing is an essential motivation. (Thank you;
> I am looking forward to advancing this discussion and solution with you.)
>
> Multiple uses but may we focus on two:
>
> 1. computer vision and thumbnails:
>
> Object detection runs quickly on images; and, in most cases, it is
> sufficient to summarize a long video by applying object detection every 1
> or 2 seconds.
> Extracting a frame using -ss is very fast but not immediate: my speed tests
> indicate that a CPU can achieve about 60 image extractions in about 15
> seconds.
> ffmpeg -ss can be embarrassingly parallelized and does speed up from about
> 18 seconds to 15 seconds.
> The precise timestamp of the image is not essential (within a few frames is
> more than sufficient) and so ffmpeg -ss is sufficient.
> Accelerating this even further is a wonderful benefit.
> Furthermore, if the video is already segmented, each segment can be
> handled separately on a different cloud unit or CPU.
> This is the key motivation for a storage format that is segmented.
> Notice that this use does not strictly require that recomposition is
> achievable.
>
> 2. storage and generic processing:
>
> This adds the additional requirement that recomposition is achievable.
> Numerous use cases apply the parallelization scheme described above.
> It would be hugely beneficial if the segments stored for the use above
> also constituted the full storage of the video.
> In an extreme case, you can imagine a video player in which the one-second
> video clips also serve as a makeshift streaming solution.
> Whether or not this is the best example of that use case, the concept is
> sufficient, and additional use cases are welcome.
>
>
> > Also, please post some of the commands you've used.
> >
>
> Here is an overview of the processes I am using at the moment:
>
> *This series of commands identifies the KEYFRAMES.*
>
> # cmd = "ffprobe -loglevel error -select_streams v:0 -show_entries
> packet=pts_time,flags -of csv=print_section=0 IMG_9209.MOV | awk -F','
> '/K/ {print $1}'"
> # result = subprocess.run(cmd, shell=True, cwd=os.getcwd(),
> capture_output=True)
>
> # key_frames = [float(t) for t in result.stdout.decode().split()] ==>
> [0.000000, 1.068333, 2.135000, 3.211667, 4.295000]
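
The awk filter can also be done in Python, which avoids shell=True. A minimal sketch, assuming the ffprobe CSV output has one "pts_time,flags" pair per line as produced by the command above; parse_keyframe_times is a hypothetical helper name, not part of any library:

```python
def parse_keyframe_times(ffprobe_csv):
    """Return sorted keyframe timestamps (seconds) from ffprobe CSV packet output."""
    times = []
    for line in ffprobe_csv.splitlines():
        fields = line.strip().split(",")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pts_time, flags = fields[0], fields[1]
        # "K" in the flags column marks a keyframe packet
        if "K" in flags and pts_time != "N/A":
            times.append(float(pts_time))
    return sorted(times)


if __name__ == "__main__":
    sample = "0.000000,K_\n-0.066667,_D\n0.033333,__\n1.068333,K_\n"
    print(parse_keyframe_times(sample))
```

The string passed in would be result.stdout.decode() from an ffprobe call without the awk stage.
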
>
> This command creates SEGMENTs from KEYFRAME to KEYFRAME but is not
> precisely exact, and hence fails the concatenation requirement.
>
> # cmd = f"ffmpeg -y -ss {key_frames[i]} -to {key_frames[i+1]} -i
> IMG_9209.MOV -c copy segment_{i}.mov"
>
> # The output of SEEK will have ffprobe results at timestamps like this
> (with K being the keyframe flag and D the discard flag, which here marks
> the negative timestamps):
> # [0.000000,K_ -0.066667,_D -0.100000,_D -0.033333,_D 0.135000,__
> 0.066667,__ 0.033333,__ 0.100000,__ 0.285000,__ ...
> # ... 0.826667,__ 0.910000,__ 1.076667,K_ 1.035000,__ 0.993333,__
> 1.243333,__ 1.160000,__]
>
> # Notice that 3.211667 - 2.135000 = 1.076667, as this is segment_2.mov
> (key_frames[2] to key_frames[3]); and timestamps are reset to start at zero.
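
For checking the segment lengths, here is a small sketch that derives them from the keyframe list quoted earlier. It assumes segment i runs from key_frames[i] to key_frames[i+1], as in the -ss/-to command; segment_durations is a hypothetical helper name:

```python
def segment_durations(key_frames):
    """Return the duration in seconds of each keyframe-to-keyframe segment."""
    return [
        round(end - start, 6)
        for start, end in zip(key_frames, key_frames[1:])
    ]


key_frames = [0.000000, 1.068333, 2.135000, 3.211667, 4.295000]
durations = segment_durations(key_frames)

for i, duration in enumerate(durations):
    print(f"segment_{i}.mov: {duration:.6f} s")
```

By this arithmetic the 1.076667 second duration belongs to the segment between 2.135000 and 3.211667.
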
>
> This command TRIMs each SEGMENT and does successfully fulfill the
> concatenation requirement; but is very, very slow (0.25x real time)
>
> # cmd = "ffmpeg -i segment_2.mov -vf trim=0:1.076667 segment_2_trim.mov"
> #
> # The output of TRIM takes 4-5 seconds despite being a 1.07 second
> clip: re-encoding is always slow. But, it is correct.
> # [0.000000,K_ 0.033333,__ 0.016667,__ 0.008333,__ 0.025000,__
> 0.066667,__ 0.050000,__ 0.041667,__ ...
> # ... 1.016667,__ 1.058333,__ 1.041667,__ 1.033333,__ 1.050000,__]
>
> This python code launches CONCAT:
>
> # import os, subprocess, tempfile, time
> #
> # s = time.time()
> # tmp = tempfile.NamedTemporaryFile(delete=False)
> # # relative paths in the list are resolved against the list file's directory
> # relpath = os.path.relpath(os.getcwd(), os.path.dirname(tmp.name))
> # with open(tmp.name, "w") as f:
> #     for i in range(28):
> #         f.write(f"file '{relpath}/segment_{i}.mov'\n")
> # cmd = ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", tmp.name,
> #        "-c", "copy", "concatenated.mov"]
> # subprocess.run(cmd, shell=False, cwd=os.getcwd())
> # print(time.time()-s)
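
One way to sidestep the relative-path question is to write absolute paths into the concat list. A hedged sketch, assuming the segments already exist on disk; write_concat_list and concat_command are hypothetical helper names, and the quoting follows the concat demuxer's single-quote escaping rule:

```python
import os


def write_concat_list(segment_paths, list_path):
    """Write a concat demuxer list file that uses absolute segment paths."""
    with open(list_path, "w", encoding="utf-8") as f:
        for path in segment_paths:
            absolute = os.path.abspath(path)
            # Inside single quotes, a literal ' is written as '\''
            escaped = absolute.replace("'", "'\\''")
            f.write(f"file '{escaped}'\n")


def concat_command(list_path, output_path):
    """Build the ffmpeg argument list for a stream-copy concat."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output_path]
```

The list returned by concat_command can be passed to subprocess.run directly, which also avoids the problems cmd.split() has with paths that contain spaces.
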
>
> [This next one does not work: it is intended to create keyframes every
> second. -force_key_frames only applies when encoding, so it has no effect
> with -c copy. This entire avenue may be unnecessary.]
>
> # cmd = f"ffmpeg -y -i IMG_9209.MOV -force_key_frames
> expr:gte(t,n_forced*1) -c copy keyframes.mov"
>
>
> Thank you.
> DB
>
>
>
>
Use ffmpeg's segment muxer options; looking for keyframes manually is very
inefficient:
https://ffmpeg.org/ffmpeg-formats.html#segment_002c-stream_005fsegment_002c-ssegment
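
As a rough illustration of what that could look like from Python, here is a sketch that builds a segment muxer command with stream copy. The output pattern, the one-second target, and the helper name segment_command are assumptions for illustration; the options used (-f segment, -segment_time, -reset_timestamps) are described in the documentation linked above:

```python
import shlex


def segment_command(input_path, pattern="segment_%03d.mov", seconds=1.0):
    """Build an ffmpeg argument list that splits input_path without re-encoding."""
    return [
        "ffmpeg", "-i", input_path,
        "-c", "copy",               # stream copy, no re-encoding
        "-f", "segment",            # use the segment muxer
        "-segment_time", str(seconds),
        "-reset_timestamps", "1",   # each segment starts near zero
        pattern,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run to execute it.
    print(shlex.join(segment_command("IMG_9209.MOV")))
```

With stream copy the muxer can only cut at keyframes, so with this file the segments should follow the existing keyframe spacing (about 1.07 seconds) rather than being exactly one second long.
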

>
>
>
> > z!
> > _______________________________________________
> > ffmpeg-user mailing list
> > ffmpeg-user at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".
> >