[FFmpeg-user] optimal exact segmentation without re-encoding

David Bernat david.bernat at gmail.com
Fri Jul 7 00:42:35 EEST 2023


On Thu, Jul 6, 2023 at 4:02 PM Eduardo Alarcón <ealarcong at gmail.com> wrote:

> On Thu, Jul 6, 2023 at 12:16, David Bernat (<david.bernat at gmail.com>)
> wrote:
>
> > At the risk of being rude, Eduardo Alarcón (and with me top-posting), please
> > do not thread-hijack.
> >
> How is this a thread hijack? You do remember this is a mailing list, right? I
> have done something similar to what you want to do, so I'm telling you what
> I did to help.
>
>
> > Carl Zwanzig and I are having a very careful conversation and I wish to
> > continue with him.
> > The use of the SEGMENT command requires re-encoding, in my experience, even
> > though each segment can be set to go from keyframe to keyframe.
> > If this is incorrect, I look to Carl Zwanzig for more information at this
> > time, commensurate with my detailed reply. Thank you both.
> >
>
> As per the example in the documentation, it doesn't require re-encoding:
>
> Segment the input file, and create an M3U8 live playlist (can be used as
> live HLS source):
> ffmpeg -re -i in.mkv -codec copy -map 0 -f segment -segment_list playlist.m3u8 -segment_list_flags +live -segment_time 10 out%03d.mkv
>
> You can use -codec copy and do not need to re-encode; ffmpeg will try to find
> the closest keyframe to the segment time you defined without re-encoding, as
> you asked.
>
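For the one-second case discussed in this thread, the same segment muxer call can be sketched from Python as below. This is only a sketch under assumptions: the output pattern `segment_%03d.mov` and the helper name `build_segment_cmd` are hypothetical, and `-reset_timestamps 1` is added here so each segment starts near zero; it is not part of the documentation example quoted above. With stream copy, cuts can only land on keyframes, so the actual segment lengths follow the keyframe spacing of the input.

```python
def build_segment_cmd(src, segment_time=1.0, pattern="segment_%03d.mov"):
    """Return an ffmpeg argv that splits `src` with the segment muxer.

    Streams are copied, not re-encoded. `pattern` is a hypothetical
    output name chosen for illustration.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-map", "0",                     # keep every stream of the input
        "-c", "copy",                    # stream copy: no re-encoding
        "-f", "segment",
        "-segment_time", str(segment_time),
        "-reset_timestamps", "1",        # assumption: restart timestamps per segment
        pattern,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; no shell quoting is needed because the arguments are passed as a list.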
>
You are correct, and I am wrong. This segmentation method does seem to
produce segmentation similar to my use of copy at key_frames, and is much
faster.

It does, however, still produce extra frames on concatenation. If this
problem is not immediately apparent, I will post code and the associated
warning messages later in the evening.
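
One way to put a number on "extra frames" is to count the video packets in the original file and in the concatenated result and compare. The following is a minimal sketch, assuming ffprobe is on the PATH; the helper names are hypothetical, and it counts packets of the first video stream, which for typical video matches the frame count but is not guaranteed to in every container.

```python
import subprocess

def build_count_cmd(path):
    """ffprobe argv that prints the packet count of the first video stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-count_packets",
        "-show_entries", "stream=nb_read_packets",
        "-of", "csv=p=0",
        path,
    ]

def parse_count(stdout):
    """Parse ffprobe's single-value CSV output into an int."""
    return int(stdout.strip().split(",")[0])

def count_video_packets(path):
    """Run ffprobe and return the packet count (requires ffprobe installed)."""
    result = subprocess.run(build_count_cmd(path), capture_output=True,
                            text=True, check=True)
    return parse_count(result.stdout)

def extra_packets(original_count, concatenated_count):
    """Positive result: the concatenated file carries additional packets."""
    return concatenated_count - original_count
```

For example, `extra_packets(count_video_packets("IMG_9209.MOV"), count_video_packets("concatenated.mov"))` should be zero if the segmentation and concatenation round-trip exactly.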

Thank you for your response. It was valuable, and not thread-hijacking.

>
> >
> > > On Wed, Jul 5, 2023 at 9:39 PM Carl Zwanzig <cpz at tuunq.com> wrote:
> > > >
> > > > > On 7/5/2023 3:29 PM, David Bernat wrote:
> > > > > > > Premise: segment a mov file into about one second segments without
> > > > > > > re-encoding; yet preserving concatenation; such that the segmentation
> > > > > > > is embarrassingly parallel, for high-speed segmenting.
> > > > >
> > > > > What is the purpose of these segmented files (how will they be
> used)?
> > > Are
> > > > > they going to be processed and then assembled together (which may
> > > involve
> > > > > decode/recode operations). Depending on the unknown intermediate
> > work,
> > > it
> > > > > may be faster overall to decode all into an uncompressed state and
> > work
> > > > on
> > > > > that before concatenating into a reencode pass (which could also be
> > > done
> > > > > in
> > > > > segments which are then themselves concatenated).
> > > > >
> > > >
> > > > > The parallelization of processing is an essential motivation. (Thank you,
> > > > > and I look forward to furthering this discussion and solution with you.)
> > > >
> > > > Multiple uses but may we focus on two:
> > > >
> > > > 1. computer vision and thumbnails:
> > > >
> > > > > Object detection runs quickly on images; and, in most cases, it is
> > > > > sufficient to summarize a long video by applying object detection every 1
> > > > > or 2 seconds.
> > > > > Extracting a frame using -ss is very fast but not immediate: my speed tests
> > > > > indicate that a CPU can achieve about 60 image extractions in about 15
> > > > > seconds.
> > > > > ffmpeg -ss can be embarrassingly parallelized and does speed up from about
> > > > > 18 seconds to 15 seconds.
> > > > > The precise timestamp of the image is not essential (within a few frames is
> > > > > more than sufficient) and so ffmpeg -ss is sufficient.
> > > > > Accelerating this even further is a wonderful benefit.
> > > > > Furthermore, if the video is already segmented, each segment can be
> > > > > handled separately on a different cloud unit or CPU.
> > > > > In this case, motivating a storage option that is segmented is the key.
> > > > > Notice that this method does not strictly require that recomposition is
> > > > > achievable.
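
The parallel `-ss` extraction described in item 1 can be sketched roughly as follows. This is an illustration under assumptions, not the code behind the timings quoted above: the thumbnail naming pattern, the worker count, and the helper names are all hypothetical. Threads are used because the real work happens in the ffmpeg child processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def sample_times(duration, step=1.0):
    """Timestamps 0, step, 2*step, ... that do not exceed `duration`."""
    n = int(duration // step) + 1
    return [round(i * step, 3) for i in range(n)]

def build_frame_cmd(src, t, out_pattern="thumb_{:06.1f}.jpg"):
    """ffmpeg argv grabbing one frame near time `t` (input-side seek)."""
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-ss", f"{t:.3f}",      # placed before -i: fast seek
        "-i", src,
        "-frames:v", "1",       # stop after a single video frame
        out_pattern.format(t),
    ]

def extract_all(src, duration, step=1.0, workers=8):
    """Run one ffmpeg process per timestamp; returns the exit codes."""
    cmds = [build_frame_cmd(src, t) for t in sample_times(duration, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, cmds))
```

Calling `extract_all("IMG_9209.MOV", duration=4.3)` would launch five extractions, one per second of video.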
> > > >
> > > > 2. storage and generic processing:
> > > >
> > > > > This adds the additional requirement that recomposition is achievable.
> > > > > Numerous use cases apply the parallelization scheme described above.
> > > > > It would be hugely beneficial if the segmented storage used for the
> > > > > above could also constitute the full storage of the video.
> > > > > In an extreme case, you can imagine a video player in which the
> > > > > one-second video clips also serve as a makeshift streaming solution.
> > > > > Whether or not this is the best example of that use case, the concept is
> > > > > sufficient, and additional use cases are welcome.
> > > >
> > > >
> > > > > Also, please post some of the commands you've used.
> > > > >
> > > >
> > > > Here is an overview of the processes I am using at the moment:
> > > >
> > > > *This series of commands identifies the KEYFRAMES.*
> > > >
> > > > > # cmd = "ffprobe -loglevel error -select_streams v:0 -show_entries packet=pts_time,flags -of csv=print_section=0 IMG_9209.MOV | awk -F',' '/K/ {print $1}'"
> > > > > # result = subprocess.run(cmd, shell=True, cwd=os.getcwd(), capture_output=True)
> > > > >
> > > > > # key_frames = [float(t) for t in result.stdout.decode().split()]
> > > > > # ==> [0.000000, 1.068333, 2.135000, 3.211667, 4.295000]
> > > >
> > > > > This command creates SEGMENTs from KEYFRAME to KEYFRAME but is not
> > > > > precisely exact, and hence fails the concatenation requirement.
> > > > >
> > > > > # cmd = f"ffmpeg -y -ss {key_frames[i]} -to {key_frames[i+1]} -i IMG_9209.MOV -c copy segment_{i}.mov"
> > > >
> > > > > # The output of SEEK will have ffprobe results at timestamps like this
> > > > > # (with K marking a keyframe and D marking a packet flagged for discard,
> > > > > # here the ones with negative timestamps):
> > > > > # [0.000000,K_ -0.066667,_D -0.100000,_D -0.033333,_D 0.135000,__ 0.066667,__ 0.033333,__ 0.100000,__ 0.285000,__ ...
> > > > > # ... 0.826667,__ 0.910000,__ 1.076667,K_ 1.035000,__ 0.993333,__ 1.243333,__ 1.160000,__]
> > > >
> > > > > # Notice that 3.211667 - 2.135000 = 1.076667, as this is segment_2.mov
> > > > > # (key_frames[2] to key_frames[3]), and timestamps are reset to start at zero.
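
The boundaries and expected duration of each keyframe-to-keyframe segment follow directly from the keyframe list. A small sketch, using the five keyframe times reported earlier in this message (the function name is hypothetical): for index 2 the pair is 2.135000 and 3.211667, a span of 1.076667 seconds.

```python
def segment_bounds(key_frames):
    """Return (index, start, end, duration) for consecutive keyframe pairs.

    The tail from the last keyframe to the end of the file is not covered,
    because its end time is not part of the keyframe list.
    """
    return [
        (i, start, end, round(end - start, 6))
        for i, (start, end) in enumerate(zip(key_frames, key_frames[1:]))
    ]

# Keyframe times as reported in the thread.
KEY_FRAMES = [0.000000, 1.068333, 2.135000, 3.211667, 4.295000]

for index, start, end, duration in segment_bounds(KEY_FRAMES):
    print(f"segment_{index}.mov: {start:.6f} -> {end:.6f} ({duration:.6f} s)")
```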
> > > >
> > > > > This command TRIMs each SEGMENT and does successfully fulfill the
> > > > > concatenation requirement; but is very, very slow (0.25x real time)
> > > > >
> > > > > # cmd = "ffmpeg -i segment_2.mov -vf trim=0:1.076667 segment_2_trim.mov"
> > > > > #
> > > > > # The output of TRIM takes 4-5 seconds despite being a 1.07 second
> > > > > # clip: re-encoding is always slow. But, it is correct.
> > > > > # [0.000000,K_ 0.033333,__ 0.016667,__ 0.008333,__ 0.025000,__ 0.066667,__ 0.050000,__ 0.041667,__ ...
> > > > > # ... 1.016667,__ 1.058333,__ 1.041667,__ 1.033333,__ 1.050000,__]
> > > >
> > > > > This python code launches CONCAT:
> > > > >
> > > > > # s = time.time()
> > > > > # tmp = tempfile.NamedTemporaryFile(delete=False)
> > > > > # relpath = os.path.relpath(os.getcwd(), tmp.name)
> > > > > # with open(tmp.name, "w") as f:
> > > > > #     [f.write(f"file '{relpath}/segment_{i}.mov'\n") for i in range(28)]
> > > > > # cmd = f"ffmpeg -y -f concat -safe 0 -i {tmp.name} -c copy concatenated.mov"
> > > > > # subprocess.run(cmd.split(), shell=False, cwd=os.getcwd())
> > > > > # print(time.time()-s)
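
A variant of the concat step that writes absolute paths into the list file may be easier to reason about, since the concat demuxer resolves relative entries against the directory of the list file rather than the working directory. This is a sketch under assumptions, not a drop-in replacement: the helper names are hypothetical, and the single-quote escaping follows ffmpeg's documented quoting rules.

```python
import os
import subprocess
import tempfile

def write_concat_list(segment_paths, list_path):
    """Write a concat-demuxer list file with one absolute path per line."""
    with open(list_path, "w") as f:
        for path in segment_paths:
            absolute = os.path.abspath(path)
            # A literal single quote must be written as '\'' inside quotes.
            escaped = absolute.replace("'", "'\\''")
            f.write(f"file '{escaped}'\n")
    return list_path

def build_concat_cmd(list_path, out="concatenated.mov"):
    """ffmpeg argv joining the listed segments with stream copy."""
    return [
        "ffmpeg", "-y",
        "-f", "concat",
        "-safe", "0",      # needed because the list holds absolute paths
        "-i", list_path,
        "-c", "copy",
        out,
    ]

def concat_segments(segment_paths, out="concatenated.mov"):
    """Write a temporary list file, run ffmpeg, and return its exit code."""
    with tempfile.TemporaryDirectory() as tmpdir:
        list_path = write_concat_list(segment_paths,
                                      os.path.join(tmpdir, "list.txt"))
        return subprocess.run(build_concat_cmd(list_path, out)).returncode
```

With the 28 segments from this thread, the call would be `concat_segments([f"segment_{i}.mov" for i in range(28)])`.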
> > > >
> > > > > [this next one does not work: it is intended to create keyframes every
> > > > > one second. though this entire avenue may be unnecessary.]
> > > > >
> > > > > # cmd = f"ffmpeg -y -i IMG_9209.MOV -force_key_frames expr:gte(t,n_forced*1) -c copy keyframes.mov"
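
A likely reason the command above has no effect is that `-force_key_frames` is applied by the encoder, and with `-c copy` no frames are encoded, so nothing can be turned into a keyframe. A sketch of a variant that does place keyframes once per second is below; note that it re-encodes the video once, which works against the no-re-encoding premise and would only make sense as a one-time preparation pass. The choice of `libx264` is an assumption (the ffmpeg build must include it), and the helper name is hypothetical.

```python
def build_force_keyframes_cmd(src, interval=1, out="keyframes.mov",
                              vcodec="libx264"):
    """ffmpeg argv that re-encodes video with a keyframe every `interval` s."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-force_key_frames", f"expr:gte(t,n_forced*{interval})",
        "-c:v", vcodec,    # assumption: an encoder is required for forced keyframes
        "-c:a", "copy",    # audio is passed through untouched
        out,
    ]
```

Because the arguments are passed as a list rather than through a shell, the parentheses in the expression need no quoting.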
> > > >
> > > >
> > > > Thank you.
> > > > DB
> > > >
> > > >
> > >
> > > >
> > > >
> > > >
> > > > > z!
> > > > > _______________________________________________
> > > > > ffmpeg-user mailing list
> > > > > ffmpeg-user at ffmpeg.org
> > > > > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> > > > >
> > > > > To unsubscribe, visit link above, or email
> > > > > ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".
> > > > >

