[FFmpeg-user] The concat filter and duplicate frames from prores files

Nick Ludlam nick at recoil.org
Mon Aug 6 00:08:59 EEST 2018


Hi all,
I’m seeing some puzzling behaviour when joining a set of ProRes QuickTime files with the concat filter and encoding the result down to an MP4.

QuickTime files produced by the video editing software we’re using cannot be concatenated without producing duplicate frames. In a reduced case, I can reproduce this by joining a video to itself three times: a duplicate frame is reliably inserted between the second and third sections.

If we use Adobe Media Encoder (AME) to “rewrap” the original ProRes files, they concatenate correctly with no dupes.

I’ve got a capture of the session at https://gist.github.com/nickludlam/5a8d43f7d54d5f0b626c7b6d0eca7756 and the duplicate frame is reported at line 133 of that capture, but I’m also going to paste it here for convenience.

Is there a likely culprit for this? Is the audio perhaps fractionally longer than the video, or are the timestamps causing the concat filter to behave this way? I would ultimately like to remove the dependency on AME from our pipeline, so I’m keen to understand how this is happening.
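
As a rough sanity check of the audio-versus-video idea, the per-input figures from the log below (142 video frames at 25 fps, 273408 audio samples at 48000 Hz) can be turned into durations. This is only arithmetic on the reported numbers, not a diagnosis; the variable names are made up for illustration.

```shell
# Figures copied from the per-input statistics in the log below.
video_frames=142       # "142 frames decoded"
fps=25
audio_samples=273408   # "267 frames decoded (273408 samples)"
sample_rate=48000

# Durations in microseconds (integer arithmetic; both divisions are exact).
video_us=$(( video_frames * 1000000 / fps ))
audio_us=$(( audio_samples * 1000000 / sample_rate ))

echo "video: ${video_us} us"
echo "audio: ${audio_us} us"
echo "audio longer by: $(( audio_us - video_us )) us"
```

The audio figure matches the "Segment finished at pts=5696000" line in the log, and the 16000 us difference is 0.4 of a 40000 us frame at 25 fps.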

I’ve started to use ffprobe to have a look at frames and packets, but without an idea of what to look for, it’s a bit difficult to make sense of the data.
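
One possible starting point with ffprobe is to compare the duration fields of the video and audio streams side by side. The sketch below assumes a hypothetical file name (input.mov) and a made-up helper, stream_summary; the ffprobe options themselves are standard ones.

```shell
# Print duration-related fields for one stream of one file.
# $1 = stream specifier (e.g. v:0 or a:0), $2 = path to the file.
stream_summary() {
  ffprobe -v error -select_streams "$1" \
    -show_entries stream=codec_type,time_base,duration,nb_frames \
    -of default=noprint_wrappers=1 "$2"
}

# Hypothetical file name; substitute one of the source .mov files.
f="input.mov"
if [ -f "$f" ]; then
  stream_summary v:0 "$f"   # first video stream
  stream_summary a:0 "$f"   # first audio stream
fi
```

If the two duration values differ by a fraction of a frame, that would be consistent with the audio-length suspicion, though it would not prove it on its own.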

Thanks,
Nick



$ ffmpeg -loglevel verbose \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-t 1 -f lavfi -i anullsrc=r=48000:cl=stereo -pix_fmt yuv420p -filter_complex "[0:v] [0:a] [1:v] [1:a] [2:v] [2:a] concat=n=3:v=1:a=1[v][a]" -map "[v]" -map "[a]"  -preset fast -c:v libx264 -b:v 2000k -c:a aac -b:a 96k /tmp/output.mp4

ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with Apple LLVM version 9.1.0 (clang-902.0.39.2)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0.2 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libfreetype --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    creation_time   : 2018-07-27T15:02:45.000000Z
  Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
    Stream #0:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      encoder         : Apple ProRes 422 HQ
      timecode        : 07:47:36:03
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
    Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      timecode        : 07:47:36:03
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    creation_time   : 2018-07-27T15:02:45.000000Z
  Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
    Stream #1:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      encoder         : Apple ProRes 422 HQ
      timecode        : 07:47:36:03
    Stream #1:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
    Stream #1:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      timecode        : 07:47:36:03
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    creation_time   : 2018-07-27T15:02:45.000000Z
  Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
    Stream #2:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      encoder         : Apple ProRes 422 HQ
      timecode        : 07:47:36:03
    Stream #2:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
    Stream #2:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      creation_time   : 2018-07-27T15:02:45.000000Z
      handler_name    : Core Media Data Handler
      timecode        : 07:47:36:03
[Parsed_anullsrc_0 @ 0x7f8f3b50ffc0] sample_rate:48000 channel_layout:'stereo' nb_samples:1024
Input #3, lavfi, from 'anullsrc=r=48000:cl=stereo':
  Duration: N/A, start: 0.000000, bitrate: 768 kb/s
    Stream #3:0: Audio: pcm_u8, 48000 Hz, stereo, u8, 768 kb/s
File '/tmp/output.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
  Stream #0:0 (prores) -> concat:in0:v0
  Stream #0:1 (aac) -> concat:in0:a0
  Stream #1:0 (prores) -> concat:in1:v0
  Stream #1:1 (aac) -> concat:in1:a0
  Stream #2:0 (prores) -> concat:in2:v0
  Stream #2:1 (aac) -> concat:in2:a0
  concat:out:v0 -> Stream #0:0 (libx264)
  concat:out:a0 -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[graph 0 input from stream 0:0 @ 0x7f8f3b613cc0] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_0_1 @ 0x7f8f3b614380] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[graph 0 input from stream 1:0 @ 0x7f8f3b614500] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_1_1 @ 0x7f8f3b614900] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[graph 0 input from stream 2:0 @ 0x7f8f3b614d40] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_2_1 @ 0x7f8f3b615100] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[auto_scaler_0 @ 0x7f8f3b616cc0] w:iw h:ih flags:'bilinear' interl:0
[format @ 0x7f8f3b616000] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_concat_0' and the filter 'format'
[auto_scaler_0 @ 0x7f8f3b616cc0] w:1080 h:1920 fmt:yuv422p10le sar:1/1 -> w:1080 h:1920 fmt:yuv420p sar:1/1 flags:0x2
[libx264 @ 0x7f8f3d01d600] using SAR=1/1
[libx264 @ 0x7f8f3d01d600] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7f8f3d01d600] profile High, level 4.0
[libx264 @ 0x7f8f3d01d600] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=30 rc=abr mbtree=1 bitrate=2000 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/tmp/output.mp4':
  Metadata:
    major_brand     : qt
    minor_version   : 0
    compatible_brands: qt
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264), 1 reference frame (avc1 / 0x31637661), yuv420p(progressive), 1080x1920 [SAR 1:1 DAR 9:16], q=-1--1, 2000 kb/s, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, delay 1024, 96 kb/s (default)
    Metadata:
      encoder         : Lavc58.18.100 aac
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in0:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in0:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=5696000
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in1:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in1:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=11392000
*** 1 dup!
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in2:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in2:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=17088000
No more output streams to write to, finishing.
frame=  427 fps= 42 q=-1.0 Lsize=    4550kB time=00:00:17.08 bitrate=2181.0kbits/s dup=1 drop=0 speed= 1.7x
video:4328kB audio:208kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.309756%
Input file #0 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
  Input stream #0:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
  Input stream #0:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
  Input stream #0:2 (data): 0 packets read (0 bytes);
  Total: 409 packets (133231695 bytes) demuxed
Input file #1 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
  Input stream #1:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
  Input stream #1:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
  Input stream #1:2 (data): 0 packets read (0 bytes);
  Total: 409 packets (133231695 bytes) demuxed
Input file #2 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
  Input stream #2:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
  Input stream #2:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
  Input stream #2:2 (data): 0 packets read (0 bytes);
  Total: 409 packets (133231695 bytes) demuxed
Input file #3 (anullsrc=r=48000:cl=stereo):
  Input stream #3:0 (audio): 0 packets read (0 bytes);
  Total: 0 packets (0 bytes) demuxed
Output file #0 (/tmp/output.mp4):
  Output stream #0:0 (video): 427 frames encoded; 427 packets muxed (4431401 bytes);
  Output stream #0:1 (audio): 801 frames encoded (820224 samples); 802 packets muxed (212903 bytes);
  Total: 1229 packets (4644304 bytes) muxed
[libx264 @ 0x7f8f3d01d600] frame I:9     Avg QP:24.44  size: 40665
[libx264 @ 0x7f8f3d01d600] frame P:111   Avg QP:27.69  size: 16127
[libx264 @ 0x7f8f3d01d600] frame B:307   Avg QP:28.80  size:  7409
[libx264 @ 0x7f8f3d01d600] consecutive B-frames:  3.5%  1.4%  1.4% 93.7%
[libx264 @ 0x7f8f3d01d600] mb I  I16..4: 33.7% 60.9%  5.4%
[libx264 @ 0x7f8f3d01d600] mb P  I16..4:  5.9%  8.2%  0.5%  P16..4: 32.3%  5.0%  3.0%  0.0%  0.0%    skip:45.1%
[libx264 @ 0x7f8f3d01d600] mb B  I16..4:  2.5%  5.7%  0.0%  B16..8: 16.9%  1.6%  0.0%  direct:12.2%  skip:61.0%  L0:44.6% L1:52.6% BI: 2.7%
[libx264 @ 0x7f8f3d01d600] final ratefactor: 26.28
[libx264 @ 0x7f8f3d01d600] 8x8 transform intra:63.6% inter:89.1%
[libx264 @ 0x7f8f3d01d600] coded y,uvDC,uvAC intra: 28.5% 50.8% 8.1% inter: 7.2% 11.5% 0.0%
[libx264 @ 0x7f8f3d01d600] i16 v,h,dc,p: 27% 23%  9% 42%
[libx264 @ 0x7f8f3d01d600] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 19% 28%  5%  5%  6%  6%  6%  4%
[libx264 @ 0x7f8f3d01d600] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 19% 16%  7%  8%  7%  9%  5%  3%
[libx264 @ 0x7f8f3d01d600] i8c dc,h,v,p: 60% 25% 11%  4%
[libx264 @ 0x7f8f3d01d600] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7f8f3d01d600] ref P L0: 70.5% 29.5%
[libx264 @ 0x7f8f3d01d600] ref B L0: 86.8% 13.2%
[libx264 @ 0x7f8f3d01d600] ref B L1: 96.3%  3.7%
[libx264 @ 0x7f8f3d01d600] kb/s:2075.27
[aac @ 0x7f8f3d03e600] Qavg: 523.747
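
The numbers in this log are at least consistent with one possible explanation, sketched below. It assumes (this is not confirmed by the log) that each segment's start offset, taken from the "Segment finished at pts=" values in microseconds, is rescaled to the 1/25 video time base with round-to-nearest, and that the constant-frame-rate output fills any empty frame slot with a duplicate.

```shell
fps=25
frames_per_segment=142   # video frames decoded per input, from the statistics above

next_free=0   # first output frame slot not yet occupied
gaps=0        # frame slots with no source frame to fill them
# Segment start offsets in microseconds: 0, then the "Segment finished at pts=" values.
for start_us in 0 5696000 11392000; do
  # Assumed behaviour: rescale to the 1/25 time base, rounding to nearest.
  start_frame=$(( (start_us * fps + 500000) / 1000000 ))
  gap=$(( start_frame - next_free ))
  echo "segment starting at ${start_us} us -> frame ${start_frame}, gap ${gap}"
  gaps=$(( gaps + gap ))
  next_free=$(( start_frame + frames_per_segment ))
done
echo "total frame slots: ${next_free}, slots with no source frame: ${gaps}"
```

Under those assumptions the second segment starts at frame 142 with no gap, the third starts at frame 285 leaving frame 284 empty, and the totals come to 427 slots with 1 unfilled, which lines up with "frame=  427" and "dup=1" in the log.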

