[FFmpeg-devel] [PATCH 3/3] lavc/pthread_frame: rework the logic for updating thread contexts
Michael Niedermayer
michael at niedermayer.cc
Thu Nov 14 21:37:56 EET 2024
On Wed, Nov 13, 2024 at 02:06:58PM +0100, Anton Khirnov wrote:
> Propagating decoder state between per-thread contexts with frame
> threading currently works as follows:
> 0) Every frame thread has its own "child" decoder context,
> 1) Frame thread T0 decodes the frame header and updates its context
> accordingly. At most one frame thread can be in this stage at any
> given time.
> 2) Frame thread T0 calls ff_thread_finish_setup() to indicate that
> header decoding is done.
> 3a) Frame thread T0 proceeds with decoding frame data.
> 3b) The main thread calls the decoder's update_thread_context()
> callback, transferring T0's state to the next thread T1.
>
> Since 3a) and 3b) run concurrently, during 3a) T0 must not write to any
> context variables accessed by update_thread_context(), otherwise a data
> race occurs. This approach turns out to be highly fragile in practice,
> as developers are either not aware of this constraint, or fail to keep
> it in mind while modifying decoders.
>
> This commit aims to eliminate the possibility of such races by changing
> the logic in the folowing way:
> * child decoders are no longer permanently bound to worker threads, but
> are instead assigned dynamically only while decoding a single frame; a
> different decoder may be assigned to the same thread for a later frame
> * with N frame threads, N+1 child decoder contexts are allocated
> (instead of N, as before), so at any time at least one decoder is
> idle (unassigned)
> * when a frame thread calls ff_thread_finish_setup(), its context state
> is immediately and synchronously transferred to the idle context
> * the idle context is then assigned to the next frame thread, whose
> previous decoder now becomes idle
>
> With this approach, improperly updating a decoder context after
> ff_thread_finish_setup() transforms from a race into a deterministic
> failure to propagate the relevant variables to following frame threads,
> which
> * should be much easier to debug
> * is no longer UB
> ---
> libavcodec/pthread_frame.c | 142 +++++++++++++++++++++----------------
> 1 file changed, 81 insertions(+), 61 deletions(-)
Segfaults:
[h264 @ 0xa213ac0] ==3430921== Thread 2 av:h264:df0:
==3430921== Invalid read of size 4
==3430921== at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666)
==3430921== by 0x1100CD3: h264_field_start (h264_slice.c:1472)
==3430921== by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153)
==3430921== by 0xA6AA24: decode_nal_units (h264dec.c:657)
==3430921== by 0xA6C03F: h264_decode_frame (h264dec.c:1070)
==3430921== by 0x99FC82: decode_simple_internal (decode.c:443)
==3430921== by 0x9A01E1: decode_simple_receive_frame (decode.c:613)
==3430921== by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649)
==3430921== by 0xC63BD4: frame_worker_thread (pthread_frame.c:317)
==3430921== by 0x692C608: start_thread (pthread_create.c:477)
==3430921== by 0x6A66352: clone (clone.S:95)
==3430921== Address 0x134 is not stack'd, malloc'd or (recently) free'd
==3430921==
==3430921==
==3430921== Process terminating with default action of signal 11 (SIGSEGV)
==3430921== Access not within mapped region at address 0x134
==3430921== at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666)
==3430921== by 0x1100CD3: h264_field_start (h264_slice.c:1472)
==3430921== by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153)
==3430921== by 0xA6AA24: decode_nal_units (h264dec.c:657)
==3430921== by 0xA6C03F: h264_decode_frame (h264dec.c:1070)
==3430921== by 0x99FC82: decode_simple_internal (decode.c:443)
==3430921== by 0x9A01E1: decode_simple_receive_frame (decode.c:613)
==3430921== by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649)
==3430921== by 0xC63BD4: frame_worker_thread (pthread_frame.c:317)
==3430921== by 0x692C608: start_thread (pthread_create.c:477)
==3430921== by 0x6A66352: clone (clone.S:95)
==3430921== If you believe this happened as a result of a stack
==3430921== overflow in your program's main thread (unlikely but
==3430921== possible), you can try to increase the size of the
==3430921== main thread stack using the --main-stacksize= flag.
==3430921== The main thread stack size used in this run was 8388608.
Testcase:
-threads 2 -i ~/tickets/2927/h264_dead.avi -threads 1 -f null -
(if sample is not on trac or server, i can provide it)
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20241114/b6e837b1/attachment.sig>
More information about the ffmpeg-devel
mailing list