[FFmpeg-devel] [PATCH v3 1/2] dxva: wait until D3D11 buffer copies are done before submitting them
Steve Lhomme
robux4 at ycbcr.xyz
Wed Aug 12 15:04:41 EEST 2020
On 2020-08-11 12:43, Steve Lhomme wrote:
>>> Sorry if you seem to know all the answers already, but I don't and so
>>> I have to
>>> investigate.
>>
>> Last year, I had literally worked this down to death. I followed every
>> slightest
>> hint from countless searches, read through hundreds of discussions,
>> driven
>> because I was unwilling to believe that up-/downloading of video
>> textures with
>> D3D11 can't be done equally fast as with D3D9.
>> (the big picture was the implementation of D3D11 support for QuickSync
>> where
>> the slowdown played a much bigger role than with D3D11VA decoders only).
>> Eventually I landed at some internal Nvidia presentation, some talks
>> with MS
>> guys and some source code discussion deep inside a 3D game engine (not a
>> no-name). It really bugs me that I didn't properly note the
>> references, but
>> from somewhere in between I was able to gather solid evidence about what
>> is legal to do and what is not. Based on that, followed several
>> iterations to
>> find the optimal way for doing the texture transfer. As I had implemented
>> D3D11 support for QuickSync, this got pretty complicated because with
>> a full transcoding pipeline, all parts (decoder, encoder and filters)
>> can (and
>> usually will) request textures. Only the latest Intel Drivers can work
>> with
>> array textures everywhere (e.g. VPP), so I also needed to add support for
>> non-array texture allocation. The patch you've seen is the result of
>> weeks
>> of intensive work (a small but crucial part of it) - even when it may not
>> look like that.
>>
>>
>>> Sorry if you seem to know all the answers already
>>
>> Obviously, I don't know all the answers, but all the answers I have given
>> were correct. And when I didn't have an answer I always respectfully
>> said that your situation might be different.
>> And I didn't reply by implying that you would have done your work
>> by trial-and-error or most likely invalid assumptions or deductions.
>>
>>
>> I still don't know how you are actually operating this and thus I also
>> cannot
>> tell what might or might not work in your case.
>> All I can tell is that the procedure that I have described (1-2-3-4) can
>> work rock-solid for multi-threaded DX11 texture transfer when it's
>> done in
>> the same way as I've shown.
>> And believe it or not - I would still be happy when it would be
>> of any use for you...
>
> Even though the discussion is heated (fitting with the weather here) I
> don't mind. I learned some stuff and it pushed me to dig deeper. I can't
> just accept your word for it. I need something solid if I'm going to
> remove a lock that helped me so far.
>
> So I'm currently tooling VLC to be able to bring the decoder to its
> knees and find out what it can and cannot do safely. So far I can still
> see decoding artifacts when I don't use a lock, which would mean I still
> need the mutex, for the reasons given in the previous mail.
A follow-up on this. Using ID3D10Multithread seems to be enough to have
mostly thread-safe ID3D11Device/ID3D11DeviceContext/etc. Even the
decoding with its odd API seems to know what to do when different
buffers are submitted.
I did not manage to saturate the GPU, but I got a much higher decoding
speed/throughput to validate the errors I got before. Many of them were
due to VLC dropping data because of odd timing.
Now I still have some threading issues. For example, for deinterlacing we
create an ID3D11VideoProcessor to handle the deinterlacing. And we create
it after decoding has started (as the deinterlacing can be
enabled/disabled dynamically). Without the mutex in the decoder it
crashes in ID3D11VideoDevice::CreateVideoProcessor() and
ID3D11VideoContext::SubmitDecoderBuffers() when they are called
simultaneously. If I add the mutex between the decoder and just this
filter (not the rendering side) it works fine.
So I guess I'm stuck with the mutex for the time being.
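To illustrate the arrangement described above, here is a minimal portable
sketch. It is not the VLC or libavcodec code: it uses pthreads instead of the
real locking primitives, and video_lock, enter_driver(), decoder_thread(),
filter_thread() and run_demo() are all hypothetical names. enter_driver() only
stands in for the two driver calls and records how many threads were inside it
at the same time.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical lock shared by the decoder and the deinterlace filter only. */
static pthread_mutex_t video_lock = PTHREAD_MUTEX_INITIALIZER;

static atomic_int in_driver = 0;      /* threads currently "inside the driver" */
static atomic_int max_in_driver = 0;  /* highest concurrency ever observed */

/* Stand-in for a driver entry point that must not be entered concurrently. */
static void enter_driver(void)
{
    int now = atomic_fetch_add(&in_driver, 1) + 1;
    int seen = atomic_load(&max_in_driver);

    /* Record the maximum; 'seen' is refreshed whenever the exchange fails. */
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_in_driver, &seen, now))
        ;
    atomic_fetch_sub(&in_driver, 1);
}

/* Decoder side: stands in for ID3D11VideoContext::SubmitDecoderBuffers(). */
static void *decoder_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        pthread_mutex_lock(&video_lock);
        enter_driver();
        pthread_mutex_unlock(&video_lock);
    }
    return NULL;
}

/* Filter side: stands in for ID3D11VideoDevice::CreateVideoProcessor(). */
static void *filter_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        pthread_mutex_lock(&video_lock);
        enter_driver();
        pthread_mutex_unlock(&video_lock);
    }
    return NULL;
}

/* Runs both threads; returns the maximum concurrency seen, or -1 on error. */
int run_demo(void)
{
    pthread_t dec, flt;

    if (pthread_create(&dec, NULL, decoder_thread, NULL) != 0)
        return -1;
    if (pthread_create(&flt, NULL, filter_thread, NULL) != 0) {
        pthread_join(dec, NULL);
        return -1;
    }
    pthread_join(dec, NULL);
    pthread_join(flt, NULL);
    return atomic_load(&max_in_driver);
}
```

With the lock held around every call, run_demo() reports that at most one
thread was ever inside enter_driver(). The rendering side is deliberately left
out of the sketch, matching the setup where only the decoder and this filter
share the mutex.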
Here is the stack trace on an Intel 630 GPU:
igd11dxva64.dll!00007ffc384a8d24() (Unknown Source:0)
igd11dxva64.dll!00007ffc38452030() (Unknown Source:0)
igd11dxva64.dll!00007ffc3845a081() (Unknown Source:0)
igd11dxva64.dll!00007ffc38465a27() (Unknown Source:0)
igd11dxva64.dll!00007ffc386067d2() (Unknown Source:0)
igd11dxva64.dll!00007ffc3883c9f3() (Unknown Source:0)
igd11dxva64.dll!00007ffc3867145a() (Unknown Source:0)
igd11dxva64.dll!00007ffc3866ea23() (Unknown Source:0)
igd11dxva64.dll!00007ffc3881b4ac() (Unknown Source:0)
igd11dxva64.dll!00007ffc384f7bdc() (Unknown Source:0)
igd11dxva64.dll!00007ffc384fa2a5() (Unknown Source:0)
igd11dxva64.dll!00007ffc3847a334() (Unknown Source:0)
d3d11.dll!00007ffcabc33e8d() (Unknown Source:0)
d3d11.dll!00007ffcabc3389d() (Unknown Source:0)
d3d11_3SDKLayers.dll!00007ffc3184fa6b() (Unknown Source:0)
calling ID3D11VideoContext::SubmitDecoderBuffers()
libavcodec_plugin.dll!ff_dxva2_common_end_frame(AVCodecContext * avctx,
AVFrame * frame, const void * pp, unsigned int pp_size, const void * qm,
unsigned int qm_size, int(*)(AVCodecContext *, void *, void *)
commit_bs_si) Line 1085
(c:\Users\robux\Documents\Programs\Videolabs\build\win64\contrib\contrib-win64\ffmpeg\libavcodec\dxva2.c:1085)
libavcodec_plugin.dll!dxva2_h264_end_frame(AVCodecContext * avctx) Line
507
(c:\Users\robux\Documents\Programs\Videolabs\build\win64\contrib\contrib-win64\ffmpeg\libavcodec\dxva2_h264.c:507)
libavcodec_plugin.dll!ff_h264_field_end(H264Context * h,
H264SliceContext * sl, int in_setup) Line 171
(c:\Users\robux\Documents\Programs\Videolabs\build\win64\contrib\contrib-win64\ffmpeg\libavcodec\h264_picture.c:171)
libavcodec_plugin.dll!h264_decode_frame(AVCodecContext * avctx, void *
data, int * got_frame, AVPacket * avpkt) Line 1015
(c:\Users\robux\Documents\Programs\Videolabs\build\win64\contrib\contrib-win64\ffmpeg\libavcodec\h264dec.c:1015)
libavcodec_plugin.dll!decode_simple_internal(AVCodecContext * avctx,
AVFrame * frame) Line 432
(c:\Users\robux\Documents\Programs\Videolabs\build\win64\contrib\contrib-win64\ffmpeg\libavcodec\decode.c:432)
win32u.dll!00007ffcb0054784() (Unknown Source:0)
gdi32.dll!00007ffcb1e03860() (Unknown Source:0)
d3d11.dll!00007ffcabc756ee() (Unknown Source:0)
d3d11.dll!00007ffcabc5c811() (Unknown Source:0)
igd11dxva64.dll!00007ffc385c5043() (Unknown Source:0)
igd11dxva64.dll!00007ffc384abaa5() (Unknown Source:0)
igd11dxva64.dll!00007ffc384ab7ab() (Unknown Source:0)
igd11dxva64.dll!00007ffc38453b27() (Unknown Source:0)
igd11dxva64.dll!00007ffc384611e6() (Unknown Source:0)
igd11dxva64.dll!00007ffc385cca30() (Unknown Source:0)
igd11dxva64.dll!00007ffc384bb303() (Unknown Source:0)
igd11dxva64.dll!00007ffc3847ccff() (Unknown Source:0)
d3d11.dll!00007ffcabc3e661() (Unknown Source:0)
d3d11.dll!00007ffcabc3d39f() (Unknown Source:0)
d3d11.dll!00007ffcabc3d0cd() (Unknown Source:0)
d3d11.dll!00007ffcabc68a46() (Unknown Source:0)
d3d11.dll!00007ffcabc5955d() (Unknown Source:0)
d3d11_3SDKLayers.dll!00007ffc318a263c() (Unknown Source:0)
d3d11_3SDKLayers.dll!00007ffc3189479a() (Unknown Source:0)
d3d11_3SDKLayers.dll!00007ffc3184e749() (Unknown Source:0)
d3d11.dll!00007ffcabc59d0c() (Unknown Source:0)
d3d11.dll!00007ffcabc3c606() (Unknown Source:0)
d3d11_3SDKLayers.dll!00007ffc3187dd0e() (Unknown Source:0)
calling ID3D11VideoDevice::CreateVideoProcessor()
libdirect3d11_filters_plugin.dll!D3D11OpenDeinterlace(vlc_object_t *
obj) Line 297
(c:\Users\robux\Documents\Programs\Videolabs\vlc\modules\hw\d3d11\d3d11_deinterlace.c:297)
libvlccore.dll!generic_start(void * func, bool forced, char * ap) Line
294
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\modules\modules.c:294)
libvlccore.dll!module_load(vlc_logger * log, module_t * m, int(*)(void
*, bool, char *) init, bool forced, char * args) Line 212
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\modules\modules.c:212)
libvlccore.dll!vlc_module_load(vlc_logger * log, const char *
capability, const char * name, bool strict, int(*)(void *, bool, char *)
probe, ...) Line 265
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\modules\modules.c:265)
libvlccore.dll!module_need(vlc_object_t * obj, const char * cap, const
char * name, bool strict) Line 305
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\modules\modules.c:305)
libvlccore.dll!filter_chain_AppendInner(filter_chain_t * chain, const
char * name, const char * capability, config_chain_t * cfg, const
es_format_t * fmt_out) Line 254
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\misc\filter_chain.c:254)
libvlccore.dll!filter_chain_AppendFilter(filter_chain_t * chain, const
char * name, config_chain_t * cfg, const es_format_t * fmt_out) Line 299
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\misc\filter_chain.c:299)
libvlccore.dll!ThreadChangeFilters(vout_thread_sys_t * vout, const char
* filters, const bool * new_deinterlace, bool is_locked) Line 992
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\video_output\video_output.c:992)
libvlccore.dll!Thread(void * object) Line 1891
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\video_output\video_output.c:1891)
libvlccore.dll!vlc_entry(void * p) Line 360
(c:\Users\robux\Documents\Programs\Videolabs\vlc\src\win32\thread.c:360)
msvcrt.dll!00007ffcb139af5a() (Unknown Source:0)
msvcrt.dll!00007ffcb139b02c() (Unknown Source:0)
kernel32.dll!00007ffcb21d6fd4() (Unknown Source:0)
ntdll.dll!00007ffcb23bcec1() (Unknown Source:0)