[FFmpeg-devel] [PATCH] Added QSV based VPP filter - second try
Ivan Uskov
ivan.uskov at nablet.com
Thu Nov 5 16:35:33 CET 2015
Hello wm4,
Thursday, November 5, 2015, 5:07:08 PM, you wrote:
>>
>> >> > > + } else if (ret == MFX_WRN_DEVICE_BUSY) {
>> >> > > + av_usleep(500);
>> >> >
>> >> > What. Use proper event-based waiting.
>> It is not possible.
>> >>
>> >> That's the same behavior as we have in the qsv encoder and decoder.
>> >> And as far as I know this is how Intel recommends to handle this.
>>
>> w> That's just ridiculous. Can you send some hate-mail to Intel and tell
>> w> them what a bad idea this is? Half a millisecond is an eternity for a
>> w> CPU. What if the device is blocked only for 10 microseconds? Then it
>> w> will waste time by spending 490 microseconds idly.
>> 1. Please remember we use GPU, not CPU.
w> That makes it even worse, because the CPU could literally be entirely
w> idle.
Not necessarily.
The following scenarios are possible:
1. Transcoding is executed entirely by QSV components. In this case the CPU is
almost always idle, so this is not an issue at all. This is the most likely
scenario in which we can theoretically get MFX_WRN_DEVICE_BUSY, and here the
CPU load does not matter.
2. SW components such as an encoder or decoder work together with QSV
components. In this case the GPU may be busy while the CPU is still
running a thread pool (inside the SW encoder, for example).
>> 2. 500us means that even if we get MFX_WRN_DEVICE_BUSY on every frame, we
>> will still be able to achieve ~2000fps. That looks like a sufficient
>> performance level for any practical application.
w> Only if all other CPU processing takes 0 microseconds.
There can be other threads that will be very happy if we sleep while the GPU
is busy. Also, we will never get MFX_WRN_DEVICE_BUSY on every frame.
I just want to point out that the delay has little impact on real performance,
which is usually much lower than 2000fps.
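To make the ~2000fps figure concrete, here is a rough back-of-the-envelope
sketch. It assumes, pessimistically, that every frame hits
MFX_WRN_DEVICE_BUSY exactly once and sleeps for the full 500us. The helper
name and the 2000us per-frame cost used below are hypothetical, chosen only
for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of FFmpeg or the Media SDK.
 * Upper bound on frames per second when each frame costs work_us
 * microseconds of processing plus one fixed sleep of sleep_us.
 */
static double fps_upper_bound(double work_us, double sleep_us)
{
    assert(work_us >= 0.0 && sleep_us >= 0.0 && work_us + sleep_us > 0.0);
    return 1e6 / (work_us + sleep_us);
}
```

With work_us = 0 and sleep_us = 500 this gives 2000fps, the figure quoted
above. With an assumed 2000us of other per-frame work, the bound is 400fps
with the sleep on every frame versus 500fps without it.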
>> 3. In real life MFX_WRN_DEVICE_BUSY appears when the GPU is really busy with
>> other tasks. So nothing bad will happen if one thread/process sleeps for
>> 500us to let another thread complete its work.
>>
>> w> Software engineers recognized that polling is a bad idea half a century
>> w> ago. Why can't Intel do this right?
>> Maybe because it is complex to organize event-polling when the calculations
>> are performed on the GPU?
w> Even just making the call blocking would be 1. easier, 2. more
w> efficient (because it will idle only as long as needed).
I believe Intel had serious reasons not to implement blocking here.
General processing in QSV is asynchronous and has nice functions to
wait for completion of encoding/decoding/processing.
If Intel made MFX_WRN_DEVICE_BUSY an immediate return code without
event handling and has kept it as is across 16 library releases, then
there is a reason for it.
For example, there could be a small penalty during general processing that
would add visible overhead at hundreds of frames per second.
In any case, we do not have the ability to change this API.
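For reference, the pattern under discussion (retry the call while the device
reports busy, sleeping a fixed interval between attempts) can be sketched
roughly as below. This is not the patch's code: the status enum, the
callbacks and the retry cap are hypothetical stand-ins so that the example
compiles without the Media SDK headers. In the patch the busy status is
MFX_WRN_DEVICE_BUSY and the sleep is av_usleep(500).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SDK status codes; not the real mfxStatus. */
typedef enum { DEMO_ERROR = -1, DEMO_OK = 0, DEMO_DEVICE_BUSY = 1 } demo_status;

typedef demo_status (*submit_fn)(void *ctx);   /* submits one frame      */
typedef void (*sleep_fn)(unsigned usec);       /* e.g. wraps av_usleep() */

/*
 * Call submit() until it stops reporting DEMO_DEVICE_BUSY or until
 * max_retries sleeps have been spent. Returns the last status and, if
 * slept_us is non-NULL, stores the total time requested for sleeping.
 */
static demo_status submit_with_retry(submit_fn submit, void *ctx,
                                     sleep_fn do_sleep, unsigned sleep_us,
                                     unsigned max_retries, unsigned *slept_us)
{
    demo_status ret;
    unsigned tries = 0, total = 0;

    assert(submit != NULL && do_sleep != NULL);
    for (;;) {
        ret = submit(ctx);
        if (ret != DEMO_DEVICE_BUSY || tries >= max_retries)
            break;
        do_sleep(sleep_us);      /* fixed-interval polling, as in the patch */
        total += sleep_us;
        tries++;
    }
    if (slept_us)
        *slept_us = total;
    return ret;
}

/* Fake device for experiments: reports busy for the first busy_calls calls. */
typedef struct { unsigned busy_calls; } fake_device;

static demo_status fake_submit(void *ctx)
{
    fake_device *dev = ctx;
    if (dev->busy_calls > 0) {
        dev->busy_calls--;
        return DEMO_DEVICE_BUSY;
    }
    return DEMO_OK;
}

/* No-op sleep so the sketch can be exercised without real delays. */
static void fake_sleep(unsigned usec) { (void)usec; }
```

The retry cap is an addition of this sketch, so that a permanently busy
device cannot hang the caller. The loop quoted at the top of this mail has
no such limit.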
--
Best regards,
Ivan mailto:ivan.uskov at nablet.com