[FFmpeg-devel] GSoC
Mark Thompson
sw at jkqxz.net
Sun Mar 11 13:18:36 EET 2018
On 11/03/18 04:36, Dylan Fernando wrote:
> On Thu, Mar 8, 2018 at 8:57 AM, Mark Thompson <sw at jkqxz.net> wrote:
>
>> On 07/03/18 03:56, Dylan Fernando wrote:
>>> Thanks, it works now
>>>
>>> Would trying to implement an OpenCL version of vf_fade be a good idea
>>> for a qualification task, or would it be a better idea to try a
>>> different filter?
>>
>> That sounds like a sensible choice to me, though if you haven't written a
>> filter before you might find it helpful to write something simpler first to
>> understand how it fits together (for example: vflip, which has trivial
>> processing parts but still needs the surrounding boilerplate).
>>
>> - Mark
>>
>> (PS: be aware that top-posting is generally frowned upon on this mailing
>> list.)
>>
>>
>>> On Wed, Mar 7, 2018 at 1:20 AM, Mark Thompson <sw at jkqxz.net> wrote:
>>>
>>>> On 06/03/18 12:37, Dylan Fernando wrote:
>>>>> Hi,
>>>>>
>>>>> I am Dylan Fernando. I am a Computer Science student from Australia.
>>>>> I am new to FFmpeg and I wish to apply for GSoC this year.
>>>>> I would like to do the Video filtering with OpenCL project and I have
>>>>> a few questions. Would trying to implement an OpenCL version of
>>>>> vf_fade be a good idea for the qualification task, or would I be
>>>>> better off using a different filter?
>>>>>
>>>>> Also, I’m having a bit of trouble with running unsharp_opencl. I tried
>>>>> running:
>>>>> ffmpeg -hide_banner -nostats -v verbose -init_hw_device opencl=ocl:0.1
>>>>> -filter_hw_device ocl -i space.mpg -filter_complex unsharp_opencl
>>>>> output.mp4
>>>>>
>>>>> but I got the error:
>>>>> [AVHWDeviceContext @ 0x7fdac050c700] 0.1: Apple / Intel(R) Iris(TM)
>>>>> Graphics 6100
>>>>> [mpeg @ 0x7fdac3132600] max_analyze_duration 5000000 reached at 5005000
>>>>> microseconds st:0
>>>>> Input #0, mpeg, from 'space.mpg':
>>>>> Duration: 00:00:21.99, start: 0.387500, bitrate: 6108 kb/s
>>>>> Stream #0:0[0x1e0]: Video: mpeg2video (Main), 1 reference frame,
>>>>> yuv420p(tv, bt470bg, bottom first, left), 720x480 [SAR 8:9 DAR 4:3],
>>>>> 6000 kb/s, 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
>>>>> Stream mapping:
>>>>> Stream #0:0 (mpeg2video) -> unsharp_opencl
>>>>> unsharp_opencl -> Stream #0:0 (mpeg4)
>>>>> Press [q] to stop, [?] for help
>>>>> [graph 0 input from stream 0:0 @ 0x7fdac0418800] w:720 h:480
>>>>> pixfmt:yuv420p tb:1/90000 fr:30000/1001 sar:8/9 sws_param:flags=2
>>>>> [auto_scaler_0 @ 0x7fdac05232c0] w:iw h:ih flags:'bilinear' interl:0
>>>>> [Parsed_unsharp_opencl_0 @ 0x7fdac0715a80] auto-inserting filter
>>>>> 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0'
>>>>> and the filter 'Parsed_unsharp_opencl_0'
>>>>> Impossible to convert between the formats supported by the filter
>>>>> 'graph 0 input from stream 0:0' and the filter 'auto_scaler_0'
>>>>> Error reinitializing filters!
>>>>> Failed to inject frame into filter network: Function not implemented
>>>>> Error while processing the decoded data for stream #0:0
>>>>> Conversion failed!
>>>>>
>>>>> How do I correctly run unsharp_opencl? Should I be running it on a
>>>>> different video file?
>>>>
>>>> It's intended to be used in filter graphs where much of the activity is
>>>> already happening on the GPU, so the input and output are in the
>>>> AV_PIX_FMT_OPENCL format which contains GPU-side OpenCL images.
>>>>
>>>> If you want to use it standalone then you need hwupload and hwdownload
>>>> filters to move the frames between the CPU and GPU. For your example,
>>>> it should work with:
>>>>
>>>> ffmpeg -init_hw_device opencl=ocl:0.1 -filter_hw_device ocl -i space.mpg
>>>> -filter_complex hwupload,unsharp_opencl,hwdownload output.mp4
>>>>
>>>> (There are constraints on what formats can be used and therefore
>>>> suitable files (or required format conversions), but I believe a
>>>> normal yuv420p video like this should work in all cases.)
>>>>
>>>> - Mark
>>
>
> Thanks.
>
> How is AV_PIX_FMT_OPENCL formatted? When using read_imagef(), does xyzw
> correspond to RGBA respectively, or to YUV? Would I have to account for
> different formats? If so, how do I check the format of the input?
See libavutil/hwcontext_opencl.c and in particular the functions opencl_get_buffer(), opencl_pool_alloc() and opencl_get_plane_format() for the code creating the AV_PIX_FMT_OPENCL images.
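In practical terms for a filter (a rough sketch, not code from the tree - the helper name here is made up): each entry of frame->data[] on an AV_PIX_FMT_OPENCL frame is a cl_mem handle referring to one plane as an image2d_t, so you can query the plane images directly:

#include <stdio.h>
#include <CL/cl.h>
#include "libavutil/frame.h"

/* Sketch: for AV_PIX_FMT_OPENCL frames, each entry of frame->data[]
 * is a cl_mem handle referring to one plane as an image2d_t. */
static void print_plane_sizes(const AVFrame *frame)
{
    for (int p = 0; p < AV_NUM_DATA_POINTERS && frame->data[p]; p++) {
        cl_mem image = (cl_mem)frame->data[p];
        size_t w = 0, h = 0;
        clGetImageInfo(image, CL_IMAGE_WIDTH,  sizeof(w), &w, NULL);
        clGetImageInfo(image, CL_IMAGE_HEIGHT, sizeof(h), &h, NULL);
        printf("plane %d: %zux%zu\n", p, w, h);
    }
}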
hwcontext_opencl tries to support all formats which are representable as OpenCL images, so the component values depend on the format of the underlying image. What can actually be represented does depend a bit on the implementation - for example, CL_R channel order is needed for all planar YUV images, and CL_RG is needed as well for NV12 and P010 support. The data_type is always UNORM_INT8 or UNORM_INT16 (depending on depth; intermediate depths like 10-bit are treated as UNORM_INT16 and require an MSB-packed format like P010 rather than an LSB-packed format like YUV420P10), so it should always be read as a float (float2, float4) in the CL kernels.
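To make that mapping concrete, here is a simplified sketch of the effective per-plane cl_image_format (opencl_get_plane_format() is the real, general implementation):

#include <CL/cl.h>

/* Simplified sketch of the plane-format mapping described above;
 * see opencl_get_plane_format() for the actual logic. */
static cl_image_format example_plane_format(int nb_components, int bit_depth)
{
    cl_image_format fmt;
    /* Planar YUV planes carry one component each (CL_R); the
     * interleaved chroma plane of NV12/P010 carries two (CL_RG). */
    fmt.image_channel_order     = nb_components == 2 ? CL_RG : CL_R;
    /* 8-bit planes are UNORM_INT8; deeper ones (10-bit MSB-packed as
     * in P010, or full 16-bit) are UNORM_INT16. */
    fmt.image_channel_data_type = bit_depth > 8 ? CL_UNORM_INT16
                                                : CL_UNORM_INT8;
    return fmt;
}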
Given that, if your kernels don't depend on interactions between components then you don't actually need to care about the underlying format - use float4 everywhere and what's actually in xyzw doesn't matter. See the program_opencl examples <http://ffmpeg.org/ffmpeg-filters.html#program_005fopencl-1> for some cases of this; the unsharp_opencl filter is also close to this (it only cares about luma vs. chroma planes).
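For instance, a component-agnostic kernel in the program_opencl style (this follows the documented copy example; read_imagef() returns normalised floats in [0,1] whatever the underlying depth):

__kernel void copy(__write_only image2d_t destination,
                   unsigned int index,
                   __read_only  image2d_t source)
{
    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE |
                              CLK_ADDRESS_CLAMP_TO_EDGE   |
                              CLK_FILTER_NEAREST;
    int2 location = (int2)(get_global_id(0), get_global_id(1));
    /* float4 regardless of format: whatever is in xyzw passes through
     * unchanged, so this works for any supported sw_format. */
    float4 value = read_imagef(source, sampler, location);
    write_imagef(destination, location, value);
}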
If, on the other hand, you do need to know exactly where the components are, then you will need to look at the sw_format of the incoming hw_frames_ctx (it's available on the input link when the config_input function is called on a filter input pad). If you can't easily support all formats then rejecting unsupported ones there with a suitable error message is fine (there isn't currently any negotiation of that format, so it will be up to the user to get it into the right state). With only one or a small number of formats allowed, you know exactly what is in the xyzw components and can therefore use them however you like.
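As a sketch of that check (the accepted formats are just an example, and this is not code from any existing filter):

#include "libavfilter/avfilter.h"
#include "libavutil/error.h"
#include "libavutil/hwcontext.h"
#include "libavutil/pixdesc.h"

static int config_input(AVFilterLink *inlink)
{
    AVHWFramesContext *frames_ctx;

    if (!inlink->hw_frames_ctx)
        return AVERROR(EINVAL);
    frames_ctx = (AVHWFramesContext*)inlink->hw_frames_ctx->data;

    /* sw_format says what the GPU-side images actually contain. */
    switch (frames_ctx->sw_format) {
    case AV_PIX_FMT_YUV420P:
    case AV_PIX_FMT_NV12:
        return 0;
    default:
        av_log(inlink->dst, AV_LOG_ERROR, "Unsupported input format %s.\n",
               av_get_pix_fmt_name(frames_ctx->sw_format));
        return AVERROR(EINVAL);
    }
}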
Hope this helps,
- Mark