[FFmpeg-devel] [PATCH]avfilter/unsharp_opencl

Lenny Wang lenny at multicorewareinc.com
Thu Nov 7 22:20:38 CET 2013


On Thu, Nov 7, 2013 at 2:42 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Thu, Nov 07, 2013 at 01:48:32PM -0600, Lenny Wang wrote:
>> On Thu, Nov 7, 2013 at 9:14 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> > On Wed, Nov 06, 2013 at 01:01:17AM -0600, Lenny Wang wrote:
>> >> Add optimized opencl kernels with greatly improved overall performance
>> >> observed on various mainstream platforms.
>> >
>> > [...]
>> >> @@ -225,16 +297,36 @@ int ff_opencl_unsharp_init(AVFilterContext *ctx)
>> >>          av_log(ctx, AV_LOG_ERROR, "Unable to get OpenCL command queue in filter 'unsharp'\n");
>> >>          return AVERROR(EINVAL);
>> >>      }
>> >> -    unsharp->opencl_ctx.program = av_opencl_compile("unsharp", NULL);
>> >> +    sprintf(build_opts, "-D LU_RADIUS_X=%d -D LU_RADIUS_Y=%d -D CH_RADIUS_X=%d -D CH_RADIUS_Y=%d",
>> >> +            2*unsharp->luma.steps_x+1, 2*unsharp->luma.steps_y+1, 2*unsharp->chroma.steps_x+1, 2*unsharp->chroma.steps_y+1);
>> >
>> > this should use snprintf(), for safety
>> >
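For reference, a bounded version of that call could look roughly like the
sketch below. The helper name format_build_opts and its return convention
are assumptions made for illustration and are not taken from the patch;
only the format string comes from the hunk above.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the patch): writes the four kernel
 * size macros into buf, never writing more than size bytes.
 * Returns 0 on success, -1 on truncation or an encoding error. */
int format_build_opts(char *buf, size_t size,
                      int lu_x, int lu_y, int ch_x, int ch_y)
{
    int n = snprintf(buf, size,
                     "-D LU_RADIUS_X=%d -D LU_RADIUS_Y=%d "
                     "-D CH_RADIUS_X=%d -D CH_RADIUS_Y=%d",
                     lu_x, lu_y, ch_x, ch_y);

    /* C99 snprintf returns the length it wanted to write, so a value
     * >= size means the options string did not fit in buf. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Checking the return value lets the caller fail cleanly instead of passing a
truncated options string on to the OpenCL compiler.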
>> >
>> >
>> > [...]
>> >> @@ -32,7 +33,156 @@ inline unsigned char clip_uint8(int a)
>> >>          return a;
>> >>  }
>> >>
>> >> -kernel void unsharp(global  unsigned char *src,
>> >> +kernel void unsharp_luma(
>> >> +                    global unsigned char *src,
>> >> +                    global unsigned char *dst,
>> >> +                    global int *mask,
>> >> +                    int amount,
>> >> +                    int scalebits,
>> >> +                    int halfscale,
>> >> +                    int src_stride,
>> >> +                    int dst_stride,
>> >> +                    int width,
>> >> +                    int height)
>> >> +{
>> >> +    int2 threadIdx, blockIdx, globalIdx;
>> >> +    threadIdx.x = get_local_id(0);
>> >> +    threadIdx.y = get_local_id(1);
>> >> +    blockIdx.x = get_group_id(0);
>> >> +    blockIdx.y = get_group_id(1);
>> >> +    globalIdx.x = get_global_id(0);
>> >> +    globalIdx.y = get_global_id(1);
>> >> +
>> >> +    if (!amount) {
>> >> +        if (globalIdx.x < width && globalIdx.y < height)
>> >> +            dst[globalIdx.x + globalIdx.y*dst_stride] = src[globalIdx.x + globalIdx.y*src_stride];
>> >> +        return;
>> >> +    }
>> >> +
>> >> +    local uchar l[32][32];
>> >> +    local int lc[LU_RADIUS_X*LU_RADIUS_Y];
>> >> +    int indexIx, indexIy;
>> >> +
>> >> +    for(int i = 0; i <= 1; i++) {
>> >
>> > the variable should be declared outside the for( to ensure maximal
>> > compiler compatibility
>> >
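A minimal illustration of the requested style, in plain C rather than the
kernel itself. The function sum_first is hypothetical; it only shows the
index being declared at the top of the block instead of inside the for()
header, the same change that would apply to "for(int i = 0; i <= 1; i++)".

```c
/* Hypothetical example: sums the first n entries of v.
 * The loop index is declared before the loop, so the code also builds
 * with compilers that reject declarations inside the for() header. */
int sum_first(const int *v, int n)
{
    int i;
    int total = 0;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```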
>>
>> Patch modified based on Michael's comments.
>
>>  unsharp.h               |    6 +
>>  unsharp_opencl.c        |  180 ++++++++++++++++++++++++++++++++++++------------
>>  unsharp_opencl_kernel.h |  153 ++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 293 insertions(+), 46 deletions(-)
>> 3f411ca633e510075eee84fed6ef64dc5d3122e8  unsharp_ocl.patch
>>  libavfilter/unsharp.h               |   6 +-
>>  libavfilter/unsharp_opencl.c        | 180 +++++++++++++++++++++++++++---------
>>  libavfilter/unsharp_opencl_kernel.h | 153 +++++++++++++++++++++++++++++-
>>  3 files changed, 293 insertions(+), 46 deletions(-)
>
> please post a proper git patch with commit message
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Those who are too smart to engage in politics are punished by being
> governed by those who are dumber. -- Plato
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-added-optimized-opencl-kernels-for-unsharp-opencl-fi.patch
Type: application/octet-stream
Size: 21213 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20131107/e8ee1f73/attachment.obj>

