[FFmpeg-devel] [PATCH]avfilter/unsharp_opencl
Michael Niedermayer
michaelni at gmx.at
Thu Nov 7 22:31:48 CET 2013
On Thu, Nov 07, 2013 at 03:20:38PM -0600, Lenny Wang wrote:
> On Thu, Nov 7, 2013 at 2:42 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Thu, Nov 07, 2013 at 01:48:32PM -0600, Lenny Wang wrote:
> >> On Thu, Nov 7, 2013 at 9:14 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> > On Wed, Nov 06, 2013 at 01:01:17AM -0600, Lenny Wang wrote:
> >> >> Add optimized opencl kernels with greatly improved overall performance
> >> >> observed on various mainstream platforms.
> >> >
> >> > [...]
> >> >> @@ -225,16 +297,36 @@ int ff_opencl_unsharp_init(AVFilterContext *ctx)
> >> >> av_log(ctx, AV_LOG_ERROR, "Unable to get OpenCL command queue in filter 'unsharp'\n");
> >> >> return AVERROR(EINVAL);
> >> >> }
> >> >> - unsharp->opencl_ctx.program = av_opencl_compile("unsharp", NULL);
> >> >> + sprintf(build_opts, "-D LU_RADIUS_X=%d -D LU_RADIUS_Y=%d -D CH_RADIUS_X=%d -D CH_RADIUS_Y=%d",
> >> >> + 2*unsharp->luma.steps_x+1, 2*unsharp->luma.steps_y+1, 2*unsharp->chroma.steps_x+1, 2*unsharp->chroma.steps_y+1);
> >> >
> >> > this should use snprintf(), for safety
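
A minimal sketch of the bounded formatting asked for here. The helper name format_build_opts, its return convention and the buffer sizes are assumptions for illustration and are not taken from the patch; only the option string itself comes from the quoted hunk. snprintf() never writes more than the given size and returns the length it would have needed, so truncation can be detected and reported instead of overrunning build_opts:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): formats the OpenCL build
 * options into a caller-supplied buffer.
 * Returns 0 on success, -1 if formatting failed or the output was truncated. */
static int format_build_opts(char *buf, size_t size,
                             int lu_x, int lu_y, int ch_x, int ch_y)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size,
                 "-D LU_RADIUS_X=%d -D LU_RADIUS_Y=%d "
                 "-D CH_RADIUS_X=%d -D CH_RADIUS_Y=%d",
                 lu_x, lu_y, ch_x, ch_y);
    /* snprintf returns the number of characters it wanted to write,
     * excluding the terminating NUL; n >= size means truncation. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

In the filter init path the failure case would presumably map to an AVERROR code; that mapping is left out here because it depends on the surrounding code.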
> >> >
> >> >
> >> >
> >> > [...]
> >> >> @@ -32,7 +33,156 @@ inline unsigned char clip_uint8(int a)
> >> >> return a;
> >> >> }
> >> >>
> >> >> -kernel void unsharp(global unsigned char *src,
> >> >> +kernel void unsharp_luma(
> >> >> + global unsigned char *src,
> >> >> + global unsigned char *dst,
> >> >> + global int *mask,
> >> >> + int amount,
> >> >> + int scalebits,
> >> >> + int halfscale,
> >> >> + int src_stride,
> >> >> + int dst_stride,
> >> >> + int width,
> >> >> + int height)
> >> >> +{
> >> >> + int2 threadIdx, blockIdx, globalIdx;
> >> >> + threadIdx.x = get_local_id(0);
> >> >> + threadIdx.y = get_local_id(1);
> >> >> + blockIdx.x = get_group_id(0);
> >> >> + blockIdx.y = get_group_id(1);
> >> >> + globalIdx.x = get_global_id(0);
> >> >> + globalIdx.y = get_global_id(1);
> >> >> +
> >> >> + if (!amount) {
> >> >> + if (globalIdx.x < width && globalIdx.y < height)
> >> >> + dst[globalIdx.x + globalIdx.y*dst_stride] = src[globalIdx.x + globalIdx.y*src_stride];
> >> >> + return;
> >> >> + }
> >> >> +
> >> >> + local uchar l[32][32];
> >> >> + local int lc[LU_RADIUS_X*LU_RADIUS_Y];
> >> >> + int indexIx, indexIy;
> >> >> +
> >> >> + for(int i = 0; i <= 1; i++) {
> >> >
> >> > the variable should be declared outside the for( to ensure maximal
> >> > compiler compatibility
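
A small sketch of the loop style requested here, in plain C rather than OpenCL C. The function sum_ints is a hypothetical example and does not appear in the patch. The counter is declared at the top of the block, so the code also compiles with compilers that reject declarations inside for(:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example (not from the patch): sums n ints. */
static int sum_ints(const int *values, size_t n)
{
    size_t i;       /* declared at block scope, not inside for( */
    int total = 0;

    assert(values != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

Applied to the quoted kernel, the same change would mean declaring i next to indexIx and indexIy and writing the loop header as for (i = 0; i <= 1; i++).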
> >> >
> >>
> >> Patch modified based on Michael's comments.
> >
> >> unsharp.h | 6 +
> >> unsharp_opencl.c | 180 ++++++++++++++++++++++++++++++++++++------------
> >> unsharp_opencl_kernel.h | 153 ++++++++++++++++++++++++++++++++++++++++
> >> 3 files changed, 293 insertions(+), 46 deletions(-)
> >> 3f411ca633e510075eee84fed6ef64dc5d3122e8 unsharp_ocl.patch
> >> libavfilter/unsharp.h | 6 +-
> >> libavfilter/unsharp_opencl.c | 180 +++++++++++++++++++++++++++---------
> >> libavfilter/unsharp_opencl_kernel.h | 153 +++++++++++++++++++++++++++++-
> >> 3 files changed, 293 insertions(+), 46 deletions(-)
> >
> > please post a proper git patch with commit message
> >
> > [...]
> >
> > --
> > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > Those who are too smart to engage in politics are punished by being
> > governed by those who are dumber. -- Plato
> >
> unsharp.h | 6 +
> unsharp_opencl.c | 180 ++++++++++++++++++++++++++++++++++++------------
> unsharp_opencl_kernel.h | 153 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 293 insertions(+), 46 deletions(-)
> ad66fa550e989a26dba592addce23c94387902ba 0001-added-optimized-opencl-kernels-for-unsharp-opencl-fi.patch
> From 86468270d3bd841597d050838dcb5968c2f4cd0d Mon Sep 17 00:00:00 2001
> From: Lenny Wang <lwanghpc at gmail.com>
> Date: Thu, 7 Nov 2013 15:15:49 -0600
> Subject: [PATCH] added optimized opencl kernels for unsharp-opencl filter
patch applied
thanks
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Avoid a single point of failure, be that a person or equipment.