[FFmpeg-devel] [PATCH] Port mp=eq/eq2 to FFmpeg

Thu Jan 22 20:27:38 CET 2015

On Thu, 22 Jan 2015 21:45:04 +0530
arwa arif <arwaarif1994 at gmail.com> wrote:

> On Thu, Jan 22, 2015 at 2:59 PM, wm4 <nfxjfg at googlemail.com> wrote:
> 
> > On Thu, 22 Jan 2015 01:38:11 +0530
> > arwa arif <arwaarif1994 at gmail.com> wrote:
> >
> > > From 703cc1887903c2868537e19e99b76927bec07884 Mon Sep 17 00:00:00 2001
> > > From: Arwa Arif <arwaarif1994 at gmail.com>
> > > Date: Mon, 19 Jan 2015 03:56:48 +0530
> > > Subject: [PATCH] Port mp=eq/eq2 to FFmpeg
> > >
> > > Code adapted from James Darnley's previous commits
> >
> > > [...]
> >
> > > +#if HAVE_MMX && HAVE_6REGS
> > > +static void process_MMX(EQParameters *param, uint8_t *dst, int
> > dst_stride,
> > > +                        uint8_t *src, int src_stride, int w, int h)
> > > +{
> > > +        int i;
> > > +        int pel;
> > > +        int dstep = dst_stride - w;
> > > +        int sstep = src_stride - w;
> > > +        short brvec[4];
> > > +        short contvec[4];
> > > +
> > > +        brvec[0] = brvec[1] = brvec[2] = brvec[3] = param->b;
> > > +        contvec[0] = contvec[1] = contvec[2] = contvec[3] = param->c;
> > > +
> > > +        while (h--) {
> > > +                __asm__ volatile (
> > > +                        "movq (%5), %%mm3      \n\t"
> > > +                        "movq (%6), %%mm4      \n\t"
> > > +                        "pxor %%mm0, %%mm0     \n\t"
> > > +                        "movl %4, %%eax        \n\t"
> > > +                        //ASMALIGN(4)
> > > +                        "1:                    \n\t"
> > > +                        "movq (%0), %%mm1      \n\t"
> > > +                        "movq (%0), %%mm2      \n\t"
> > > +                        "punpcklbw %%mm0, %%mm1\n\t"
> > > +                        "punpckhbw %%mm0, %%mm2\n\t"
> > > +                        "psllw $4, %%mm1       \n\t"
> > > +                        "psllw $4, %%mm2       \n\t"
> > > +                        "pmulhw %%mm4, %%mm1   \n\t"
> > > +                        "pmulhw %%mm4, %%mm2   \n\t"
> > > +                        "paddw %%mm3, %%mm1    \n\t"
> > > +                        "paddw %%mm3, %%mm2    \n\t"
> > > +                        "packuswb %%mm2, %%mm1 \n\t"
> > > +                        "add $8, %0            \n\t"
> > > +                        "movq %%mm1, (%1)      \n\t"
> > > +                        "add $8, %1            \n\t"
> > > +                        "decl %%eax            \n\t"
> > > +                        "jnz 1b                \n\t"
> > > +                        : "=r" (src), "=r" (dst)
> > > +                        : "0" (src), "1" (dst), "r" (w>>3), "r"
> > (brvec), "r" (contvec)
> > > +                        : "%eax"
> > > +                );
> > > +
> > > +                for (i = w&7; i; i--) {
> > > +                        pel = ((*src++ * param->c) >> 12) + param->b;
> > > +                        if (pel & 768)
> > > +                            pel = (-pel) >> 31;
> > > +                        *dst++ = pel;
> > > +                }
> > > +
> > > +                src += sstep;
> > > +                dst += dstep;
> > > +        }
> > > +        __asm__ volatile ( "emms \n\t" ::: "memory" );
> > > +}
> > > +#endif
> >
> > IMO the asm shouldn't be added at all. It's not important enough for a
> > ported video filter, but has high maintenance overhead - there's a
> > reason the FFmpeg loathes inline asm, right?
> >
> > How much does the inline asm help with speed? When I tested it on a
> > reasonably modern CPU a while ago, there was barely any advantage.
> >
> >
> I checked the runtime of the codes with and without asm, it turns out that
> there is not much difference. The difference is coming out to be in
> milliseconds.
> 
> 26.014s with asm and
> 26.129 without asm.
> 
>  So, should I remove the asm part?

Then I'd definitely vote for remove.

The asm probably mattered on ancient CPUs and ancient compilers, but
there's no reason to keep it anymore.

> 
> > > +av_cold void ff_eq_init_x86(EQContext *eq)
> > > +{
> > > +#if HAVE_MMX_INLINE
> > > +    int cpu_flags = av_get_cpu_flags();
> > > +
> > > +    if (cpu_flags & AV_CPU_FLAG_MMX) {
> > > +        eq->process = process_MMX;
> > > +    }
> > > +#endif
> > > +}
> >