[FFmpeg-devel] Fix bug for POWER8 LE for the test '/libavutil/float_altivec-test'

Thu Sep 11 19:59:08 CEST 2014

On Thu, Sep 11, 2014 at 7:18 AM, Grace Ryan <rongyan236 at gmail.com> wrote:
>   Hi,
>
> This patch fixes a bug in the calculation in
> 'ff_vector_fmul_add_altivec' that shows up when running
> '/libavutil/float_altivec-test' on POWER8 little endian.
>
> The FATE results can be found at http://fate.ffmpeg.org/ by searching for
> "ibmcrl"; they are also attached here to facilitate the review.
>
> The number of passing test cases increased from 1633/2166 to 1648/2166.
> The patch file is also attached.

> -    vector float d, s0, s1, s2, t0, t1, edges;
> -    vector unsigned char align = vec_lvsr(0,dst),
> -                         mask = vec_lvsl(0, dst);
> +    vector float d, ss0, ss1, ss2, t0, t1, edges;

I am not a big fan of renaming variables without any benefit, especially when...

>
>     for (i = 0; i < len - 3; i += 4) {
>         t0 = vec_ld(0, dst + i);
>         t1 = vec_ld(15, dst + i);
> -        s0 = vec_ld(0, src0 + i);
> -        s1 = vec_ld(0, src1 + i);
> -        s2 = vec_ld(0, src2 + i);
> -        edges = vec_perm(t1, t0, mask);
> -        d = vec_madd(s0, s1, s2);
> -        t1 = vec_perm(d, edges, align);
> -        t0 = vec_perm(edges, d, align);
> +        ss0 = vec_ld(0, src0 + i);
> +        ss1 = vec_ld(0, src1 + i);
> +        ss2 = vec_ld(0, src2 + i);
> +        edges = vec_perm(t1, t0, vcprm(0, 1, 2, 3));
> +        d = vec_madd(ss0, ss1, ss2);

> +        t1 = vec_perm(d, edges, vcprm(s0,s1,s2,s3));
> +        t0 = vec_perm(edges, d, vcprm(s0,s1,s2,s3));

...you are using the old variable names, which I am pretty sure won't compile.
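
For checking a respin, here is a minimal scalar sketch of what I understand
this loop is meant to compute, assuming the usual vector_fmul_add semantics
(dst[i] = src0[i] * src1[i] + src2[i]). The name fmul_add_ref is hypothetical
and not part of the patch or of FFmpeg; it only gives a host-independent
result to compare the AltiVec output against.

```c
#include <stddef.h>

/* Hypothetical reference helper (not from the patch): scalar form of the
 * multiply-add that the AltiVec loop performs four floats at a time.
 * Assumes dst, src0, src1 and src2 each hold at least len floats. */
static void fmul_add_ref(float *dst, const float *src0, const float *src1,
                         const float *src2, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i] + src2[i];
}
```

Running both versions on the same inputs and comparing element by element
should make an endianness-related permute mistake show up as swapped or
shifted values rather than as small rounding differences.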

Timothy