[FFmpeg-devel] Fix bug for POWER8 LE for the test '/libavutil/float_altivec-test'
Timothy Gu
timothygu99 at gmail.com
Thu Sep 11 19:59:08 CEST 2014
On Thu, Sep 11, 2014 at 7:18 AM, Grace Ryan <rongyan236 at gmail.com> wrote:
> Hi,
>
> This patch fixes a bug in the calculation in
> 'ff_vector_fmul_add_altivec' when running '/libavutil/float_altivec-test' on
> POWER8 little endian.
>
> The FATE test results can be found at http://fate.ffmpeg.org/ by searching
> for "ibmcrl"; they are also attached here to facilitate the review:
>
> The number of passing test cases increased from 1633/2166 to 1648/2166.
> The patch file is also attached.
> - vector float d, s0, s1, s2, t0, t1, edges;
> - vector unsigned char align = vec_lvsr(0,dst),
> - mask = vec_lvsl(0, dst);
> + vector float d, ss0, ss1, ss2, t0, t1, edges;
I am not a big fan of renaming variables without benefits, especially when...
>
> for (i = 0; i < len - 3; i += 4) {
> t0 = vec_ld(0, dst + i);
> t1 = vec_ld(15, dst + i);
> - s0 = vec_ld(0, src0 + i);
> - s1 = vec_ld(0, src1 + i);
> - s2 = vec_ld(0, src2 + i);
> - edges = vec_perm(t1, t0, mask);
> - d = vec_madd(s0, s1, s2);
> - t1 = vec_perm(d, edges, align);
> - t0 = vec_perm(edges, d, align);
> + ss0 = vec_ld(0, src0 + i);
> + ss1 = vec_ld(0, src1 + i);
> + ss2 = vec_ld(0, src2 + i);
> + edges = vec_perm(t1, t0, vcprm(0, 1, 2, 3));
> + d = vec_madd(ss0, ss1, ss2);
> + t1 = vec_perm(d, edges, vcprm(s0,s1,s2,s3));
> + t0 = vec_perm(edges, d, vcprm(s0,s1,s2,s3));
...you are using the old variable names, which I am pretty sure won't compile.
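
For reference when checking either version of the loop, here is a minimal
scalar sketch of what it appears to compute, assuming the routine is meant to
produce dst[i] = src0[i] * src1[i] + src2[i], as the vec_madd(s0, s1, s2) call
in the quoted code suggests. The name vector_fmul_add_ref is hypothetical and
is not taken from the patch or from FFmpeg.

```c
#include <stddef.h>

/* Hypothetical scalar reference (illustrative name, not from the patch).
 * Assumption: the AltiVec loop is an elementwise multiply-add, i.e.
 * dst[i] = src0[i] * src1[i] + src2[i] for every i in [0, len). */
void vector_fmul_add_ref(float *dst, const float *src0,
                         const float *src1, const float *src2,
                         size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i] + src2[i];
}
```

Comparing the output of the vector code against a scalar loop like this on
both big- and little-endian builds is one way to confirm that a permute change
only affects element ordering and not the arithmetic.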
Timothy