[FFmpeg-devel] [PATCH] Fix mm_flags, mm_support for ARM
Michael Niedermayer
michaelni
Sat Jun 28 01:52:00 CEST 2008
On Fri, Jun 27, 2008 at 08:49:08PM +0100, M?ns Rullg?rd wrote:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>
> > On Friday 27 June 2008, matthieu castet wrote:
> >> Benoit Fouet wrote:
> >> > Michael Niedermayer wrote:
> >> >> On Thu, Jun 26, 2008 at 10:45:20PM +0200, matthieu castet wrote:
> >> >
> >> > Matthieu, could you please resend the complete patch ?
> >>
> >> I attach them as 2 separate patch.
> >> fix_dct-test-arm.diff : allow to build dct-test on arm
> >> arm-icdt-test.diff : add some arm idct (there are other version that
> >> could be added, but I have no hardware to test).
> >
> >> +#ifdef ARCH_ARMV4L
> >> + ?//{"IDCT_ARM", ? ? ? ?1, j_rev_dct_ARM, ? ? ?idct, MMX_PERM},
> >> /* should > be MMX_PERM perm, but produce strange result */
> >
> > Yes, 'j_rev_dct_ARM' is broken, it produces quite noticeable
> > artefacts on decoding video. I think I already reported this issue
> > here long ago. But as it is armv4 only, I was not interested in
> > investigating this problem further...
>
> It is indeed in pretty poor shape. It's pretty fast though, if that's
> any excuse.
>
> For completeness, here's the output of dct-test -i including all the
> ARM IDCT variants on a 500MHz Cortex-A8. It is interesting to note
> that the ones targeting older ARM cores are significantly slower than
> plain C on the Cortex.
dct-test is not a good benchmark, for serious benchmarking real data
should be used. Many idcts contain checks for zero elements ...
You can also see this difference between dct-test -i 0 and -i 1
[...]
> -7 15 -3 8 16 19 5 19
> -23 1 -15 10 26 4 -12 -7
> 2 -22 -5 -5 -4 10 4 12
> 4 -4 -15 -5 -2 -5 8 -6
> 10 17 -7 11 7 11 -1 -2
> -17 1 -25 -6 1 -2 19 12
> -3 7 3 21 3 -8 12 -24
> 21 -4 13 1 -5 10 8 -5
> IDCT SIMPLE-C: err_inf=1 err2=0.00667969 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-C: 452.7 kdct/s
>
> -6 15 -2 9 16 20 5 21
> -23 1 -14 10 26 4 -11 -7
> 2 -21 -5 -5 -4 10 4 12
> 5 -4 -14 -2 -2 -4 9 -5
> 10 17 -7 12 7 11 -1 -2
> -17 2 -23 -6 2 -2 19 12
> -2 8 5 22 3 -8 12 -24
> 23 -2 13 1 -5 11 8 -3
> IDCT SIMPLE-ARM: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARM: 369.6 kdct/s
[...]
> -6 15 -2 9 16 20 5 21
> -23 1 -14 10 26 4 -11 -7
> 2 -21 -5 -5 -4 10 4 12
> 5 -4 -14 -2 -2 -4 9 -5
> 10 17 -7 12 7 11 -1 -2
> -17 2 -23 -6 2 -2 19 12
> -2 8 5 22 3 -8 12 -24
> 23 -2 13 1 -5 11 8 -3
> IDCT SIMPLE-ARMV6: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-ARMV6: 512.4 kdct/s
>
> -6 15 -2 9 16 20 5 21
> -23 1 -14 10 26 4 -11 -7
> 2 -21 -5 -5 -4 10 4 12
> 5 -4 -14 -2 -2 -4 9 -5
> 10 17 -7 12 7 11 -1 -2
> -17 2 -23 -6 2 -2 19 12
> -2 8 5 22 3 -8 12 -24
> 23 -2 13 1 -5 11 8 -3
> IDCT SIMPLE-NEON: err_inf=1 err2=0.00667500 syserr=0.00130000 maxout=266 blockSumErr=64
> IDCT SIMPLE-NEON: 783.9 kdct/s
These IDCTs are all buggy, they differ from the C simple idct.
As has been discussed on the ML already i belive it was
(skal, me and alexander).
We should try hard to avoid introducing a new IDCT with different "error
landscape".
Do we have someone who has a arm cpu and can look into the above issue?
Ideally would be the authors who claimed the code to be identical to the
C code ...
If we have noone then we will likely have to disable these IDCTs. I do not
want to create files that turn green and pink unless they are played on
an ARM cpu ...
Your patch is ok of course ...
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080628/121d8bd9/attachment.pgp>
More information about the ffmpeg-devel
mailing list