[Ffmpeg-devel] Re: Using Intel's fDCT

Sun Nov 20 22:06:32 CET 2005

On Sun, 20 Nov 2005 18:21:11 +0000 (UTC)
g. <the_ether at lycos.co.uk> wrote:

>  <g> writes:
> > 
> > Perhaps I have to add a new permutation step to the fDCT function before 
> > quantisation when using Intel's fDCT?
> > 
> > Can anyone explain what is going on?
> 
> The Intel fDCT is noticeably faster than ff_fdct_sse2, so there is evidently 
> room for improvement in ff_fdct_sse2. However, ff_fdct_sse2 
> doesn't appear to do a straightforward transform and I couldn't find any 
> documentation or comments to explain what is going on.
> 
> To compare it with Intel's fDCT I fed in the following data:
> 
> DCTELEM input[64] = { 
> 0,0,0,0,0,0,0,0,
> 0,1,1,1,1,1,1,1,
> 0,1,2,2,2,2,2,2,
> 0,1,2,3,3,3,3,3,
> 0,1,2,3,4,4,4,4,
> 0,1,2,3,4,5,5,5,
> 0,1,2,3,4,5,6,6,
> 0,1,2,3,4,5,6,7 };
> 
> The results of fDCT by Intel's routine were:
> 
> Intel
> 18 -9 -2 -1 0 0 0 0 
> -9 7 0 0 0 0 0 0 
> -2 0 2 0 0 0 0 0 
> -1 0 0 1 0 0 0 0 
> 0 0 0 0 1 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 
> And the results using ffmpeg's ff_fdct_sse2 were:
> 
> ffmpeg
> 140 -73 -18 -8 -4 -2 -1 -1 
> -72 53 0 1 0 0 0 0 
> -17 0 14 0 0 0 0 0 
> -8 0 0 7 0 0 0 0 
> -4 0 0 0 4 0 0 0 
> -3 0 0 0 0 3 0 0 
> -1 0 0 0 1 0 2 0 
> -1 0 0 0 0 0 0 2

Here, the Intel result matches the ffmpeg one divided by 8 and rounded
to the nearest integer, i.e. quantized by a factor of 8
(140/8 = 17.5 -> 18, -73/8 = -9.125 -> -9, 53/8 = 6.625 -> 7).
So presumably Intel's fDCT includes the quantization step.
(Note that this is just a rough guess.)
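
The divide-by-8 relation can be checked mechanically against the two
tables quoted above. Below is a minimal sketch in C. It assumes
round-to-nearest with ties going upward, which is an assumption that
happens to fit these numbers and is not taken from Intel's documentation.
The names div8_round and count_mismatches are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: x/8 rounded to nearest, ties toward +infinity.
   Uses explicit floor division, since >> on a negative int is
   implementation-defined in C. */
static int div8_round(int x)
{
    int n = x + 4;
    int q = n / 8;          /* C division truncates toward zero */
    if (n % 8 < 0)
        q--;                /* step down to the floor for negative n */
    return q;
}

/* Outputs copied from the two tables quoted above. */
static const int intel_out[64] = {
    18, -9, -2, -1,  0,  0,  0,  0,
    -9,  7,  0,  0,  0,  0,  0,  0,
    -2,  0,  2,  0,  0,  0,  0,  0,
    -1,  0,  0,  1,  0,  0,  0,  0,
     0,  0,  0,  0,  1,  0,  0,  0,
     0,  0,  0,  0,  0,  0,  0,  0,
     0,  0,  0,  0,  0,  0,  0,  0,
     0,  0,  0,  0,  0,  0,  0,  0
};

static const int ffmpeg_out[64] = {
    140, -73, -18,  -8,  -4,  -2,  -1,  -1,
    -72,  53,   0,   1,   0,   0,   0,   0,
    -17,   0,  14,   0,   0,   0,   0,   0,
     -8,   0,   0,   7,   0,   0,   0,   0,
     -4,   0,   0,   0,   4,   0,   0,   0,
     -3,   0,   0,   0,   0,   3,   0,   0,
     -1,   0,   0,   0,   1,   0,   2,   0,
     -1,   0,   0,   0,   0,   0,   0,   2
};

/* Count coefficients where the scaled ffmpeg value differs from Intel's. */
static int count_mismatches(void)
{
    int i, bad = 0;

    for (i = 0; i < 64; i++) {
        if (div8_round(ffmpeg_out[i]) != intel_out[i]) {
            printf("mismatch at %d: ffmpeg %d, intel %d\n",
                   i, ffmpeg_out[i], intel_out[i]);
            bad++;
        }
    }
    return bad;
}
```

Calling count_mismatches() from a main() returns 0 for these 64 values,
so the scale-by-8 reading holds for this one block. It says nothing
about where in either routine the scaling happens.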

Aurel