[Ffmpeg-devel] BlackFin lowlevel pixel operations PATCH
Diego Biurrun
diego
Sun Apr 1 14:16:36 CEST 2007
On Sun, Apr 01, 2007 at 07:34:04AM -0400, Marc Hoffman wrote:
> Marc Hoffman writes:
> >
> > Diego Biurrun writes:
> > > On Fri, Mar 30, 2007 at 07:32:14AM -0400, Marc Hoffman wrote:
> > > >
> > > > --- bfin/dsputil_bfin.c (revision 8517)
> > > > +++ bfin/dsputil_bfin.c (working copy)
> > > > @@ -18,38 +21,296 @@
> > > >
> > > > +static void bfin_idct_add (uint8_t *dest, int line_size, DCTELEM *block)
> > > > +{
> > > > + ff_bfin_idct (block);
> > > > + ff_bfin_add_pixels_clamped (block, dest, line_size);
> > > > +}
> > >
> > > Here and everywhere else: FFmpeg coding style mandates 4 space
> > > indentation.
> > >
> >
> > No problem, is that for every indentation level or just the first level?
> >
> dsputils indented.
Please use four spaces everywhere.
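For illustration, here is the quoted bfin_idct_add with four-space indentation at every level. Only the indentation differs from the patch. The DCTELEM typedef and the two stand-in function bodies are assumptions added so the fragment compiles off-target; they are not the actual Blackfin routines.

```c
#include <stdint.h>

typedef int16_t DCTELEM; /* assumption: 16-bit coefficients */

/* Hypothetical stand-in; the real routine is in bfin/idct_bfin.S. */
static void ff_bfin_idct (DCTELEM *block)
{
    (void) block;
}

/* Hypothetical stand-in: add an 8x8 block to dest, clamping to 0..255. */
static void ff_bfin_add_pixels_clamped (DCTELEM *block, uint8_t *dest,
                                        int line_size)
{
    int x, y;

    for (y = 0; y < 8; y++) {
        for (x = 0; x < 8; x++) {
            int v = dest[x] + block[y * 8 + x];
            dest[x] = v < 0 ? 0 : v > 255 ? 255 : v;
        }
        dest += line_size;
    }
}

/* The function from the patch, indented with four spaces. */
static void bfin_idct_add (uint8_t *dest, int line_size, DCTELEM *block)
{
    ff_bfin_idct (block);
    ff_bfin_add_pixels_clamped (block, dest, line_size);
}
```

Nested levels such as the two loops above get four more spaces each, which is what "everywhere" means here.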
Some spelling/grammar nitpicks below.
> --- bfin/fdct_bfin.S (revision 0)
> +++ bfin/fdct_bfin.S (revision 0)
> @@ -0,0 +1,383 @@
> +
> +This is the implementation of Chen's algorithm of DCT. It is based on
> +the separable nature of DCT for multi- dimension. The input matrix is
> +8x8 real data. First, one dime- sional 8-point DCT is calculated for
stray -
> +transpose. Then again 8-point DCT is calculated on each row of
of the
> +matrix. The output is again stored in a transpose matrix. This is
> +final output.
the final output
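As a reading aid for the row/transpose description quoted above, a minimal sketch of the separable scheme in plain C. It assumes an orthonormal DCT-II computed directly in floating point, not Chen's fast factorization and not the fixed-point Blackfin code; all names (cos_tab, cos_pi16, dct_rows_transposed, dct2d) are made up for this example.

```c
/* cos(m*pi/16) for m = 0..8 */
static const double cos_tab[9] = {
    1.0,          0.9807852804, 0.9238795325,
    0.8314696123, 0.7071067812, 0.5555702330,
    0.3826834324, 0.1950903220, 0.0
};

/* cos(m*pi/16) for any m >= 0, using the symmetries of the cosine. */
static double cos_pi16(int m)
{
    m %= 32;                 /* period 2*pi */
    if (m > 16)
        m = 32 - m;          /* cos(2*pi - x) = cos(x) */
    return m > 8 ? -cos_tab[16 - m] : cos_tab[m];
}

/* 8-point DCT of every row of in; results are stored transposed in out. */
static void dct_rows_transposed(const double *in, double *out)
{
    int r, k, n;

    for (r = 0; r < 8; r++) {
        for (k = 0; k < 8; k++) {
            double sum = 0.0;
            for (n = 0; n < 8; n++)
                sum += in[r * 8 + n] * cos_pi16((2 * n + 1) * k);
            /* orthonormal scaling: sqrt(1/8) for k = 0, sqrt(2/8) otherwise */
            out[k * 8 + r] = sum * (k == 0 ? 0.3535533906 : 0.5);
        }
    }
}

/* 2-D DCT: two row passes, each storing its result transposed,
 * so the final output is back in normal order. */
static void dct2d(const double *in, double *out)
{
    double tmp[64];

    dct_rows_transposed(in, tmp);
    dct_rows_transposed(tmp, out);
}
```

Because each pass stores its result transposed, the second pass works on what were originally columns, and the two transposes cancel.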
> +addition and subtraction is done with one multiplication. In the
> +third and last (fourth) stages more MAC operations involved.
get/are involved
> +The algorithm operates in-placed.
in-place
> +transposing the matrix and calculation of bit reversed are carried out
reversal
> +Output of function is provided "in" buffer in normal order.
in "in"
> +This function takes a two argument the first of which is the base
takes two arguments, the first
> +\item[ptr\_dct\_input] \par input signal, 16bit input samples.
> +\item[ptr\_dct\_input] \par output signal, 16bit output samples.
> +\item[ptr\_dct_\coefs] \par dct coefficient matrix.
> +\item[ptr\_dct\_temp] \par temporary output buffer for middle stage.
No need to add periods here.
> + FP - Should point to the begining of the jpeg encode structure
begiNNing, JPEG
> + M0 = 12 (X); // All these initialization are used in the
initializationS
> + // prescale the input to get the precision correct
correct precision
> +* a loop of 2 iteration (DCT_strt, DCT_end) is set.
iterationS
> +* teration. The input is read from "in" buffer and output is written to
iteration
> + P1 = B2; // P1 pointts to temporary array
points
> +* Where notation (x, y) means the element from column x is in upper half of
> +* register and element from column y is in lower half of the register.
the upper/lower half
> +* The following two instruction does the job of stage 3 -
instructionS
> +* A1 = Element 0 * cos(pi/4)
> +* A0 = Element 0 * cos(pi/4)
Align this. Also, is there a typo? Are A0 and A1 really the same?
> +* calculation is done outside the loop, after wards it is done here. It
afterwards
> +* serves two purpose.
purposeS
> +* Firts it computes part 1 for the next data, and it writes the data 5, and
First
> --- bfin/idct_bfin.S (revision 0)
> +++ bfin/idct_bfin.S (revision 0)
> @@ -0,0 +1,422 @@
> +Description : This is the implementation of Chen's algorithm of IDCT.
Umm, I didn't look through this in detail, but this description appears
to be a (near) duplicate of the one in bfin/fdct_bfin.S. That's not
good.
Diego