[FFmpeg-devel] pre discussion around Blackfin dct_quantize_bfin routine
Marc Hoffman
mmhoffm
Tue Jun 12 17:47:33 CEST 2007
On 6/12/07, Reimar Doeffinger <Reimar.Doeffinger at stud.uni-karlsruhe.de> wrote:
> Hello,
> On Tue, Jun 12, 2007 at 09:27:22AM -0400, Marc Hoffman wrote:
> [...]
> > Oh really? I have never seen such a problem; interesting. Anyway, I'm
> > sure it exists; however, this is for a specific machine which I know
> > this works for. Based on that fact, what are your thoughts, or do you
> > have a suggestion other than the wasteful use of electrons: hi<<32ll
> > | lo?
>
> Have you tested the code this generates? Because at least for the Atmel
> 8 bit stuff gcc optimizes this perfectly, so no need to avoid it.
I guess the compiler needs some support around DImode and the asm stuff.
unsigned long long read_time (void)
{
    unsigned long long t0;
    unsigned lo, hi;
    asm volatile ("%0=cycles; %1=cycles2;" : "=d" (lo), "=d" (hi));
    t0 = lo;
    t0 |= (unsigned long long)hi << 32;
    return t0;
}
This generates the following code. I would have thought that the
compiler would do something a little more efficient than this.
.text;
.align 4
.global _read_time;
.type _read_time, STT_FUNC;
_read_time:
LINK 0;
#APP
R0=cycles; R2=cycles2;
#NO_APP
R1 = 0 (X);
R3 = R2;
R2 = 0 (X);
R0 = R0 | R2;
R1 = R1 | R3;
UNLINK;
rts;
.size _read_time, .-_read_time
.ident "GCC: (GNU) 4.1.1 (ADI 07R1)"
I'm not sure why the compiler feels compelled to use a logical OR here.
The struct/union-based method produces better code, but it is still not optimal.
unsigned long long read_time1 (void)
{
    union {
        struct {
            unsigned lo;
            unsigned hi;
        } p;
        unsigned long long c;
    } t;
    asm ("%0=cycles; %1=cycles2;" : "=d" (t.p.lo), "=d" (t.p.hi));
    return t.c;
}
.type _read_time1, STT_FUNC;
_read_time1:
LINK 0;
#APP
R2=cycles; R3=cycles2;
#NO_APP
UNLINK;
R0 = R2;
R1 = R3;
rts;
.size _read_time1, .-_read_time1
.ident "GCC: (GNU) 4.1.1 (ADI 07R1)"
And at the end of the day all we really want is this:
_read_timex:
LINK 0;
#APP
R0=cycles; R1=cycles2;
#NO_APP
UNLINK;
rts;
With all that said, can I use the struct/union? On Blackfin it is
supported correctly and produces the correct results. We are arguing
about something that is not relevant here, because the union is being
used for a hardware-specific mapping, which I believe is acceptable.
Once the compiler is fixed I will change it. The code that should
really work is something like this:
static inline uint64_t read_time (void)
{
    unsigned long long t0;
    asm volatile ("%0=cycles; %H0=cycles2;" : "=D" (t0));
    return t0;
}
Unfortunately, that doesn't work at the moment.
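For anyone who wants to compare the two approaches without Blackfin hardware, here is a minimal host-side sketch. It is an illustration only: the function names (combine_shift, combine_union, host_is_little_endian) are hypothetical, and plain uint32_t arguments stand in for the values that the asm statement would read from cycles and cycles2. The shift-and-OR form is portable; the union form is only equivalent when the low word sits at the lower address, which I am assuming holds for the target and which the sketch checks with an assert.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Portable form: build the 64-bit value with a shift and an OR. */
uint64_t combine_shift(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Layout-dependent form: relies on 'lo' occupying the low-addressed word,
 * so it only matches combine_shift() on a little-endian machine. */
uint64_t combine_union(uint32_t lo, uint32_t hi)
{
    union {
        struct {
            uint32_t lo;
            uint32_t hi;
        } p;
        uint64_t c;
    } t;

    assert(host_is_little_endian()); /* assumption made explicit */
    t.p.lo = lo;
    t.p.hi = hi;
    return t.c;
}
```

On a little-endian host both functions return the same value for the same inputs, which is why the union version can be treated as a machine-specific mapping rather than a general idiom.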
Marc