[FFmpeg-devel] [PATCH] M68K: Optimized MUL64/MULH/MULL functions for 68060
Måns Rullgård
mans
Sun Aug 2 00:43:22 CEST 2009
ami_stuff <ami_stuff at o2.pl> writes:
>> > :"=d"(lo), "=d"(hi)
>>
>> Those should be marked early-clobber (&).
>
> Ok.
>
>> > :"0"(a), "1"(b)
>>
>> Do these have to be the same regs? Allowing different registers
>> theoretically gives the compiler better room for optimal register
>> allocation. On the other hand, it gives the compiler more room to
>> mess up.
>
> It looks like GCC 4.4.1 generates better code with defined registers
> (two fewer move.l instructions):
See below.
>> > :"d2", "d3", "d4", "d5");
>>
>> Avoid using hardcoded registers, and prefer explicitly declared temp
>> variables.
>
> Hmm, I don't know how to do it
int t1, t2, t3, t4;
asm("..." : "=&d"(t1), "=&d"(t2), "=&d"(t3), "=&d"(t4));
> and what code GCC will generate after this change.
Try and see.
> Now the output asm code looks perfect without any unneeded
> instructions.
That's because you're looking at this function in isolation. When
inlined in a larger function, those registers may well already be in
use with some others free.
>> Out of interest, what does gcc do when left to its own devices?
>
> You mean what the output asm code looks like without the inline asm?
> In this situation GCC uses the slow _muldi3.
Oh...
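For reference, here is a plain C sketch of how a signed 32x32->64 multiply can be built from 16x16->32 partial products. I am assuming this is roughly the structure of the asm in the patch, but the asm body is not quoted here, so treat it as a guess at the technique and not a copy of it. The function name mul64_partials and the temporaries t1..t4 are illustrative only.

```c
#include <stdint.h>

/* Sketch only: signed 32x32->64 multiply from four 16x16->32 products. */
static int64_t mul64_partials(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t al = ua & 0xffff, ah = ua >> 16;
    uint32_t bl = ub & 0xffff, bh = ub >> 16;

    /* Each product of two 16-bit values fits in 32 bits. */
    uint32_t t1 = al * bl;   /* low  x low  */
    uint32_t t2 = ah * bl;   /* high x low  */
    uint32_t t3 = al * bh;   /* low  x high */
    uint32_t t4 = ah * bh;   /* high x high */

    /* Unsigned 64-bit product of ua and ub. */
    uint64_t r = ((uint64_t)t4 << 32)
               + ((uint64_t)t2 << 16)
               + ((uint64_t)t3 << 16)
               + t1;

    /* Correct for the sign bits: a negative input was read as value + 2^32. */
    if (a < 0) r -= (uint64_t)ub << 32;
    if (b < 0) r -= (uint64_t)ua << 32;

    /* Relies on two's complement conversion, as GCC provides. */
    return (int64_t)r;
}
```

Written this way the four temporaries are ordinary variables, so the compiler is free to place them in whatever registers happen to be available at the call site.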
>> > #define MULL(a,b,s) (MUL64(a, b) >> s)
>>
>> Can gcc really be trusted with this?
>
> inline int MULL(int a, int b, unsigned s){
> return MUL64(a,b)>>s;
> }
>
> Here is output from asm-optimized function:
>
> #NO_APP
> [...]
> #NO_APP
> lea (-32,a0),a1
> tst.l a1
> jlt L2
> move.l a1,d1
> asr.l d1,d0
> movem.l (sp)+,#60
> rts
> L2:
> move.l d0,d2
> add.l d2,d2
> moveq #31,d0
> sub.l a0,d0
> lsl.l d0,d2
> move.l d1,d0
> move.l a0,d3
> lsr.l d3,d0
> or.l d2,d0
That's quite a lot for a right shift. We also happen to know the
shift is always a constant and less than 32. GCC will of course
theoretically have this information when the function is inlined, so
we should be looking at code generated by such a call, not this
function compiled standalone.
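Something along these lines would be a fairer test case. It is only a sketch: MUL64 is written here in plain C in place of the asm version, and EXAMPLE_SHIFT and scale_fixed are made-up names for a caller that passes a literal shift below 32.

```c
#include <stdint.h>

/* Plain C stand-in for the asm MUL64; swap in the asm version to compare. */
static inline int64_t MUL64(int a, int b)
{
    return (int64_t)a * (int64_t)b;
}

static inline int MULL(int a, int b, unsigned s)
{
    /* Right shift of a negative value is arithmetic with GCC. */
    return (int)(MUL64(a, b) >> s);
}

/* Hypothetical caller: the shift is a compile-time constant below 32. */
#define EXAMPLE_SHIFT 16

int scale_fixed(int x, int coeff)
{
    return MULL(x, coeff, EXAMPLE_SHIFT);
}
```

Compile it with gcc -O2 -S and look at scale_fixed. Once MULL is inlined the shift count is known, so the compare-and-branch on the count seen in the standalone output should have no reason to appear.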
--
Måns Rullgård
mans at mansr.com