[FFmpeg-devel] [PATCH] split-radix FFT
Michael Niedermayer
Tue Jul 29 20:45:53 CEST 2008
On Tue, Jul 29, 2008 at 07:39:25PM +0100, Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
>
> > On Tue, Jul 29, 2008 at 05:20:15PM +0100, Måns Rullgård wrote:
> >>
> >> Michael Niedermayer wrote:
> >> > On Tue, Jul 29, 2008 at 06:26:49PM +0300, Uoti Urpala wrote:
> >> >> On Tue, 2008-07-29 at 17:10 +0200, Michael Niedermayer wrote:
> >> >> > And just to clarify, yes, what I considered a good argument was the
> >> >> > sentence above where my reply is. That is, to use MANGLE in
> >> >> > speed-critical code.
> >> >> > That way most textrels are avoided while minimizing the speed impact.
> >> >> >
> >> >> > I do not think you ever argued for that.
> >> >>
> >> >> IIRC I did mention the possibility of omitting -fPIC for a subset of
> >> >> files.
> >> >>
> >> >> > I remember you strongly arguing
> >> >> > toward replacing all MANGLE by "m" knowing that it would break gcc 2.95
> >> >> > and not really caring that it would slow down code compiled with -fPIC.
> >> >>
> >> >> Of course the code would be slower on x86. If you want it to be as fast
> >> >> as possible then compile it with -fPIC on x86. I don't think it's
> >> >> worthwhile to pick only the globals used inside asm for such special
> >> >> treatment.
> >> >
> >> > x86-64 shared libs require -fPIC, unless that has been fixed.
> >>
> >> The x86-64 instruction set hasn't been "fixed", and I doubt it ever
> >> will be. You simply can't fit a 64-bit offset in a 32-bit immediate
> >> operand.
> >
> > That's not what I meant.
> >
> >>
> >> > so the user does not always have the option to omit -fPIC
> >>
> >> But in these cases, forcing a textrel will break the build.
> >
> > MANGLE forces rip relative addressing on x86-64 and thus avoids the
> > occasional GOT indirection gcc adds.
> >
> > Here's an example:
> > long globivar;
> >
> > void func(){
> >     asm(
> >         "mov globivar(%rip), %rax\n\t"
> >     );
> >     asm(
> >         "mov %0, %%rax\n\t"
> >         :: "m"(globivar)
> >     );
> > }
> >
> > results in:
> > 0000000000000554 <func>:
> >  554:   55                      push   %rbp
> >  555:   48 89 e5                mov    %rsp,%rbp
> >  558:   48 8b 05 d1 02 20 00    mov    0x2002d1(%rip),%rax        # 200830 <globivar>
> >  55f:   48 8b 05 8a 02 20 00    mov    0x20028a(%rip),%rax        # 2007f0 <_DYNAMIC+0x1b8>
> >  566:   48 8b 00                mov    (%rax),%rax
> >  569:   c9                      leaveq
> >  56a:   c3                      retq
> >
> > You can see that the second asm needs two instructions (a load of the
> > address from the GOT, then a dereference), while the first needs just one.
>
> There is no guarantee that &globivar is reachable with a 32-bit offset
> from %rip (or any other register).
libavcodec is still smaller than 4 GB, so this works fine for references within
the library, and that's the only case we really care about. I do not think any
of our asm() accesses globals from outside the library, and if it does, that's a
separate case that can use "m".
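
For illustration only, the two access styles and the reachability concern can
be sketched as below. This is a minimal sketch, not the FFmpeg code: the macro
MANGLE_SKETCH and the helpers load_rip_relative, load_m_constraint and
fits_rel32 are hypothetical names, and the real MANGLE macro does more (for
example symbol prefix handling). It assumes GCC or Clang on x86-64 ELF, built
as an ordinary executable, and falls back to plain C elsewhere.

```c
#include <stdint.h>

/* Hypothetical global standing in for a table defined inside the library. */
long globivar = 42;

/* Hypothetical stand-in for MANGLE(): paste the symbol name into the asm
 * string with an explicit RIP-relative addressing mode. */
#define MANGLE_SKETCH(sym) #sym "(%%rip)"

#if defined(__x86_64__) && defined(__GNUC__) && defined(__ELF__)
/* Symbol named directly in the asm text: always one RIP-relative load. */
long load_rip_relative(void)
{
    long v;
    __asm__ volatile("mov " MANGLE_SKETCH(globivar) ", %0"
                     : "=r"(v) : : "memory");
    return v;
}

/* "m" operand: the compiler chooses the addressing mode, and under -fPIC
 * it may first load the address from the GOT. */
long load_m_constraint(void)
{
    long v;
    __asm__ volatile("mov %1, %0" : "=r"(v) : "m"(globivar));
    return v;
}
#else
/* Portable fallbacks so the sketch still builds on other targets. */
long load_rip_relative(void) { return globivar; }
long load_m_constraint(void) { return globivar; }
#endif

/* Rough check of whether `to` is within signed 32-bit reach of `from`.
 * The real displacement is measured from the end of the instruction, so
 * this is only an approximation. */
int fits_rel32(const void *from, const void *to)
{
    intptr_t d = (intptr_t)to - (intptr_t)from;
    return d >= INT32_MIN && d <= INT32_MAX;
}
```

Both loaders return the same value; what differs is the code generated for
them under -fPIC, which can be compared with objdump -d. Passing the address
of a function and the address of a global to fits_rel32 gives a quick sanity
check of the "within the library" assumption.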
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus