[FFmpeg-devel] [HACK] 50% faster H.264 decoding
Vitor Sessak
vitor1001
Mon Aug 23 12:18:47 CEST 2010
Luca Barbato wrote:
> On 08/23/2010 07:39 AM, Jason Garrett-Glaser wrote:
>> How in the world would intrinsics solve that problem? There is no
>> compiler in the world that will magically rewrite your algorithm to
>> use completely different instructions on a given architecture.
>
> You described what a compiler generally does...
>
> you use c = a + b; not c = arch_specific_add_inst(a, b);
>
> Generic vector intrinsics do exist (and yes, they do suck right now)
What is the point of it? I mean, compare these two code snippets:
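void sum(float *out, const float *in, int size)
{
    /* each vector float holds 4 floats, so loop size/4 times */
    vector float *o = (vector float *)out;
    const vector float *i = (const vector float *)in;
    size >>= 2;
    while (size--)
        *o++ += *i++;
}

void sum(float *out, const float *in, int size)
{
    /* pointers must be cast to an integer type before masking */
    assert(!((uintptr_t)in & 15));
    assert(!((uintptr_t)out & 15));
    assert(!(size & 3));
    while (size--)
        *out++ += *in++;
}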
Why can't the compiler generate exactly the same ASM for both?
-Vitor
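
One practical obstacle for the second snippet is that assert() is compiled out when NDEBUG is defined, so in release builds the optimizer never sees the alignment and size guarantees, and it also cannot tell that out and in do not overlap. A minimal sketch of how those assumptions could be stated to the compiler directly, assuming GCC or Clang (__builtin_assume_aligned is a compiler builtin, not standard C) and C99 restrict; the name sum_aligned is hypothetical and not from this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant of the scalar sum() above; not code from the thread. */
void sum_aligned(float *restrict out, const float *restrict in, int size)
{
    /* Debug-build checks only: these vanish when NDEBUG is defined. */
    assert(((uintptr_t)in & 15) == 0);
    assert(((uintptr_t)out & 15) == 0);
    assert((size & 3) == 0);

    /* GCC/Clang builtin: promises 16-byte alignment to the optimizer. */
    float *o = __builtin_assume_aligned(out, 16);
    const float *i = __builtin_assume_aligned(in, 16);

    /* restrict on the parameters promises that out and in do not overlap. */
    for (int n = 0; n < size; n++)
        o[n] += i[n];
}
```

Whether a given compiler then emits the same code as the hand-vectorized version still depends on the compiler version, target, and optimization flags; the sketch only removes the alignment and aliasing unknowns.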