[FFmpeg-devel] [PATCH] lavc/aacenc_utils: unroll abs_pow34_v loop
Ganesh Ajjanagadde
gajjanag at gmail.com
Tue Mar 22 19:15:25 CET 2016
On Sat, Mar 19, 2016 at 9:09 AM, Reimar Döffinger
<Reimar.Doeffinger at gmx.de> wrote:
> On Sat, Mar 19, 2016 at 12:42:09PM +0100, Clément Bœsch wrote:
>> On Fri, Mar 18, 2016 at 10:12:14PM -0700, Ganesh Ajjanagadde wrote:
>> > -static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
>> > -{
>> > -    int i;
>> > -    for (i = 0; i < size; i++) {
>> > -        float a = fabsf(in[i]);
>> > -        out[i] = sqrtf(a * sqrtf(a));
>> > -    }
>> > -}
>> > -
>> > static inline float pos_pow34(float a)
>> > {
>> >     return sqrtf(a * sqrtf(a));
>> > }
>> >
>> > +static inline void abs_pow34_v(float *av_restrict out, const float *av_restrict in, const int size)
>> > +{
>> > +    av_assert2(!(size % 4));
>> > +    for (int i = 0; i < size; i+=4) {
>> > +        float a0 = fabsf(in[i]);
>> > +        float a1 = fabsf(in[i+1]);
>> > +        float a2 = fabsf(in[i+2]);
>> > +        float a3 = fabsf(in[i+3]);
>> > +        out[i  ] = pos_pow34(a0);
>> > +        out[i+1] = pos_pow34(a1);
>> > +        out[i+2] = pos_pow34(a2);
>> > +        out[i+3] = pos_pow34(a3);
>> > +    }
>> > +}
>> > +
>>
>> I'm curious (and lazy), is GCC able to unroll by itself if you hint it
>> with a loop such as:
>>
>>     int i;
>>     for (i = 0; i < (size & ~3); i++) {
>>         float a = fabsf(in[i]);
>>         out[i] = sqrtf(a * sqrtf(a));
>>     }
Does not help, yields ~ 140 decicycles like earlier.
>
> I haven't been able to figure it out for
> sure for this one, but for the other one at least,
> Debian gcc 5.3.1 already unrolls and vectorizes
> for me, though it adds a bit of extra code to
> handle cases where size is not a multiple of 4.
I suspect the speed change in that case is coming from the removal of
such extra code, as I am running gcc 5.3.0 on Arch.
> So I suspect "which gcc?" is probably an important
> question.