[FFmpeg-devel] Input requested on floating point decomposition for AAC Main

Tue Nov 11 22:30:47 CET 2008

"Alex Converse" <alex.converse at gmail.com> writes:

> On Tue, Nov 11, 2008 at 3:26 PM, Måns Rullgård <mans at mansr.com> wrote:
>> "Alex Converse" <alex.converse at gmail.com> writes:
>>
>>> On Tue, Nov 11, 2008 at 5:21 AM, Jason Garrett-Glaser
>>> <darkshikari at gmail.com> wrote:
>>>> On Mon, Nov 10, 2008 at 10:39 PM, Alex Converse <alex.converse at gmail.com> wrote:
>>>>> To do 16-bit floating point rounding for AAC-Main, I need a function
>>>>> that will decompose a float into a normalized significand and its
>>>>> exponent. Conveniently there exists the x87 instruction FXTRACT for
>>>>> this very purpose.
>>>>
>>>> Are you sure this is a good idea?  FXTRACT takes 170 clock cycles on
>>>> Core 2, according to Agner, making it the single slowest floating
>>>> point operation in x86 history, perhaps competing with FBSTP.
>>>>
>>>
>>> Hmm, FXTRACT may be slow but it appears to not be as slow as frexpf.
>>> On my Core2 Duo I'm getting:
>>> 40092570 dezicycles in t_fxtract, 2 runs, 0 skips
>>> 142385760 dezicycles in t_frexpf, 2 runs, 0 skips
>>>
>>> for this test code:
>>> float t1() {
>>>     float j, f, g;
>>>     START_TIMER("t_fxtract");
>>>     for (j = 1.0f; j <= 1048576.0f; j++)
>>>         g = fxtract(j, &f);
>>>     STOP_TIMER("t_fxtract");
>>>     return g;
>>> }
>>>
>>> float t2() {
>>>     int i, j;
>>>     float h;
>>>     START_TIMER("t_frexpf");
>>>     for (j = 1.0f; j <= 1048576.0f; j++)
>>>         h = frexpf(j, &i);
>>>     STOP_TIMER("t_frexpf");
>>>     return h;
>>> }
>>
>> What do frexpf() and fxtract() contain?
>>
>
> frexpf() is the standard libm function, see man frexp.

Yes, but what, exactly, does yours contain?
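
For reference, a minimal sketch of the decomposition under discussion, done
with integer operations on the bit pattern and, for comparison, through
frexpf(). Both helper names are hypothetical and come from nobody's patch.
The sketch assumes IEEE 754 single precision and a normal, finite, nonzero
input; zero, denormals, infinities and NaN are not handled.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split x into a significand with magnitude in [1, 2)
 * and an unbiased exponent, the same convention FXTRACT uses.
 * Assumes IEEE 754 binary32 and a normal, finite, nonzero x. */
static float decompose_f32(float x, int *exponent)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));               /* read the bit pattern */
    *exponent = (int)((bits >> 23) & 0xFF) - 127;  /* remove the bias */
    bits = (bits & 0x807FFFFFu) | 0x3F800000u;     /* keep sign and mantissa, force exponent 0 */
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Hypothetical helper: the same split through libm. frexpf() returns a
 * fraction with magnitude in [0.5, 1), so scale it by 2 and lower the
 * exponent by 1 to match the convention above. */
static float decompose_via_frexpf(float x, int *exponent)
{
    int e;
    float frac = frexpf(x, &e);

    *exponent = e - 1;
    return frac * 2.0f;
}
```

Whether either of these beats FXTRACT would have to be measured with the
same START_TIMER/STOP_TIMER loop as above.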

-- 
Måns Rullgård
mans at mansr.com