[FFmpeg-devel] Input requested on floating point decomposition for AAC Main
Måns Rullgård
mans at mansr.com
Tue Nov 11 22:30:47 CET 2008
"Alex Converse" <alex.converse at gmail.com> writes:
> On Tue, Nov 11, 2008 at 3:26 PM, Måns Rullgård <mans at mansr.com> wrote:
>> "Alex Converse" <alex.converse at gmail.com> writes:
>>
>>> On Tue, Nov 11, 2008 at 5:21 AM, Jason Garrett-Glaser
>>> <darkshikari at gmail.com> wrote:
>>>> On Mon, Nov 10, 2008 at 10:39 PM, Alex Converse <alex.converse at gmail.com> wrote:
>>>>> To do 16-bit floating point rounding for AAC-Main, I need a function
>>>>> that will decompose a float into a normalized scheme and its
>>>>> exponent. Conveniently there exists the x87 instruction FXTRACT for
>>>>> this very purpose.
>>>>
>>>> Are you sure this is a good idea? FXTRACT takes 170 clock cycles on
>>>> Core 2, according to Agner, making it the single slowest floating
>>>> point operation in x86 history, perhaps competing with FBSTP.
>>>>
>>>
>>> Hmm, FXTRACT may be slow but it appears to not be as slow as frexpf.
>>> On my Core2 Duo I'm getting:
>>> 40092570 dezicycles in t_fxtract, 2 runs, 0 skips
>>> 142385760 dezicycles in t_frexpf, 2 runs, 0 skips
>>>
>>> for this test code:
>>> float t1() {
>>>     float j, f, g;
>>>     START_TIMER("t_fxtract");
>>>     for (j = 1.0f; j <= 1048576.0f; j++)
>>>         g = fxtract(j, &f);
>>>     STOP_TIMER("t_fxtract");
>>>     return g;
>>> }
>>>
>>> float t2() {
>>>     int i, j;
>>>     float h;
>>>     START_TIMER("t_frexpf");
>>>     for (j = 1.0f; j <= 1048576.0f; j++)
>>>         h = frexpf(j, &i);
>>>     STOP_TIMER("t_frexpf");
>>>     return h;
>>> }
>>
>> What do frexpf() and fxtract() contain?
>>
>
> frexpf() is the standard libm function, see man frexp.
Yes, but what, exactly, does yours contain?
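
For reference, the decomposition under discussion can be sketched in
portable C without either FXTRACT or a libm call. This is only an
illustration, not the code from the patch: split_float() and
split_float_frexpf() are hypothetical names, and the sketch assumes
IEEE 754 single precision and a normal (nonzero, finite, non-denormal)
input. It follows the FXTRACT convention of a significand in [1, 2)
and an unbiased exponent, whereas frexpf() returns a mantissa in
[0.5, 1), so the two differ by one in the exponent.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): split x into a significand
 * in [1, 2) and an unbiased exponent, the convention FXTRACT uses.
 * Assumes IEEE 754 binary32 and a normal input. */
static float split_float(float x, int *exponent)
{
    uint32_t bits;

    assert(isnormal(x));
    memcpy(&bits, &x, sizeof bits);             /* type-pun without UB */
    *exponent = (int)((bits >> 23) & 0xff) - 127;
    /* keep sign and mantissa, force the biased exponent to 127 (2^0) */
    bits = (bits & 0x807fffffu) | 0x3f800000u;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* The same result via frexpf(), which yields m in [0.5, 1) with
 * x = m * 2^e, so rescale by one power of two. */
static float split_float_frexpf(float x, int *exponent)
{
    int e;
    float m = frexpf(x, &e);

    *exponent = e - 1;
    return 2.0f * m;
}
```

With either helper, 10.0f decomposes into significand 1.25f and
exponent 3. Timing both against the library call would show how much
of the frexpf() cost is call overhead and special-case handling.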
--
Måns Rullgård
mans at mansr.com