[Ffmpeg-devel] Re: [PATCH] Machine endian bytestream functions
Ramiro Ribeiro Polla
ramiro
Fri Apr 13 23:41:07 CEST 2007
Hello,
Ramiro Polla wrote:
> Hello,
>
> Michael Niedermayer wrote:
>> Hi
>>
>> On Sat, Mar 10, 2007 at 05:15:44PM -0300, Ramiro Polla wrote:
>>
>>> Hello,
>>>
>>> Reimar Döffinger wrote:
>>>
>>>> Hello,
>>>> On Sat, Mar 10, 2007 at 11:06:41PM -0300, ramiro at lisha.ufsc.br wrote:
>>>>
>>>>
>>>>> Attached patch makes the AV_{R,W}{L,B}xx macros use machine
>>>>> endianness for the simple 16 and 32 bit types. Those macros are
>>>>> then #ifdef'd for the correct endianness. 24 bit remains the same,
>>>>> as it would be more complex.
>>>>>
>>>> They completely ignore alignment issues...
>>>>
>>>>
>>>>
>>> You're right.
>>>
>>> Attached patch makes use of machine endianness where unaligned data
>>> accesses are possible and faster than what gcc is currently doing.
>>>
>>> I have only tested this on a p4, but the following program should
>>> detect this on any architecture. Compile bytes.c and main.c with the
>>> same options FFmpeg gives to libavcodec files, link them, and test
>>> the speed both for patched and unpatched FFmpeg. bytes.c should be
>>> changed to 'be' on big-endian architectures.
>>>
>>> Regression tests pass.
>>>
>>> Ramiro Polla
>>>
>>
>>
>>> Index: configure
>>> ===================================================================
>>> --- configure (revision 8316)
>>> +++ configure (working copy)
>>> @@ -602,6 +602,7 @@
>>> dlopen
>>> fast_64bit
>>> fast_cmov
>>> + fast_unaligned
>>> freetype2
>>> imlib2
>>> inet_aton
>>> @@ -737,6 +738,7 @@
>>> mmx="default"
>>> cmov="no"
>>> fast_cmov="no"
>>> +fast_unaligned="no"
>>> armv5te="default"
>>> armv6="default"
>>> iwmmxt="default"
>>> @@ -951,9 +953,11 @@
>>> case "$arch" in
>>> i386|i486|i586|i686|i86pc|BePC)
>>> arch="x86_32"
>>> + enable fast_unaligned
>>> ;;
>>> x86_64|amd64)
>>> arch="x86_32"
>>> + enable fast_unaligned
>>> canon_arch="`$cc -dumpmachine | sed -e 's,\([^-]*\)-.*,\1,'`"
>>> if [ x"$canon_arch" = x"x86_64" -o x"$canon_arch" = x"amd64" ];
>>> then
>>> if [ -z "`echo $CFLAGS | grep -- -m32`" ]; then
>>>
>>
>> maybe configure should rather have a generic test which checks which
>> version is faster? (it would be much easier to maintain instead of
>> keeping track what is faster for which cpu ...)
>>
>>
>>
>
> Sorry, but I failed to find a simple way for this in configure. Three
> issues came up:
> 1. Unaligned data accesses will crash on some processors, and I don't
> think it's a good idea to have configure throw exceptions. (e.g. it
> would open the "Send report" dialog on Windows).
> 2. Checking the speed for a given amount of time would slow down configure.
> 3. Different cpu loads during the configure script would cause
> unreliable results.
>
> So, for the moment, I'm sending the patch with the same check. (It's
> just like the fast_64bit or fast_cmov check).
>
>
> cosmetics.diff reorders definitions from endianess to bit-depth.
> functional.diff makes special cases for fast_unaligned.
>
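For readers without the attachment, here is a rough sketch of the kind of special case meant above. It is not the code from functional.diff. The names rl32_bytewise, rn32, SKETCH_RL32, SKETCH_FAST_UNALIGNED and SKETCH_BIGENDIAN are made up for illustration, and I use memcpy for the native read so the sketch stays well defined in C.

```c
#include <stdint.h>
#include <string.h>

/* Portable path: assemble a little-endian 32-bit value one byte at a
 * time. Works for any alignment and any host byte order. */
static uint32_t rl32_bytewise(const uint8_t *p)
{
    return  (uint32_t)p[0]        |
           ((uint32_t)p[1] <<  8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Native path: fetch four bytes in host byte order. memcpy keeps this
 * well defined; compilers typically reduce it to a single load on
 * targets where unaligned loads are cheap. */
static uint32_t rn32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* SKETCH_FAST_UNALIGNED and SKETCH_BIGENDIAN are hypothetical stand-ins
 * for whatever configure would define. */
#if defined(SKETCH_FAST_UNALIGNED) && !defined(SKETCH_BIGENDIAN)
#define SKETCH_RL32(p) rn32(p)          /* host order is already little-endian */
#else
#define SKETCH_RL32(p) rl32_bytewise(p) /* safe fallback for everything else */
#endif
```

Both paths must return the same value for a little-endian read on a little-endian host; only the generated code differs.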
functional.diff changed configure to use the new check_exec_crash
function. The check should detect whether an unaligned data access runs
without crashing and whether it returns the correct, non-rotated value
(as I read in [1]).
It depends on cosmetics.diff from the previous message.
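
To illustrate the kind of check described above, here is a minimal sketch. It is not the test from the attached patch; the function name unaligned_load_matches is made up, and I am assuming the configure helper treats a crash or a nonzero exit status as failure. The misaligned pointer cast is deliberately outside what ISO C guarantees, since provoking the hardware behaviour is the point of the probe.

```c
#include <stdint.h>

/* Hypothetical probe: returns 1 if a misaligned 32-bit load yields the
 * same value as assembling the bytes by hand in host order, 0 if the
 * value comes back different (e.g. rotated). On CPUs that trap on
 * unaligned access the process is expected to crash instead. */
static int unaligned_load_matches(void)
{
    /* The union forces 4-byte alignment of b[0], so b + 1 is misaligned. */
    union {
        uint32_t force_align;
        uint8_t  b[8];
    } buf;
    int i;

    for (i = 0; i < 8; i++)
        buf.b[i] = (uint8_t)(0x11 * (i + 1));   /* 0x11, 0x22, ..., 0x88 */

    /* Determine host byte order using an ordinary aligned object. */
    const uint32_t one = 1;
    const int little = *(const uint8_t *)&one == 1;

    /* Value an unrotated load of b[1..4] must produce in host order. */
    const uint32_t expected = little
        ? ((uint32_t)buf.b[1]       | (uint32_t)buf.b[2] <<  8 |
           (uint32_t)buf.b[3] << 16 | (uint32_t)buf.b[4] << 24)
        : ((uint32_t)buf.b[1] << 24 | (uint32_t)buf.b[2] << 16 |
           (uint32_t)buf.b[3] <<  8 | (uint32_t)buf.b[4]);

    /* The deliberately misaligned load; volatile keeps the compiler
     * from folding it away or splitting it into byte accesses. */
    const volatile uint32_t *p = (const volatile uint32_t *)(buf.b + 1);
    return *p == expected;
}
```

A probe program built around this would return !unaligned_load_matches() from main, so that exit status 0 means fast_unaligned can be enabled.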
Ramiro Polla
[1] http://www.arm.com/support/faqdev/1469.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: functional_3.diff
Type: text/x-patch
Size: 2704 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070413/1d832efa/attachment.bin>