[FFmpeg-devel] [PATCH/RFC] intreadwrite.h rewrite
Luca Barbato
lu_zero
Mon Apr 6 09:20:53 CEST 2009
Måns Rullgård wrote:
> I would like to propose a rework of intreadwrite.h. This new version
> supports per-arch implementations of the various macros allowing us to
> take advantage of special instructions or other properties the
> compiler does not know about.
>
> ARMv6 and later support unaligned loads and stores for single
> word/halfword but not double/multiple. GCC is ignorant of this and
> will always use bytewise accesses for unaligned data. Casting to an
> int32_t pointer is dangerous since a load/store double or multiple
> instruction might be used (this happens with some code in FFmpeg).
> Implementing the AV_[RW]* macros with inline asm using only supported
> instructions gives fast and safe unaligned accesses. This gives an
> overall speedup of up to 10% in some cases.
>
> PPC is normally big endian but has special little endian load/store
> instructions. Using these avoids a separate byteswap. This makes the
> vorbis decoder about 5% faster. Not much else uses little-endian
> read/write extensively. GCC generates horrible PPC code for the
> default AV_[RW]B64 (which uses a packed struct), so I have overridden
> it with a plain pointer cast.
>
> For other architectures the definitions of these macros should remain
> unchanged.
>
> I'm attaching the complete files instead of a diff since this is
> largely a rewrite. Sorry for attaching three files with the same
> name; I'm sure you can work out which is which.
Nice. I assume HAVE_LDBRX comes from a configure check, doesn't it?
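
To make the question concrete, here is a rough sketch of how a
configure-provided HAVE_LDBRX could gate a per-arch override while
every other target keeps a generic definition. This is not the code
from the attached files: the helper names ppc_rl64 and generic_rl64,
the exact preprocessor conditions, and the inline asm are assumptions
for illustration, and the PPC branch is untested here.

```c
#include <stdint.h>

/* Arch-specific part (hypothetical): used only if configure set
 * HAVE_LDBRX and we are building for big-endian 64-bit PowerPC with a
 * GCC-compatible compiler. ldbrx loads a doubleword byte-reversed,
 * so a little-endian read needs no separate byteswap. */
#if HAVE_LDBRX && defined(__powerpc64__) && defined(__GNUC__) && \
    defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
static inline uint64_t ppc_rl64(const void *p)
{
    uint64_t v;
    __asm__ ("ldbrx %0, %y1" : "=r"(v) : "Z"(*(const uint64_t *)p));
    return v;
}
#define AV_RL64(p) ppc_rl64(p)
#endif

/* Generic part: assembles the value one byte at a time, so it is safe
 * for any alignment and any host endianness. */
static inline uint64_t generic_rl64(const void *p)
{
    const uint8_t *b = p;
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | b[i];   /* b[0] ends up as the lowest byte */
    return v;
}

/* Fill in only what the arch-specific part did not already define. */
#ifndef AV_RL64
#define AV_RL64(p) generic_rl64(p)
#endif
```

With this layout, callers always write AV_RL64(ptr) and never need to
know which implementation was selected.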
lu
--
Luca Barbato
Gentoo Council Member
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero