[FFmpeg-devel] [PATCH] remove useless mov in cabac
Alexander Strange
astrange
Mon Aug 10 22:14:45 CEST 2009
On Aug 9, 2009, at 4:51 AM, Reimar Döffinger wrote:
> On Sun, Aug 09, 2009 at 02:16:44AM -0400, Alexander Strange wrote:
>> As in subject.
>>
>> The PTR_SUF isn't actually necessary for gcc; just using "add%z1"
>> instead of "add" is fine. But llvm doesn't implement %z and spits out
>> a rather long and angry error message, so I didn't want to break
>> FATE.
>
> Benchmark numbers? Your code adds an extra (L1 cache) memory read.
The extra read is easily absorbed since there are no dependencies on it.
Actually, on Core 2 an L1 cache read can be faster than a register read
if you run out of register read ports.
On Merom:
Before:
1390 dezicycles in get_cabac_noinline, 32764 runs, 4 skips
1191 dezicycles in get_cabac_noinline, 65528 runs, 8 skips
1081 dezicycles in get_cabac_noinline, 131061 runs, 11 skips
992 dezicycles in get_cabac_noinline, 262129 runs, 15 skips
932 dezicycles in get_cabac_noinline, 524270 runs, 18 skips
889 dezicycles in get_cabac_noinline, 1048542 runs, 34 skips
863 dezicycles in get_cabac_noinline, 2097086 runs, 66 skips
843 dezicycles in get_cabac_noinline, 4194185 runs, 119 skips
After:
1299 dezicycles in get_cabac_noinline, 32764 runs, 4 skips
1073 dezicycles in get_cabac_noinline, 65529 runs, 7 skips
1025 dezicycles in get_cabac_noinline, 131061 runs, 11 skips
969 dezicycles in get_cabac_noinline, 262123 runs, 21 skips
923 dezicycles in get_cabac_noinline, 524253 runs, 35 skips
887 dezicycles in get_cabac_noinline, 1048508 runs, 68 skips
864 dezicycles in get_cabac_noinline, 2097015 runs, 137 skips
845 dezicycles in get_cabac_noinline, 4194074 runs, 230 skips
The last number is some weirdness in skip detection.
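To make the before/after comparison easier to read, here is a minimal sketch that computes the relative change for each pair of lines. The numbers are copied from the output above; the names percent_change, before_dezicycles, after_dezicycles and print_comparison are made up for this illustration and are not part of FFmpeg. It assumes the lines can be paired by position, since the run counts are nearly identical.

```c
#include <stddef.h>
#include <stdio.h>

/* Averages in dezicycles (tenths of a cycle), copied from the runs above. */
static const int before_dezicycles[] = {1390, 1191, 1081, 992, 932, 889, 863, 843};
static const int after_dezicycles[]  = {1299, 1073, 1025, 969, 923, 887, 864, 845};

/* Percent change from before to after; negative means "after" is faster. */
static double percent_change(int before, int after)
{
    return 100.0 * (after - before) / before;
}

/* Print one line per pair of measurements, converted to cycles. */
void print_comparison(void)
{
    size_t n = sizeof(before_dezicycles) / sizeof(before_dezicycles[0]);
    for (size_t i = 0; i < n; i++) {
        printf("%6.1f -> %6.1f cycles (%+.2f%%)\n",
               before_dezicycles[i] / 10.0,
               after_dezicycles[i] / 10.0,
               percent_change(before_dezicycles[i], after_dezicycles[i]));
    }
}
```

Calling print_comparison() from main shows the gap shrinking as the run count grows: the first pair differs by several percent, while the last two pairs differ by well under one percent in either direction.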