[FFmpeg-devel] [PATCH] avcodec: use av_mod_intp2() where possible
James Almer
jamrial at gmail.com
Tue Apr 21 23:51:52 CEST 2015
On 21/04/15 6:38 PM, Michael Niedermayer wrote:
> On Tue, Apr 21, 2015 at 05:45:25PM -0300, James Almer wrote:
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>> libavcodec/acelp_vectors.c | 8 +++-----
>> libavcodec/adpcm.c | 2 +-
>> libavcodec/alacenc.c | 3 +--
>> libavcodec/atrac3plus.c | 4 ++--
>> libavcodec/dnxhdenc.c | 2 +-
>> libavcodec/dvenc.c | 2 +-
>> libavcodec/ffv1.h | 2 +-
>> libavcodec/ffv1dec.c | 3 +--
>> libavcodec/g726.c | 2 +-
>> libavcodec/g729dec.c | 2 +-
>> libavcodec/golomb.h | 2 +-
>> libavcodec/h264.c | 5 ++---
>> libavcodec/h264_refs.c | 2 +-
>> libavcodec/hevc.c | 26 ++++++++++++--------------
>> libavcodec/hevc_cabac.c | 8 ++++----
>> libavcodec/hevc_filter.c | 15 ++++++---------
>> libavcodec/hevc_mvs.c | 4 ++--
>> libavcodec/hevc_ps.c | 4 ++--
>> libavcodec/hevc_refs.c | 5 ++---
>> libavcodec/hevcpred_template.c | 4 ++--
>> libavcodec/mpeg12enc.c | 8 ++++----
>> libavcodec/opus.h | 2 +-
>> libavcodec/opus_celt.c | 9 ++++-----
>> libavcodec/proresenc_kostya.c | 6 ++----
>> libavcodec/put_bits.h | 2 +-
>> libavcodec/vorbisdec.c | 5 ++---
>> 26 files changed, 61 insertions(+), 76 deletions(-)
>>
>> diff --git a/libavcodec/acelp_vectors.c b/libavcodec/acelp_vectors.c
>> index 86851a3..7aef8c7 100644
>> --- a/libavcodec/acelp_vectors.c
>> +++ b/libavcodec/acelp_vectors.c
>> @@ -133,12 +133,11 @@ void ff_acelp_fc_pulse_per_track(
>> int pulse_count,
>> int bits)
>> {
>> - int mask = (1 << bits) - 1;
>> int i;
>>
>> for(i=0; i<pulse_count; i++)
>> {
>> - fc_v[i + tab1[pulse_indexes & mask]] +=
>> + fc_v[i + tab1[av_mod_uintp2(pulse_indexes, bits)]] +=
>> (pulse_signs & 1) ? 8191 : -8192; // +/-1 in (2.13)
>>
>> pulse_indexes >>= bits;
>> @@ -154,13 +153,12 @@ void ff_decode_10_pulses_35bits(const int16_t *fixed_index,
>> int half_pulse_count, int bits)
>> {
>> int i;
>> - int mask = (1 << bits) - 1;
>>
>> fixed_sparse->no_repeat_mask = 0;
>> fixed_sparse->n = 2 * half_pulse_count;
>> for (i = 0; i < half_pulse_count; i++) {
>> - const int pos1 = gray_decode[fixed_index[2*i+1] & mask] + i;
>> - const int pos2 = gray_decode[fixed_index[2*i ] & mask] + i;
>> + const int pos1 = gray_decode[av_mod_uintp2(fixed_index[2*i+1], bits)] + i;
>> + const int pos2 = gray_decode[av_mod_uintp2(fixed_index[2*i ], bits)] + i;
>> const float sign = (fixed_index[2*i+1] & (1 << bits)) ? -1.0 : 1.0;
>> fixed_sparse->x[2*i+1] = pos1;
>> fixed_sparse->x[2*i ] = pos2;
> [...]
>> diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h
>> index 5081397..bfc4d71 100644
>> --- a/libavcodec/ffv1.h
>> +++ b/libavcodec/ffv1.h
>> @@ -143,7 +143,7 @@ static av_always_inline int fold(int diff, int bits)
>> diff = (int8_t)diff;
>> else {
>> diff += 1 << (bits - 1);
>> - diff &= (1 << bits) - 1;
>> + diff = av_mod_uintp2(diff, bits);
>> diff -= 1 << (bits - 1);
>> }
>>
>
> iam not sure some of these changes are are a good idea
> for example above the mask has to be calculated anyway what can be
> gained from av_mod_uintp2() use here ?
> i think this should be bechmarked when its in performance critical code
"1 << (bits - 1)" is not the same as "(1 << bits) - 1". The latter is not going to be
calculated if the arch optimized version of av_mod_uintp2 is used.
The one file where i don't know if there's going to be any gain is in one of the opus_celt.c
changes, where the mask needs to be calculated even if arch optimized av_mod_uintp2 is used.
That one can be removed, but every other change is pretty straightforward.
More information about the ffmpeg-devel
mailing list