[FFmpeg-devel] [PATCH][RFC] Lagarith Decoder.
Loren Merritt
lorenm
Fri Sep 18 13:20:08 CEST 2009
On Fri, 18 Sep 2009, Loren Merritt wrote:
> On Fri, 18 Sep 2009, Vitor Sessak wrote:
>> Nathan Caldwell wrote:
>>
>>> Ok, so the values that cause this are:
>>> rac->prob[i] = 0x6eb
>>> scale_factor = 16
>>> cumul_prob = 0xfd00
>>
>> So I suppose the attached file reproduces your problem, no? Does using
>> func2() in the decoder makes it bitexact (no, using it is not ok, but it
>> helps debugging)?
>
> No. The Lagarith reference source code says "double", and doesn't even
> attempt portability, so when compiled on x86_32 (which the only official
> binary is) it uses 80bit long double.
Huh, I guess I misinterpreted the disassembly. double works, long double
doesn't.
And here's a portable implementation of something that's hopefully what we
want.
softfloat_reciprocal alone exactly matches sse but not x87 (even with
ffloat-store), and I don't know what the difference is. I failed to find a
set of integer inputs for which that difference affects the final prob[].
(But that's just a unit test; not extensively tested in the decoder other
than on the one case I reported before.)
--Loren Merritt
-------------- next part --------------
diff --git a/libavcodec/lagarith.c b/libavcodec/lagarith.c
index 74ef093..fdd10f4 100644
--- a/libavcodec/lagarith.c
+++ b/libavcodec/lagarith.c
@@ -48,6 +48,30 @@ static av_cold int lag_decode_init(AVCodecContext *avctx)
return 0;
}
+/* compute the 52bit mantissa of 1/(double)denom */
+static uint64_t softfloat_reciprocal(uint32_t denom)
+{
+ int shift = av_log2(denom-1)+1;
+ uint64_t ret = (1ULL<<52) / denom;
+ uint64_t err = (1ULL<<52) - ret*denom;
+ ret <<= shift;
+ err <<= shift;
+ err += denom/2;
+ return ret + err/denom;
+}
+
+/* (uint32_t)(x*f), where f has the given mantissa, and exponent 0 */
+static uint32_t softfloat_mul(uint32_t x, uint64_t mantissa)
+{
+ uint64_t l = x*(mantissa&0xffffffff);
+ uint64_t h = x*(mantissa>>32);
+ h += l>>32;
+ l &= 0xffffffff;
+ l += 1<<av_log2(h>>21);
+ h += l>>32;
+ return h>>20;
+}
+
static void lag_memset(uint8_t *s, uint8_t c, size_t n, int step)
{
int i;
@@ -143,13 +167,13 @@ static int lag_read_prob_header(lag_rac *rac, GetBitContext *gb)
scale_factor = av_log2(cumul_prob);
if (cumul_prob & (cumul_prob - 1)) {
- scale_factor++;
+ uint64_t mul = softfloat_reciprocal(cumul_prob);
for (i = 1; i < 257; i++) {
- rac->prob[i] =
- ((uint64_t) rac->prob[i] << scale_factor) / cumul_prob;
+ rac->prob[i] = softfloat_mul(rac->prob[i], mul);
scaled_cumul_prob += rac->prob[i];
}
+ scale_factor++;
cumulative_target = 1 << scale_factor;
if (scaled_cumul_prob > cumulative_target) {
More information about the ffmpeg-devel
mailing list