[FFmpeg-devel] [PATCH][RFC] Lagarith Decoder.

Fri Sep 18 13:20:08 CEST 2009

On Fri, 18 Sep 2009, Loren Merritt wrote:

> On Fri, 18 Sep 2009, Vitor Sessak wrote:
>> Nathan Caldwell wrote:
>> 
>>> Ok, so the values that cause this are:
>>> rac->prob[i] = 0x6eb
>>> scale_factor = 16
>>> cumul_prob = 0xfd00
>> 
>> So I suppose the attached file reproduces your problem, no? Does using
>> func2() in the decoder make it bit-exact? (No, using it is not OK, but it
>> helps debugging.)
>
> No. The Lagarith reference source code says "double", and doesn't even
> attempt portability, so when compiled on x86_32 (the target of the only
> official binary) it uses 80-bit long double.

Huh, I guess I misinterpreted the disassembly. double works, long double 
doesn't.

And here's a portable implementation of something that's hopefully what we
want.
softfloat_reciprocal alone exactly matches SSE but not x87 (even with
-ffloat-store), and I don't know what the difference is. I failed to find a
set of integer inputs for which that difference affects the final prob[].
(But that was only a unit test; in the decoder itself this has not been
tested beyond the one case I reported before.)
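
For anyone who wants to poke at this outside the decoder, a standalone
sketch along the following lines reproduces the reported case
(prob = 0x6eb, cumul_prob = 0xfd00, scale_factor = 16). Assumptions:
ilog2() is a hypothetical stand-in for av_log2(); rescale_int() and
rescale_double() are illustrative names, not decoder functions; and
rescale_double() is only a guess at how the reference scales (multiplying
by a double-precision 2^scale/total), so treat it as a comparison aid
rather than the reference algorithm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for av_log2(): index of the highest set bit,
 * 0 when v == 0. */
static int ilog2(uint64_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Same arithmetic as the patch: round(2^(52+shift) / denom), i.e. the
 * significand of 1/(double)denom for a denom that is not a power of 2. */
static uint64_t softfloat_reciprocal(uint32_t denom)
{
    int shift = ilog2(denom - 1) + 1;
    uint64_t ret = (1ULL << 52) / denom;
    uint64_t err = (1ULL << 52) - ret * denom;
    ret <<= shift;
    err <<= shift;
    err += denom / 2;
    return ret + err / denom;
}

/* Same arithmetic as the patch: round the 85-bit product to 53 bits,
 * then truncate to an integer. 1ULL keeps the shift out of int range. */
static uint32_t softfloat_mul(uint32_t x, uint64_t mantissa)
{
    uint64_t l = x * (mantissa & 0xffffffff);
    uint64_t h = x * (mantissa >> 32);
    h += l >> 32;
    l &= 0xffffffff;
    l += 1ULL << ilog2(h >> 21);
    h += l >> 32;
    return (uint32_t)(h >> 20);
}

/* The pure-integer rescale that the patch replaces. */
static uint32_t rescale_int(uint32_t x, uint32_t denom, int shift)
{
    return (uint32_t)(((uint64_t)x << shift) / denom);
}

/* Assumed shape of the floating-point rescale; volatile forces each
 * intermediate to be stored as a 64-bit double. */
static uint32_t rescale_double(uint32_t x, uint32_t denom, int shift)
{
    volatile double f = (double)(1ULL << shift) / (double)denom;
    volatile double p = (double)x * f;
    return (uint32_t)p;
}

/* The reported case: the integer path lands exactly on 1792, while the
 * rounded reciprocal falls just short of it and truncates to 1791. */
static void check_reported_case(void)
{
    assert(rescale_int(0x6eb, 0xfd00, 16) == 1792);
    assert(softfloat_mul(0x6eb, softfloat_reciprocal(0xfd00)) == 1791);
    assert(rescale_double(0x6eb, 0xfd00, 16) == 1791);
}
```

Calling check_reported_case() from a small main() is enough to see the
off-by-one between the integer and double-style paths; looping over other
(x, denom) pairs and comparing softfloat_mul() against rescale_double()
would be the obvious next step.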

--Loren Merritt
-------------- next part --------------

diff --git a/libavcodec/lagarith.c b/libavcodec/lagarith.c
index 74ef093..fdd10f4 100644
--- a/libavcodec/lagarith.c
+++ b/libavcodec/lagarith.c
@@ -48,6 +48,30 @@ static av_cold int lag_decode_init(AVCodecContext *avctx)
     return 0;
 }
 
+/* compute the 52-bit mantissa (plus the implicit leading 1) of 1/(double)denom */
+static uint64_t softfloat_reciprocal(uint32_t denom)
+{
+    int shift = av_log2(denom-1)+1;
+    uint64_t ret = (1ULL<<52) / denom;
+    uint64_t err = (1ULL<<52) - ret*denom;
+    ret <<= shift;
+    err <<= shift;
+    err += denom/2;
+    return ret + err/denom;
+}
+
+/* (uint32_t)(x*f), where f has the given mantissa, and exponent 0 */
+static uint32_t softfloat_mul(uint32_t x, uint64_t mantissa)
+{
+    uint64_t l = x*(mantissa&0xffffffff);
+    uint64_t h = x*(mantissa>>32);
+    h += l>>32;
+    l &= 0xffffffff;
+    l += 1ULL<<av_log2(h>>21);
+    h += l>>32;
+    return h>>20;
+}
+
 static void lag_memset(uint8_t *s, uint8_t c, size_t n, int step)
 {
     int i;
@@ -143,13 +167,13 @@ static int lag_read_prob_header(lag_rac *rac, GetBitContext *gb)
     scale_factor = av_log2(cumul_prob);
 
     if (cumul_prob & (cumul_prob - 1)) {
-        scale_factor++;
+        uint64_t mul = softfloat_reciprocal(cumul_prob);
         for (i = 1; i < 257; i++) {
-            rac->prob[i] =
-                ((uint64_t) rac->prob[i] << scale_factor) / cumul_prob;
+            rac->prob[i] = softfloat_mul(rac->prob[i], mul);
             scaled_cumul_prob += rac->prob[i];
         }
 
+        scale_factor++;
         cumulative_target = 1 << scale_factor;
 
         if (scaled_cumul_prob > cumulative_target) {