[FFmpeg-devel] [PATCH] lavu/lfg: add 64 bit random number generator

Tue Mar 15 01:42:32 CET 2016

On Sun, Mar 13, 2016 at 11:08 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Sun, Mar 13, 2016 at 07:12:50PM -0400, Ganesh Ajjanagadde wrote:
>> This is based on the relatively well known xorshift128+ of Sebastiano
>> Vigna (https://en.wikipedia.org/wiki/Xorshift) that performs very well on the
>> BigCrush suite, is very efficient, and is thus used by a number of
>> clients: http://xorshift.di.unimi.it/ (see Introduction).
>>
>> Moreover, the implementation is in the public domain:
>> http://xorshift.di.unimi.it/xorshift128plus.c.
>>
>> Concretely, it is nearly as fast as av_lfg_get (which only returns 32 bits),
>> and has a much smaller cache (128 bits). Thus, the timings should also
>> be more stable.
>>
>> This is needed because av_lfg_get<<32 | av_lfg_get is far slower, and
>> likely less random as measured by BigCrush - most LFG's perform
>> quite poorly on the BigCrush suite:
>> http://www6.tw.freebsd.org/distfiles/testu01.pdf.
>> In particular, FFmpeg's seems to be Ran055 in the paper, see pg31.
>>
>> Sample benchmark (Haswell, GCC + -march=native):
>>   23200 decicycles in 624 calls of av_lfg_get,       1 runs,      0 skips
>>   23040 decicycles in 624 calls of av_lfg_get,       2 runs,      0 skips
>>   22810 decicycles in 624 calls of av_lfg_get,       4 runs,      0 skips
>> [...]
>>   19884 decicycles in 624 calls of av_lfg_get,   65532 runs,      4 skips
>>   19880 decicycles in 624 calls of av_lfg_get,  131067 runs,      5 skips
>>   19884 decicycles in 624 calls of av_lfg_get,  262136 runs,      8 skips
>>   19877 decicycles in 624 calls of av_lfg_get,  524276 runs,     12 skips
>>
>>   25380 decicycles in 624 calls of av_rand64_get,       1 runs,      0 skips
>>   28560 decicycles in 624 calls of av_rand64_get,       2 runs,      0 skips
>>   30112 decicycles in 624 calls of av_rand64_get,       4 runs,      0 skips
>> [...]
>>   22627 decicycles in 624 calls of av_rand64_get,   65536 runs,      0 skips
>>   22625 decicycles in 624 calls of av_rand64_get,  131072 runs,      0 skips
>>   22625 decicycles in 624 calls of av_rand64_get,  262143 runs,      1 skips
>>   22624 decicycles in 624 calls of av_rand64_get,  524286 runs,      2 skips
>>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanag at gmail.com>
>> ---
>>  libavutil/lfg.c | 33 ++++++++++++++++++++++++++++++++-
>>  libavutil/lfg.h | 26 ++++++++++++++++++++++++++
>>  2 files changed, 58 insertions(+), 1 deletion(-)
>
> why do you add this code to lfg.* (Lagged Fibonacci Generator)?
> its not a lfg

I just wanted to lay out the implementation, it was easier for me to
test and compare that way. Also, don't really know if it should be
public or private. I really felt that all random number stuff should
go into e.g lavu/rand.h, right now it is quite scattered across
random_seed.c, lfg.h (which is not great from a discoverability
standpoint).

Anyway, if you are flexible about where it could go, I will resubmit
as lavu/rand.h, lavu/rand.c containing this 64 bit gen. Is this fine
with you?

Some day or the other this can be cleaned up into e.g a single header,
but that maybe best handled later.

>
> also the LFG could be trivially extended/changed to 64bit if one wants
> only needs uint64_t being used

I can believe the extendability, but note that I do not know about the
quality of such generators. The linked TestU01 paper did not seem too
happy with the additive lagged Fibonacci generators, and instead
favored multiplicative ones like av_mlfg_get.

>
> also theres av_mlfg_get() which passes all tests, though slower of
> course

Are you sure it passes all of BigCrush? What test suite are you referring to?

>
> and does xorshift128+ really pass all tests ?

This was a claim on wikipedia that is I believe incorrect, Vigna
himself makes no such absolute claims. All he claims is (from
http://xorshift.di.unimi.it/xorshift128plus.c):
"

/* This is the fastest generator passing BigCrush without
   systematic failures, but due to the relatively short period it is
   acceptable only for applications with a mild amount of parallelism;
   otherwise, use a xorshift1024* generator. */
"
As such, he does not rule out all possible failures.
In fact, his page http://xorshift.di.unimi.it/#quality suggest certain failures.

 I therefore did not mention explicit absolute claims in the message either.

> what if the bits are reversed so that the least significant and most
> significant are exchanged ?
> the text seems unclear about that

Not sure exactly which text you are referring to, I did not look at
this in too much depth because it is claimed on the page that Chrome,
Firefox, and others use it - I doubt they would have hunted down and
used this if this was bad.

>
> anyway, iam fine with the addition of xorshift128plus but please
> put it in a different file
> above questions are more due to curiousity and not a objection

Thanks, unfortunately I am not a PRNG expert and have not dug too
deeply into this.

>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> I am the wisest man alive, for I know one thing, and that is that I know
> nothing. -- Socrates
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>