[Ffmpeg-devel] FLAC encoder
Michael Niedermayer
michaelni
Fri May 26 15:14:42 CEST 2006
Hi
On Fri, May 26, 2006 at 03:24:35AM -0400, Justin Ruggles wrote:
> Hello,
>
> My FLAC encoder has progressed to a point that I am happy with. Test it
> out, and see what you think. If the developers show an interest in
> including this in FFmpeg, I will consolidate the files and adapt it for
> inclusion.
yes
requirements for acceptance are at least that the existing crc, rice &
bitstream facilities of libavcodec are used
one suggestions:
instead of
for all partition orders (calc_optimal_partition_order)
for all partitions (calc_optimal_rice_params)
do gradient descent search for k to minimize: (calc_optimal_rice_param)
the number of bits for all symbols in the current partition (rice_encode_count)
which is obviously going to be slow
O(n * partition_orders * gardient_descent_steps), you can
sum up all bits in O(n) time and then run the optimization based on the sums
the result is identical
1. convert everything to unsiged by tmp[i] = (2*data[i]) ^ (data[i]>>31)
which is equivalent to tmp[i] = (data[i] < 0) ? -2*data[i]-1 : 2*data[i];
2. calculate the partial sums (the following is written for readability its
dead slow if implemented like that:
part[0][i][j] = tmp[i] & (1<<j)
part[p][i][j]= part[p-1][2*i][j] + part[p-1][2*i+1][j]
the number of bits for a single symbol is (data>>k) + k + 1 for all symbols
n to m its obviously (m-n)*(k+1) + sum of (data[i]>>k)
and "sum of (data[i]>>k)" is just the sum of the corresponding partial sums
for that area
i hope you understand my suggestion ...
now, how to find the partial sums quickly:
first pass, we want to quickly find the 32 sums of the bits of 2 integers
sounds like it would need a loop with 32 iterations ... no it needs 2
instructions only :)
lsb[i]= d[2i] ^ d[2i+1]
msb[i]= d[2i] & d[2i+1]
pass 2:
sum[1][i][0] = lsb[2i] ^ lsb[2i+1]
sum[1][i][1] = msb[2i] ^ msb[2i+1] ^ (lsb[2i] & lsb[2i+1])
sum[1][i][2] = msb[2i] & msb[2i+1]
pass n:
L= sum[n-1][2i]
R= sum[n-1][2i+1]
sum[n][i][0] = L[0] ^ R[0]
carry[0] = L[0] & R[0]
sum[n][i][j] = L[j] ^ R[j] ^ carry[j-1]
carry[j] = ((L[j] ^ R[j]) & carry[j-1]) | (L[j] & R[j])
sum[n][i][n] = L[n-1] & R[n-1]
each pass will work only with half as many elements as the previous so this
is fast even though each pass is somewhat more complex then the previous
if you dont understand the whole, keep in mind each variable holds 32 bits not
1 bit which are worked on in parallel
alternatively if this is too complex for you then you could at least
move the "(data[i] < 0) ? -2*data[i]-1 : 2*data[i];" and "k + 1" out
of the innermost loop
and cache the partial bit counts (a partition from n..m with param k
needs exactly as many bits as the sum of the bits needed by the 2 smaller
partitions)
[...]
--
Michael
In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is
More information about the ffmpeg-devel
mailing list