[FFmpeg-devel] [PATCH] avcodec/dvenc: support encoding dvcprohd
Baptiste Coudurier
baptiste.coudurier at gmail.com
Sat Nov 2 21:08:46 EET 2019
On Thu, Sep 19, 2019 at 12:34 PM Michael Niedermayer <michael at niedermayer.cc>
wrote:
> On Wed, Sep 11, 2019 at 12:29:57PM -0700, Baptiste Coudurier wrote:
> > ---
> > libavcodec/dv.h | 1 +
> > libavcodec/dvenc.c | 576 ++++++++++++++++++++++++++++++++++++++++-----
> > 2 files changed, 522 insertions(+), 55 deletions(-)
>
> a FATE test should be added for this if it's not already planned or done
>
I'm having issues with FATE on macOS Catalina right now :(
> [...]
>
> > + /* LOOP1: weigh AC components and store to save[] */
> > + /* (i=0 is the DC component; we only include it to make the
> > + number of loop iterations even, for future possible SIMD optimization) */
> > + for (i = 0; i < 64; i += 2) {
> > + int level0, level1;
> > +
> > + /* get the AC component (in zig-zag order) */
> > + level0 = blk[zigzag_scan[i+0]];
> > + level1 = blk[zigzag_scan[i+1]];
> > +
> > + /* extract sign and make it the lowest bit */
> > + bi->sign[i+0] = (level0>>31)&1;
> > + bi->sign[i+1] = (level1>>31)&1;
> > +
> > + /* take absolute value of the level */
> > + level0 = FFABS(level0);
> > + level1 = FFABS(level1);
> > +
> > + /* weigh it */
> > + level0 = (level0*weight[i+0] + 4096 + (1<<17)) >> 18;
> > + level1 = (level1*weight[i+1] + 4096 + (1<<17)) >> 18;
> > +
> > + /* save unquantized value */
> > + bi->save[i+0] = level0;
> > + bi->save[i+1] = level1;
> > + }
> > +
> > + /* find max component */
> > + for (i = 0; i < 64; i++) {
> > + int ac = bi->save[i];
> > + if (ac > max)
> > + max = ac;
> > + }
>
> these 2 loops can be merged avoiding a 2nd pass
>
Merged
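
For readers following the thread, here is a minimal sketch of what merging the two loops could look like. It is not the code from the updated patch: BlockInfoSketch and weigh_and_find_max() are hypothetical names introduced only for illustration, max is assumed to start at 0, and the weights are assumed small enough that level*weight fits in an int.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for EncBlockInfo; only the fields used here. */
typedef struct {
    uint8_t sign[64];
    int     save[64];
} BlockInfoSketch;

/* Weigh all 64 coefficients (zig-zag order) and track the maximum
 * in the same pass, so no second loop over save[] is needed.
 * As in the quoted code, index 0 (the DC component) is included. */
static int weigh_and_find_max(BlockInfoSketch *bi, const int16_t *blk,
                              const uint8_t *zigzag_scan, const int *weight)
{
    int max = 0; /* assumption: the original initial value is not shown */
    int i;

    assert(bi && blk && zigzag_scan && weight);

    for (i = 0; i < 64; i += 2) {
        int level0 = blk[zigzag_scan[i + 0]];
        int level1 = blk[zigzag_scan[i + 1]];

        /* sign bit: 1 for negative values, 0 otherwise */
        bi->sign[i + 0] = level0 < 0;
        bi->sign[i + 1] = level1 < 0;

        /* absolute value, then weigh with the same rounding as the patch */
        level0 = (abs(level0) * weight[i + 0] + 4096 + (1 << 17)) >> 18;
        level1 = (abs(level1) * weight[i + 1] + 4096 + (1 << 17)) >> 18;

        /* save unquantized value */
        bi->save[i + 0] = level0;
        bi->save[i + 1] = level1;

        /* merged: update the running maximum here */
        if (level0 > max)
            max = level0;
        if (level1 > max)
            max = level1;
    }
    return max;
}
```

The two-at-a-time loop shape is kept so the possible SIMD conversion mentioned in the comment still applies.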
[...]
> > +static inline void dv_guess_qnos_hd(EncBlockInfo *blks, int *qnos)
> > +{
> > + EncBlockInfo *b;
> > + int min_qlevel[5];
> > + int qlevels[5];
> > + int size[5];
> > + int i, j;
> > + /* cache block sizes at hypothetical qlevels */
> > + uint16_t size_cache[5*8][DV100_NUM_QLEVELS] = {{0}};
> > +
> > + /* get minimum qlevels */
> > + for (i = 0; i < 5; i++) {
> > + min_qlevel[i] = 1;
> > + for (j = 0; j < 8; j++) {
> > + if (blks[8*i+j].min_qlevel > min_qlevel[i])
> > + min_qlevel[i] = blks[8*i+j].min_qlevel;
> > + }
> > + }
> > +
> > + /* initialize sizes */
> > + for (i = 0; i < 5; i++) {
> > + qlevels[i] = dv100_starting_qno;
> > + if (qlevels[i] < min_qlevel[i])
> > + qlevels[i] = min_qlevel[i];
> > +
> > + qnos[i] = DV100_QLEVEL_QNO(dv100_qlevels[qlevels[i]]);
> > + size[i] = 0;
> > + for (j = 0; j < 8; j++) {
> > + size_cache[8*i+j][qlevels[i]] = dv100_actual_quantize(&blks[8*i+j], qlevels[i]);
> > + size[i] += size_cache[8*i+j][qlevels[i]];
> > + }
> > + }
> > +
> > + /* must we go coarser? */
> > + if (size[0]+size[1]+size[2]+size[3]+size[4] > vs_total_ac_bits_hd) {
> > + int largest = size[0] % 5; /* 'random' number */
> > +
>
> > + do {
> > + /* find the macroblock with the lowest qlevel */
> > + for (i = 0; i < 5; i++) {
> > + if (qlevels[i] < DV100_NUM_QLEVELS-1 &&
> > + qlevels[i] < qlevels[largest])
> > + largest = i;
> > + }
> > +
> > + i = largest;
> > + /* ensure that we don't enter infinite loop */
> > + largest = (largest+1) % 5;
> > +
> > + if (qlevels[i] >= DV100_NUM_QLEVELS-1) {
> > + /* can't quantize any more */
> > + continue;
> > + }
> > +
> > + /* quantize a little bit more */
> > + qlevels[i] += dv100_qlevel_inc;
> > + if (qlevels[i] > DV100_NUM_QLEVELS-1)
> > + qlevels[i] = DV100_NUM_QLEVELS-1;
> > +
> > + qnos[i] = DV100_QLEVEL_QNO(dv100_qlevels[qlevels[i]]);
> > + size[i] = 0;
> > +
> > + /* for each block */
> > + b = &blks[8*i];
> > + for (j = 0; j < 8; j++, b++) {
> > + /* accumulate block size into macroblock */
> > + if(size_cache[8*i+j][qlevels[i]] == 0) {
> > + /* it is safe to use actual_quantize() here because we only go from finer to coarser,
> > + and it saves the final actual_quantize() down below */
> > + size_cache[8*i+j][qlevels[i]] = dv100_actual_quantize(b, qlevels[i]);
> > + }
> > + size[i] += size_cache[8*i+j][qlevels[i]];
> > + } /* for each block */
> > +
> > + } while (vs_total_ac_bits_hd < size[0] + size[1] + size[2] + size[3] + size[4] &&
> > + (qlevels[0] < DV100_NUM_QLEVELS-1 ||
> > + qlevels[1] < DV100_NUM_QLEVELS-1 ||
> > + qlevels[2] < DV100_NUM_QLEVELS-1 ||
> > + qlevels[3] < DV100_NUM_QLEVELS-1 ||
> > + qlevels[4] < DV100_NUM_QLEVELS-1));
>
> I think the DV100_NUM_QLEVELS checks can be simplified.
>
> If we keep track of how many qlevels are < DV100_NUM_QLEVELS-1, the
> check in the first loop is no longer needed: as long as at least one
> qlevel is below that limit, the search for the lowest one will find it,
> so there is no need to check each against DV100_NUM_QLEVELS-1.
>
> Checking the lowest one against DV100_NUM_QLEVELS-1 again afterwards
> also becomes unneeded,
>
> and at the end the 5 checks in the while() can then be changed to a
> single check on the new variable.
>
> This should make the code both faster and simpler.
>
>
Updated, please check :)
Patch updated
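
As an illustration of the suggestion above (not the code from the updated patch), here is a rough sketch of the coarsening loop with a single counter replacing the per-macroblock limit checks. NUM_QLEVELS, QLEVEL_INC, mb_size_fn, example_mb_size() and coarsen_until_fit() are all hypothetical names and values chosen for the example; the size cache and the rotating start index from the quoted code are left out for brevity.

```c
#include <assert.h>

#define NUM_QLEVELS 31 /* hypothetical stand-in for DV100_NUM_QLEVELS */
#define QLEVEL_INC   4 /* hypothetical stand-in for dv100_qlevel_inc */

/* Hypothetical cost callback: AC bits used by macroblock mb at qlevel. */
typedef int (*mb_size_fn)(int mb, int qlevel);

/* Toy cost model for demonstration only: coarser qlevel, fewer bits. */
static int example_mb_size(int mb, int qlevel)
{
    (void)mb;
    return 1000 - 30 * qlevel;
}

/* Raise qlevels until the five macroblocks fit into budget, or until
 * none of them can be quantized any coarser. Returns the final total. */
static int coarsen_until_fit(int qlevels[5], int size[5], int budget,
                             mb_size_fn mb_size)
{
    int remaining = 0; /* macroblocks still below NUM_QLEVELS-1 */
    int total = 0;
    int i;

    for (i = 0; i < 5; i++) {
        remaining += qlevels[i] < NUM_QLEVELS - 1;
        total     += size[i];
    }

    while (total > budget && remaining > 0) {
        int lowest = 0;

        /* find the macroblock with the lowest qlevel; no limit check
         * is needed because remaining > 0 means at least one is below
         * the limit, and the lowest one cannot be above it */
        for (i = 1; i < 5; i++)
            if (qlevels[i] < qlevels[lowest])
                lowest = i;
        assert(qlevels[lowest] < NUM_QLEVELS - 1);

        /* quantize a little bit more, clamping at the coarsest level */
        qlevels[lowest] += QLEVEL_INC;
        if (qlevels[lowest] >= NUM_QLEVELS - 1) {
            qlevels[lowest] = NUM_QLEVELS - 1;
            remaining--;
        }

        total       -= size[lowest];
        size[lowest] = mb_size(lowest, qlevels[lowest]);
        total       += size[lowest];
    }
    return total;
}
```

Every iteration strictly raises one qlevel, so the loop terminates without the extra guard, and the while() condition reduces to the budget test plus one check on the counter.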
--
Baptiste