[FFmpeg-devel] [PATCH] libavcodec: add bit-rate support to RoQ video encoder

Tomas Härdin git at haerdin.se
Wed Jan 24 23:29:52 EET 2024


On Wed 2024-01-24 at 11:50 +0300, Victor Luchitz wrote:
> On Tue, Jan 23, 2024 at 8:44 PM Tomas Härdin <git at haerdin.se> wrote:
> > 
> > Anyway, using -b:v 100k causes the encoder to effectively become
> > stuck on the first frame, being unable to go below 621 kbps and
> > increasing qscale very slowly. But you mentioned this already of
> > course. Perhaps there should be a faster "startup" phase?
> > Subsequent frames being P-frames may lead to the average hitting
> > the target bitrate.
> > 
> 
> I'm not sure we'd go as far as special-casing the encoder for
> extremely low bit-rates at high image resolutions. I mean, if you
> want low picture quality, the proper way for RoQ would be to scale
> down your video to a lower resolution instead of turning the video
> into a huge ugly mess of macroblocks.

I'm not really suggesting such low bitrates are a good idea, but they
shouldn't cause the encoder to choke. What happens if there's suddenly
a frame that's very complex, say white noise?

> The main issues with allowing a larger keyframe and then trying to
> make it up on subsequent frames are: doable for FIRST frame, may
> cause buffer underrun on later keyframes; we are targeting CDROM
> media with this patch, be it 1X, 2X, whatever. We cannot have too
> much variance in the frame sizes (bigger). Smaller is allowable, but
> not bigger.

How bursty is CD-ROM anyway? I suspect something similar to VBV would
help here. Surely there are buffers involved?
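
To make the VBV idea concrete, here is a rough leaky-bucket sketch. It
is not taken from the patch. The function name, the fill-then-drain
order and the numbers are my assumptions: 1X CD-ROM delivers 153600
bytes/s, which at an assumed 30 fps is 5120 bytes per frame interval.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical VBV-style check (not from the patch). The drive adds
 * bytes_per_tick to the buffer once per frame interval, capped at
 * buffer_bytes, then the player removes one whole frame.
 * Returns the index of the first frame that is not fully buffered when
 * it is needed, or -1 if playback never stalls.
 */
static long first_underrun(const uint32_t *frame_bytes, size_t n_frames,
                           uint32_t bytes_per_tick, uint32_t buffer_bytes,
                           uint32_t prefill_bytes)
{
    uint64_t fullness = prefill_bytes < buffer_bytes ? prefill_bytes
                                                     : buffer_bytes;

    assert(bytes_per_tick > 0);

    for (size_t i = 0; i < n_frames; i++) {
        /* One frame interval of reading from the disc. */
        fullness += bytes_per_tick;
        if (fullness > buffer_bytes)
            fullness = buffer_bytes;   /* buffer full, drive has to wait */

        if (frame_bytes[i] > fullness)
            return (long)i;            /* frame not there in time */

        fullness -= frame_bytes[i];
    }
    return -1;
}
```

With a 64 KiB buffer, frames of 2000, 2000 and 9000 bytes play fine
because the two small frames bank enough data for the big one, while
4000, 4000 and 9000 stall on the third frame. Shrink the buffer to 8 KiB
and the first sequence stalls too. So how much burstiness is tolerable
depends entirely on the buffer size.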

> It's OK if encoding takes some time :)

Yeah, but it shouldn't hang.

> 
> 
> > 
> > I'm also curious about this hunk:
> > 
> > > -        if (enc->lambda > 100000) {
> > > +        if (enc->lambda > 100000000) {
> > >              av_log(roq->logctx, AV_LOG_ERROR, "Cannot encode video in Quake compatible form\n");
> > >              return AVERROR(EINVAL);
> > >          }
> > 
> > Where in the Quake (3?) source code is this limitation? Seems
> > rather that there should be a retry limit. There's probably no
> > harm in keeping going so long as the packet doesn't end up above
> > the limit.
> > 
> 
> In our tests we found that lambda can go way higher than the original
> 100000. Stopping at 100000 is artificially restricting yourself when
> you can go much further. 100000000 is high enough to allow 1X bitrates
> even on extreme frames.

I figured as much. But this could also be arrived at with some formula
I suspect. A higher limit isn't wrong per se.

> > A quick logarithmic regression on bitrate vs qscale suggests it
> > scales with somewhere between qscale^-0.2833 and qscale^-0.2842.
> > Conversely, to hit a specific bitrate, try scaling qscale with the
> > bitrate ratio raised to around 3.5. A more conservative exponent
> > like 2 is probably also fine. See patch attached.
> > 
> 
> Thanks so much for the patch! I've tested it on a pretty long hires
> video at 835kbps and everything seems to work fine. The runtime was
> 2m39s for both: the original and your version.

Yep. But the fact that I could get it to hang at certain points
doesn't exactly instill confidence.
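
For illustration, here is how the power-law update quoted above could
be combined with a hard retry limit so that the loop always terminates.
This is a sketch, not the attached patch. fit_qscale(), the callback
type and the toy size model are made up for the example, and it uses
the conservative exponent 2 rather than 3.5.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: encode the frame at qscale, return its size. */
typedef uint64_t (*size_fn)(double qscale, void *opaque);

/*
 * Re-encode until the frame fits target_bits, scaling qscale by the
 * squared overshoot ratio each time. Returns the number of encodes
 * used, or -1 once max_tries is exhausted, so the caller can decide
 * what to do instead of spinning forever.
 */
static int fit_qscale(size_fn encode_size, void *opaque,
                      uint64_t target_bits, double *qscale, int max_tries)
{
    assert(target_bits > 0 && *qscale > 0.0 && max_tries > 0);

    for (int i = 1; i <= max_tries; i++) {
        uint64_t bits = encode_size(*qscale, opaque);
        if (bits <= target_bits)
            return i;

        double ratio = (double)bits / (double)target_bits;
        *qscale *= ratio * ratio;      /* exponent 2, see quoted text */
    }
    return -1;
}

/* Toy stand-in for the encoder: size ~ base/qscale with a hard floor. */
struct toy_model {
    double base_bits;
    double floor_bits;
};

static uint64_t toy_size(double qscale, void *opaque)
{
    const struct toy_model *m = opaque;
    double bits = m->base_bits / qscale;
    if (bits < m->floor_bits)
        bits = m->floor_bits;
    return (uint64_t)bits;
}
```

In the toy model a target above the floor is reached on the second
encode, and a target below the floor returns -1 after max_tries instead
of hanging. The real encoder's size curve is much flatter than the toy
one, so the number of iterations will differ.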

A better approach might be to buffer an entire GOP, then find a lambda
that brings the entire GOP within the desired bitrate. If the encoder
happens to insert an extra I-frame, emit the I and P frames up to that
point and buffer up enough frames to make another GOP. Say with a GOP
size of 6, it would normally look like:

  [IPPPPP][IPPPPP]...

But sometimes like this, if nothing special is done:

  [IPPPIP][IPPPPP]...

This is probably wasteful, however, so it is better to do:

  [IPPP][IPPPPP][IP...

That second I-frame would probably have been quite cheap to fit within
the allotted bits. It might for example be a cut to white and fade in.
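
The grouping rule can be sketched as follows. split_gops() is a
hypothetical helper, not something in libavcodec, and it only computes
GOP boundaries from a string of frame types. In a real encoder the
frame that opens a new GOP after gop_size frames would be forced to be
an I-frame.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split a frame-type string such as "IPPPIPPPPPIP" into GOPs. A GOP is
 * closed when it has reached gop_size frames or when the next frame is
 * an I-frame, whichever comes first. GOP lengths are written to
 * out_len (at most max_gops entries); returns the number of GOPs.
 */
static size_t split_gops(const char *types, size_t gop_size,
                         size_t *out_len, size_t max_gops)
{
    size_t n_gops = 0;
    size_t cur = 0;                    /* frames in the open GOP */

    assert(gop_size > 0);

    for (size_t i = 0; types[i] != '\0'; i++) {
        int starts_new = cur > 0 && (types[i] == 'I' || cur == gop_size);
        if (starts_new) {
            if (n_gops == max_gops)
                return n_gops;         /* output full, stop early */
            out_len[n_gops++] = cur;   /* emit the GOP buffered so far */
            cur = 0;
        }
        cur++;
    }
    if (cur > 0 && n_gops < max_gops)
        out_len[n_gops++] = cur;       /* trailing, possibly short GOP */
    return n_gops;
}
```

For "IPPPIPPPPPIP" with a GOP size of 6 this gives lengths 4, 6, 2,
which is the [IPPP][IPPPPP][IP... grouping above.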

If I understand this patch correctly, it tries to make every packet
roughly the same size, which seems excessive. Does the machine you're
targeting not have enough RAM to buffer an entire CD-ROM? Or even a
couple of seconds?
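
As a sketch of per-GOP rather than per-packet budgeting: bisect on
lambda until the whole buffered GOP fits its bit budget. This assumes
the GOP size never increases as lambda grows, which I have not verified
for the RoQ encoder. find_gop_lambda(), the callback and the toy model
are hypothetical. Each probe stands for re-encoding the whole GOP, so
with an upper bound of 100000000 that is at most about 28 encodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: encode the buffered GOP, return total bits. */
typedef uint64_t (*gop_size_fn)(uint64_t lambda, void *opaque);

/*
 * Smallest lambda in [lo, hi] for which the GOP fits budget_bits,
 * assuming the size is non-increasing in lambda. Returns 0 if the GOP
 * does not fit even at hi.
 */
static uint64_t find_gop_lambda(gop_size_fn gop_bits, void *opaque,
                                uint64_t budget_bits,
                                uint64_t lo, uint64_t hi)
{
    assert(lo >= 1 && lo <= hi);

    if (gop_bits(hi, opaque) > budget_bits)
        return 0;                      /* budget unreachable */

    while (lo < hi) {
        uint64_t mid = lo + (hi - lo) / 2;
        if (gop_bits(mid, opaque) <= budget_bits)
            hi = mid;                  /* fits, try a lower lambda */
        else
            lo = mid + 1;              /* too big, raise lambda */
    }
    return lo;
}

/* Toy stand-in: each frame costs overhead + complexity / lambda. */
struct toy_gop {
    const uint32_t *complexity;
    size_t n;
    uint64_t overhead_bits;
};

static uint64_t toy_gop_bits(uint64_t lambda, void *opaque)
{
    const struct toy_gop *g = opaque;
    uint64_t total = 0;
    for (size_t i = 0; i < g->n; i++)
        total += g->overhead_bits + g->complexity[i] / lambda;
    return total;
}
```

The point is that one lambda is shared by the whole GOP, so an
expensive I-frame can borrow bits from the cheap P-frames that follow
it instead of every packet being squeezed to the same size.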

/Tomas

